
In this blog post, we explore the vital role of selective data collection in modern security operations and examine how Microsoft Sentinel (formerly Azure Sentinel) addresses this challenge using Data Collection Rules (DCRs). Whilst the discussion centres on Sentinel-specific examples, the insights are relevant for any organisation looking to improve SIEM efficiency. Whether you're a security architect aiming to streamline detection or an MSSP customer focused on reducing data ingestion and operational costs, adopting a signal-driven logging strategy can yield substantial benefits. By being deliberate about what data is collected, organisations can lower overhead, sharpen threat detection, and keep log management cost-effective.
Overview
Many SIEMs were initially deployed with good intentions — to collect as much data as possible in the hope of fine-tuning later.
But that tuning rarely happens.
That mindset still lingers in many environments. Logs are collected in bulk — every authentication event, every DNS lookup, every endpoint action — regardless of whether they support real detections, investigations, or compliance obligations.
In the cloud, data is a billing item, not a passive artifact. Ingestion costs scale with volume, not value. When teams fail to define what matters, they end up paying to collect noise — and then spend time maintaining analytics rules that sift through it.
This is the new normal:
- Cloud-first SOCs must be efficient-first
- Default logging is no longer the safe option — it's the expensive one
- High-value logging must lead every collection decision
That's where Data Collection Rules (DCRs) — Microsoft's terminology for its policy-based logging control mechanism — come in. They shift ingestion from a technical default to a deliberate strategy.
DCRs let you:
- Define what signal is essential for detection
- Enrich logs with context at the point of collection
- Route different data types to destinations that match their purpose
This could mean routing high-fidelity authentication logs into Sentinel for active correlation, whilst archiving verbose application logs in cold storage. Or sending a copy of critical asset telemetry to an MDR provider specialised in behavioural analytics.
In this new normal, logging is not a passive operation — it's a strategic control point.
With DCRs, you stop treating ingestion as a checkbox and start treating it as a reflection of your detection priorities.
What Are Data Collection Rules (DCRs)?
Data Collection Rules (DCRs) are Azure's mechanism for managing how logs are collected, filtered, transformed, and routed — before they ever reach Microsoft Sentinel. They are foundational to shaping a modern security data pipeline.
DCRs apply across agents, cloud-native services, and custom log sources. They determine what data flows where and under what conditions — not just for ingestion into Sentinel, but also for routing to archival storage or third-party systems.
What makes DCRs so valuable is that they operate before data lands in the workspace. Agent-side filters can discard events at the source, and transformations can filter, enrich, and shape records in the Azure Monitor pipeline, reducing unnecessary volume and ensuring that only relevant signal reaches your Log Analytics workspace.
For example, you might choose to collect only specific Windows Security Event IDs, or to route Application Gateway logs to storage instead of Sentinel to reduce cost.
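As a minimal sketch of the Event ID example, the snippet below builds an XPath filter of the kind a DCR can carry for Windows event logs. The helper name `build_security_xpath`, the chosen Event IDs, and the `securityEvents` label are illustrative assumptions, and the stream name should be checked against the Azure Monitor documentation before use.

```python
import json


def build_security_xpath(event_ids):
    """Return an XPath query that selects only the given Security event IDs."""
    if not event_ids:
        raise ValueError("at least one event ID is required")
    clause = " or ".join(f"EventID={eid}" for eid in sorted(set(event_ids)))
    return f"Security!*[System[({clause})]]"


# Illustrative selection: logon success, logon failure, account creation.
xpath = build_security_xpath([4625, 4624, 4720])

# Fragment shaped like a Windows event log data source inside a DCR.
# The name is a hypothetical label; the stream name is an assumption to verify.
data_source = {
    "name": "securityEvents",
    "streams": ["Microsoft-SecurityEvent"],
    "xPathQueries": [xpath],
}

print(json.dumps(data_source, indent=2))
```

Events that do not match the query are never picked up by the agent, which is what keeps them out of the workspace.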
DCRs also support basic transformations — flattening fields, dropping unused elements, or tagging logs with metadata — all of which reduce complexity for detection logic downstream.
At their core, DCRs give you control over:
- What logs are collected
- Where they are sent
- How they are shaped before arrival
Whether segmenting telemetry between production and test, or enforcing structured ingestion across your asset tiers, DCRs are no longer just a configuration option — they are the enforcement point for detection strategy, cost efficiency, and clarity.
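The three control points above map roughly onto three parts of a DCR body. The sketch below writes them as a Python dict and adds a small consistency check. The labels `securityEvents` and `sentinelWorkspace`, the workspace placeholder, and the transformation string are illustrative assumptions rather than a recommended configuration, and the helper `undeclared_destinations` is hypothetical.

```python
# Sketch of a DCR body as a Python dict. Values are placeholders.
dcr_properties = {
    # What logs are collected
    "dataSources": {
        "windowsEventLogs": [
            {
                "name": "securityEvents",  # hypothetical label
                "streams": ["Microsoft-SecurityEvent"],
                "xPathQueries": [
                    "Security!*[System[(EventID=4624 or EventID=4625)]]"
                ],
            }
        ]
    },
    # Where they are sent
    "destinations": {
        "logAnalytics": [
            {
                "name": "sentinelWorkspace",  # hypothetical label
                "workspaceResourceId": "<workspace-resource-id>",  # placeholder
            }
        ]
    },
    # How they are shaped before arrival
    "dataFlows": [
        {
            "streams": ["Microsoft-SecurityEvent"],
            "destinations": ["sentinelWorkspace"],
            "transformKql": "source | project-away EventData",  # illustrative
        }
    ],
}


def undeclared_destinations(properties):
    """List destination names used by data flows but never declared.

    Assumes every destination group is a list of objects with a "name" key.
    """
    declared = {
        dest["name"]
        for group in properties.get("destinations", {}).values()
        for dest in group
    }
    used = {
        name
        for flow in properties.get("dataFlows", [])
        for name in flow.get("destinations", [])
    }
    return sorted(used - declared)


print(undeclared_destinations(dcr_properties))
```

A check like this is cheap to run before deployment and catches a data flow that points at a destination nobody defined.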
Why Use DCRs for Sentinel?
In a modern, cloud-native SOC, collecting logs without filtering or intent is no longer sustainable. Microsoft Sentinel charges by data volume, so broad ingestion directly translates into higher operational costs — often without a corresponding gain in detection fidelity. Default collection settings can easily pull in noisy event streams or low-signal telemetry that analysts never use and detection rules never query.
Data Collection Rules (DCRs) reverse this trend by letting teams focus on log quality instead of log quantity. Instead of a "collect now, tune later" model, DCRs shift organisations toward selective, value-led ingestion. For example, you can scope collection to the log categories and event types that directly support your detection coverage, whilst excluding those that create cost and clutter.
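Because cost follows volume, the effect of scoping collection can be estimated with simple arithmetic. The sketch below assumes a flat per-GB price. The figures (200 GB/day, a price of 5.0 per GB, 25% of volume filtered out) are made-up placeholders, not Microsoft pricing, and both function names are hypothetical.

```python
def monthly_ingestion_cost(gb_per_day, price_per_gb, days=30):
    """Cost of ingesting a steady daily volume at a flat per-GB price."""
    return gb_per_day * price_per_gb * days


def savings_from_filtering(gb_per_day, fraction_dropped, price_per_gb, days=30):
    """Monthly cost avoided when a fraction of the volume is never ingested."""
    before = monthly_ingestion_cost(gb_per_day, price_per_gb, days)
    after = monthly_ingestion_cost(
        gb_per_day * (1 - fraction_dropped), price_per_gb, days
    )
    return before - after


# Placeholder inputs: substitute your own volume and contracted price.
baseline = monthly_ingestion_cost(gb_per_day=200, price_per_gb=5.0)
saved = savings_from_filtering(gb_per_day=200, fraction_dropped=0.25, price_per_gb=5.0)
print(baseline, saved)
```

Real pricing depends on region, commitment tier, and table plan, so treat the output as an order-of-magnitude estimate only.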
In many cases, security teams operate across diverse environments — different tiers of infrastructure, different business units, or even different detection and response providers. DCRs support multi-destination routing, allowing organisations to:
- Ingest high-value telemetry into Sentinel
- Forward bulk or compliance-driven logs to low-cost storage
- Mirror critical events to an external SIEM or MDR partner for specialised analysis
This flexibility enables service segmentation, where detection responsibilities can be distributed between providers. For example, a central SOC might monitor core identities and cloud platforms, while an external MDR focuses on endpoint and threat intel fusion — each using the same telemetry, but routed according to scope.
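The routing idea can be sketched as a lookup from log category to destinations. This is a conceptual model in plain Python, not DCR syntax. The category names, destination names, and the default route are all hypothetical.

```python
from collections import defaultdict

# Hypothetical routing policy: category -> destinations.
ROUTES = {
    "authentication": ["sentinel", "mdr_partner"],
    "application_verbose": ["cold_storage"],
    "compliance_audit": ["cold_storage"],
}
DEFAULT_ROUTE = ["cold_storage"]  # assumption: unknown categories are archived


def route(records):
    """Group records by destination according to ROUTES."""
    by_destination = defaultdict(list)
    for record in records:
        for destination in ROUTES.get(record["category"], DEFAULT_ROUTE):
            by_destination[destination].append(record)
    return dict(by_destination)


sample = [
    {"category": "authentication", "user": "alice"},
    {"category": "application_verbose", "msg": "cache miss"},
    {"category": "dns", "query": "example.org"},
]
routed = route(sample)
for destination, items in routed.items():
    print(destination, len(items))
```

Here the authentication record is mirrored to both the SIEM and the partner, whilst the verbose and unclassified records go only to low-cost storage.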
DCRs also support enrichment at the ingestion point, which prepares logs before they ever reach Sentinel. This includes light transformations like:
- Adding metadata from tags (e.g., environment or owner labels)
- Unifying inconsistent field names for easier querying
- Removing irrelevant fields to reduce payload size
By handling this early in the pipeline, DCRs reduce the burden on analytics rules and KQL queries. Detection logic becomes leaner and more maintainable, and incidents are easier to triage when relevant context is already present in the raw data.
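The light transformations listed above can be modelled locally to see their effect on a single record. The sketch below is a plain-Python stand-in for what a DCR transformation would do. The field names, the `Environment` label, and the KQL string at the end are illustrative assumptions and should be validated against the transformation features Azure Monitor actually supports for your table.

```python
def shape_record(record, rename_map, drop_fields, metadata):
    """Rename fields, drop unwanted ones, and attach static metadata."""
    shaped = {
        rename_map.get(key, key): value
        for key, value in record.items()
        if key not in drop_fields
    }
    shaped.update(metadata)
    return shaped


raw = {"src_ip": "10.0.0.5", "user": "alice", "RawPayload": "..."}

shaped = shape_record(
    raw,
    rename_map={"src_ip": "SourceIp", "user": "UserName"},  # unify names
    drop_fields={"RawPayload"},                              # reduce payload
    metadata={"Environment": "production"},                  # add context
)
print(shaped)

# Roughly equivalent transformation expressed as KQL (hypothetical columns).
TRANSFORM_KQL_SKETCH = (
    "source"
    " | project-rename SourceIp = src_ip, UserName = user"
    " | project-away RawPayload"
    " | extend Environment = 'production'"
)
```

Detection queries written against the shaped record can rely on one field name per concept and no longer need to strip or rename columns themselves.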
In short, DCRs help organisations optimise for both cost and clarity — filtering out noise, enriching signal, and giving teams the control to build smarter, more adaptable detection architectures.
Less noise. More signal. Cleaner detections. Lower cost.
How It Works
In security operations, the role of a DCR is to control the shape, volume, and destination of security-relevant data before it becomes part of your detection and response pipeline.
When a system generates a log — whether from a Windows server, Linux host, cloud service, or custom app — the Azure Monitor pipeline first checks for any Data Collection Rules (DCRs) that apply to that system or resource.
Once a DCR is defined, it's applied to a specific scope — typically a group of machines, systems, or cloud workloads that produce security-relevant logs. This lets you tune data policy based on the role or sensitivity of the system, not just its location or resource type.
If a DCR is assigned, the log data flows through three evaluation stages:
- Filtering: The DCR applies conditions such as Event IDs, log categories, or XPath expressions. If the data doesn't match, it's discarded at the edge — never ingested, never billed.
- Transformation (optional): Logs that pass the filter can be shaped before delivery. This includes operations like removing fields, renaming properties, or attaching metadata (e.g., labels from resource tags or external config).
- Routing: The shaped log is sent to one or more destinations. Common destinations include:
  - Log Analytics (for Sentinel analytics and investigation)
  - Azure Storage (for archival or regulatory retention)
  - Event Hubs (for forwarding to third-party platforms)
This process happens in real time, before the data reaches any detection logic or incurs ingestion cost. The DCR acts as a policy gate at the point of collection, shaping and steering the log stream according to your design.
DCR evaluation is stateless and infrastructure-driven. Once assigned, it automatically applies to any log emitted by that resource, requiring no intervention from the host or the detection rules downstream.
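The filter, transform, and route stages described in this section can be sketched as a single stateless function over one record. This is a conceptual model, not the Azure Monitor implementation. The class name `RuleSketch`, the destination labels, and the sample fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Record = Dict[str, object]


@dataclass(frozen=True)
class RuleSketch:
    """Conceptual stand-in for a DCR: filter, optional transform, routing."""

    keep: Callable[[Record], bool]
    transform: Optional[Callable[[Record], Record]] = None
    destinations: Tuple[str, ...] = ("log_analytics",)

    def apply(self, record: Record) -> Dict[str, Record]:
        # Stage 1: filtering. Non-matching records are dropped outright.
        if not self.keep(record):
            return {}
        # Stage 2: optional transformation of records that passed the filter.
        shaped = self.transform(record) if self.transform else dict(record)
        # Stage 3: routing. The same shaped record goes to every destination.
        return {destination: shaped for destination in self.destinations}


rule = RuleSketch(
    keep=lambda r: r.get("EventID") in {4624, 4625},
    transform=lambda r: {
        **{k: v for k, v in r.items() if k != "RawXml"},
        "Environment": "production",
    },
    destinations=("log_analytics", "event_hub"),
)

dropped = rule.apply({"EventID": 4688, "Computer": "srv01"})
delivered = rule.apply({"EventID": 4624, "Computer": "srv01", "RawXml": "<x/>"})
print(dropped)
print(delivered)
```

Because `apply` depends only on the rule and the record passed in, the same input always produces the same output, which mirrors the stateless evaluation described above.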