Observability Data Pipeline Driven Solutions to Alleviate Analyst Alert Fatigue
Audience
Whether you’re a seasoned observability professional, a curious newcomer exploring the world of detection engineering, someone looking to operationalize MITRE ATT&CK, or someone who accidentally stumbled upon this page, this is the place for you. This post should be helpful for the audience below, and hopefully beyond 😄
🔍 Observability Enthusiasts
👨‍💻 Security Analysts
🛠️ Detection Engineering Aficionados
🤖 MITRE ATT&CK Devotees
🤔 Curious Learners
😜 Link Wanderers
Use Case Briefing
In this blog, we will explore how an observability data pipeline platform, coupled with a resilient detection engineering approach (using Sigma and the MITRE CTID SMAP framework), can improve the quality of an organization’s work and ease the significant operational burden of analyst alert fatigue. The methodology described here is derived from my personal experience, an understanding of prevailing challenges, and an analysis of publicly available statistics on current Security Operations Center (SOC) challenges. This approach is not the only way to address the problem; its implementation can vary with individual experience and take diverse forms and structures.
From an Analyst’s Shoes
Forbes featured a survey (links are provided at the end of this post) conducted by Panther Labs on the state of SIEM, presenting statistics on the challenges encountered in Security Operations Centers (SOCs) with legacy SIEM systems. The 2021 and 2022 reports shed light on the main challenges faced by analysts, which are outlined below for quick reference.
Credits : Panther Labs Survey 2021
Credits : Panther Labs Survey 2022
After examining the survey results and drawing insights from my interactions with analysts in my network, it is evident that the primary challenges highlighted by analysts include high false positives, an overwhelming alert volume, and insufficient context. As frontline responders in any SOC, analysts should be provided with high-quality alerts enriched with significant security context. This ensures their ability to swiftly identify malicious activities, eliminating the need for prolonged, fruitless searches.
Alright, now that we have insights into what needs fixing, let’s explore where and by whom this problem can be addressed. Before that, let’s discuss the origin of the alerts. You may have already guessed it. Hooray! It’s none other than DATA.
The Source
Before delving further into the discussion about alert tuning, let’s focus on data. In general, data manifests in three primary forms, categorized as logs, metrics, and traces (a combination of events). A visual representation of these forms is provided in the image below. Credits for this simplified explanation go to Cribl, an observability data pipeline product company that will be discussed later in this post.
Credits : Cribl
So, what does the future hold for data? How might it grow as the number of internet users increases day by day? A brief exploration of this subject led me to the Arcserve Data Attack Surface Report, which offers insights into the potential volume of data we may encounter by 2030. A visual depiction of data growth from Arcserve is presented below for quick reference.
Credits : Arcserve Data Attack Surface Report
Things are getting rather unsettling now, aren’t they? We face significant challenges as analysts strive to identify true positives amidst hundreds of false positives. The projection of data growth underscores the necessity for greater control over alert quality and context enhancement. So, whom can you depend on for this, and where should you begin?
Perhaps you find yourself a little confused at this point.
Search for Detection Engineers within your office; they are the individuals entrusted with shouldering this challenge. Don’t have one? That’s okay. Read through this blog and consider devising a plan to establish one.
From A Detection Engineer’s Kitchen
For those who are new: a Detection Engineering team works at the forefront of cybersecurity and is charged with crafting and refining detection mechanisms. To identify threats, the team designs, implements, and optimizes detection rules and protocols. Its expertise lies in translating threat intelligence into actionable defense strategies, so that organizations stay resilient in the face of evolving cyber threats.
Let’s break down the Detection Engineering process by incorporating the SANS Detection Engineering Lifecycle.
Credits : SANS
After a closer look at the framework, I’ve divided the lifecycle into three primary segments: Intelligence, Operations, and Data Pipeline. A visual representation of this breakdown is provided below for your reference.
In the diagram above, the data pipeline segment carries a large share of the work: by my breakdown, it accounts for about 50% of what it takes to run a detection engineering lifecycle efficiently.
Yet, statistical insights from Panther Labs underscore that current SIEM capabilities fall short in helping Detection Engineers and SOCs carry out their daily tasks. Key challenges include the “complexity of adding new data”, “intricate solution structures”, “a deficit in features and functionality”, “limited product usability”, “a lack of customization options”, and “cost concerns” (covered later in this post).
From the survey (links at the bottom of this post):
Q: When it comes to your SIEM plans for the upcoming 12 - 24 months, what is most accurate?
Answers:
1. We are happy with our current vendor
2. We are unhappy with our current vendor and evaluating
3. We are unhappy with our current vendor and haven’t started evaluating.
Credits : Panther Labs Survey 2022 - Happy with existing SIEM
Credits : Panther Labs Survey 2022 - Unhappy with existing SIEM
Credits : Panther Labs Survey 2021
This leads us to consider an independent data layer that sits in front of the final destination tool, such as a SIEM. With a distinct data pipeline, data management becomes simpler and enrichment happens before the data lands, so detection engineers can build use cases on high-quality data.
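To make the idea of an independent data layer more concrete, here is a minimal Python sketch of an enrichment step that runs before events reach the SIEM. It is only an illustration: the `ASSET_CONTEXT` table, the field names (`src_ip`, `asset_owner`, `asset_criticality`), and the `forward` helper are all assumptions of mine, not part of any product’s API.

```python
# Hypothetical asset inventory; in practice this might come from a CMDB export
# or a lookup file maintained by the detection engineering team.
ASSET_CONTEXT = {
    "10.0.0.5": {"owner": "finance", "criticality": "high"},
    "10.0.0.9": {"owner": "lab", "criticality": "low"},
}

UNKNOWN = {"owner": "unknown", "criticality": "unknown"}


def enrich(event: dict) -> dict:
    """Return a copy of the event with asset context attached."""
    context = ASSET_CONTEXT.get(event.get("src_ip"), UNKNOWN)
    enriched = dict(event)
    for key, value in context.items():
        enriched[f"asset_{key}"] = value
    return enriched


def forward(events, destination: list) -> None:
    """Enrich every event, then hand it to the destination (a list stands in for the SIEM)."""
    for event in events:
        destination.append(enrich(event))


siem = []
forward([{"src_ip": "10.0.0.5", "action": "login_failed"}], siem)
print(siem[0])
```

The point of the sketch is the placement, not the lookup itself: because the context is added in the pipeline, every downstream tool receives the same enriched event, and an analyst sees the asset owner and criticality on the alert without a separate search.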
KEY TAKEAWAY @ 3/4
Key insights gathered so far reveal a concerning trend.
- Analysts are grappling with alert burnout due to elevated false positive rates.
- The continuous surge in data volume, predicted to reach 200 Zettabytes by 2025, exacerbates the challenge.
- The escalating data inflow inevitably triggers more alerts and false positives, making it nearly impossible for analysts to discern true positives.
- Seeking precision, we turn to the detection engineering team; however, they too face challenges, contending with rigid products and limited customization features.
- This has led us to advocate for a separate data layer, decoupled from SIEM and analytic tools, to elevate the quality of detection engineering.
Observability
As per Cribl : An observability data pipeline gives you the flexibility to collect, reduce, enrich, normalize, and route data from any source to any destination within your existing data infrastructure.
As per Dynatrace : In IT and cloud computing, observability is the ability to measure a system’s current state based on the data it generates, such as logs, metrics, and traces. Observability has become more critical in recent years as cloud-native environments have gotten more complex, and the potential root causes for a failure or anomaly have become more difficult to pinpoint.
From the preceding explanations, you may now possess a comprehensive understanding of the observability data pipeline. Let’s delve into Cribl as our observability data pipeline platform of choice. A visual representation of Cribl is provided in a high-level diagram for your reference.
Credits : Cribl
We’ve selected three significant challenges from the previous section, and we will explore how an observability data pipeline like Cribl can assist in overcoming these issues.
PROBLEM #1 - COMPLEXITY OF ADDING NEW DATA
Incorporating data sources into Cribl is remarkably straightforward. It offers native log collection and extensive integrations for cloud-native applications. Links for your reference:
Cribl Sources
Cribl Destination
PROBLEM #2 - A DEFICIT IN FEATURES AND FUNCTIONALITY
The functionality of applying various operations on incoming data within Cribl is impressive. This encompasses reduction, routing, filtering, enrichment, and aggregation, providing data engineers with a seamless experience in slicing and dicing data before it reaches its destination.
A few words from Cribl :
The Cribl Pack Dispensary is the go-to place to find, install and share Cribl Packs. What are Packs? A Cribl Pack is a collection of pre-built routes, pipelines, data samples, and knowledge objects. Packs enable sharing of best-practice configurations that route, shape, reduce and enrich a given log source–Palo Alto Networks logs for example. Packs can be used with Cribl Stream and Cribl Edge.
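The operations named in this section (filtering, routing, and aggregation) can be sketched in a few lines of Python. This is a simplified illustration of the concepts under my own assumptions, not Cribl’s implementation: the destination names (`siem`, `archive`) and the event fields (`severity`, `category`, `host`) are hypothetical.

```python
from collections import Counter


def route(event: dict):
    """Pick a destination name for an event, or None to drop it."""
    if event.get("severity") == "debug":
        return None        # filtering: debug noise never leaves the pipeline
    if event.get("category") == "security":
        return "siem"      # routing: security-relevant events go to the SIEM
    return "archive"       # routing: everything else goes to cheaper storage


def run(events):
    """Route events and build a per-host summary of what went to the archive."""
    destinations = {"siem": [], "archive": []}
    dropped = 0
    for event in events:
        target = route(event)
        if target is None:
            dropped += 1
        else:
            destinations[target].append(event)
    # aggregation: a per-host count summarizes the archived events
    summary = Counter(e["host"] for e in destinations["archive"])
    return destinations, summary, dropped


sample = [
    {"host": "fw1", "severity": "debug", "category": "network"},
    {"host": "fw1", "severity": "info", "category": "security"},
    {"host": "web1", "severity": "info", "category": "app"},
    {"host": "web1", "severity": "warn", "category": "app"},
]
destinations, summary, dropped = run(sample)
print(len(destinations["siem"]), dict(summary), dropped)
```

Keeping the routing decision in one small function is a deliberate choice in this sketch: the rules stay easy to review, and changing where a log source lands does not touch the collection or delivery code.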
PROBLEM #3 - COST CONCERN (WITH AN EXAMPLE)
The graph below comes from my home lab (24-hour span), where my pfSense firewall at the home perimeter sends logs to Cribl via syslog. Using the parser, eval, and suppress functions (for testing purposes; in a distributed environment, a Redis lookup table is required), I’ve filtered out the logs that are less valuable for security analytics. The graph shows the significant reduction in both data volume and events per second (EPS) achieved in real time as the data flows through Cribl. These numbers are just from my lab; imagine what they could look like in your production environment.
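For readers who want a feel for what a parse, eval, and suppress chain does, here is a rough Python sketch. It imitates the idea only and does not use Cribl’s functions or configuration: the `key=value` log lines are a simplified, made-up format (not the real pfSense log format), and the names `suppress_key`, `allow`, and `window` are my own. State is held in process memory, which is why a shared store would be needed once several workers process the same stream.

```python
def parse(line: str) -> dict:
    """Parser step: split 'key=value' pairs into a dict."""
    return dict(pair.split("=", 1) for pair in line.split())


def evaluate(event: dict) -> dict:
    """Eval step: drop a field not needed for detection and add a derived key."""
    out = {k: v for k, v in event.items() if k != "ttl"}
    out["suppress_key"] = f'{out["src"]}:{out["dst"]}:{out["action"]}'
    return out


class Suppressor:
    """Suppress step: keep at most `allow` events per key in each `window` seconds."""

    def __init__(self, allow: int = 1, window: int = 30):
        self.allow = allow
        self.window = window
        self._seen = {}  # key -> (window_start, count); in-memory only

    def keep(self, event: dict) -> bool:
        ts = int(event["ts"])
        start, count = self._seen.get(event["suppress_key"], (ts, 0))
        if ts - start >= self.window:   # window expired, start a new one
            start, count = ts, 0
        count += 1
        self._seen[event["suppress_key"]] = (start, count)
        return count <= self.allow


def reduce_stream(lines, allow: int = 1, window: int = 30):
    """Run parse -> eval -> suppress and return the events that survive."""
    suppressor = Suppressor(allow, window)
    return [e for e in map(evaluate, map(parse, lines)) if suppressor.keep(e)]


lines = [
    "ts=0 src=10.0.0.5 dst=8.8.8.8 action=block ttl=64",
    "ts=5 src=10.0.0.5 dst=8.8.8.8 action=block ttl=64",
    "ts=10 src=10.0.0.5 dst=8.8.8.8 action=block ttl=64",
    "ts=12 src=10.0.0.7 dst=1.1.1.1 action=pass ttl=64",
    "ts=40 src=10.0.0.5 dst=8.8.8.8 action=block ttl=64",
]
kept = reduce_stream(lines)
print(f"kept {len(kept)} of {len(lines)} events")
```

In this toy input, repeated blocks between the same pair of addresses collapse to one event per 30-second window, which is the same kind of reduction in event count that the graph shows on real traffic.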
Conclusion
- Based on the analysis presented in this post, it is evident that analyst burnout is a serious concern, underscoring the need to implement an optimal detection engineering process to alleviate such challenges.
- Providing detection engineers with versatile and scalable tools is essential. These tools empower the team to choose the appropriate data for threat detection and maintain control over the ingestion pipeline to perform operations that enrich contextual information. When it comes to acting as a sword and shield for your detection engineers, there is no better choice than an Observability data pipeline, such as Cribl, to execute tasks with higher precision.
- Leveraging frameworks like CTID’s Sensor Mappings to ATT&CK (SMAP) enhances the mindset of your detection engineers when it comes to selecting the appropriate data for analytics. Additionally, sources such as Sigma HQ and MITRE Cyber Analytics Repository (CAR) contribute to a vendor-agnostic approach in deploying use cases.
- Relying on legacy systems that struggle to keep pace with the latest trends poses the greatest risk to an organization. Organizations embracing a more vendor-agnostic approach have a higher likelihood of operational success compared to those that do not.
That said, I conclude with a knowledge graph and an equation as the outcome of this post.
OBSERVABILITY PIPELINE = ADDRESSING ISSUES AT THE ORIGIN
I welcome discussions and feedback on this topic. Feel free to reach me via email “kaviarasan one one nine five at gmail dot com” or connect with me on LinkedIn.
For More Details
Cribl
Sigma
Forbes
Arcserve Report
Panther Report 2021
Panther Report 2022
CTID SMAP Framework
Pollfish Polling Method
SANS Detection Engineering
Orca Report on Alert Fatigue
Cloud Transformation Report by Splunk