High-scale threat detection is challenging
When monitoring across networks and endpoints for signs of security threats such as intrusion attempts or data exfiltration, there are many opportunities for things to go wrong.
You need to decide what security data should be monitored and rank the potential data sets according to criticality. You must estimate how much data your current security infrastructure can handle and work those capabilities into your priorities.
You will need to figure out which tasks you can automate. Security teams generally run lean, so you can’t afford to have practitioners spending hours on routine tasks that could be automated. But precisely what can be automated safely?
I experienced many of these challenges firsthand as a security engineer at Airbnb, along with many more, such as implementing version control over changes to our SIEM and working around time-consuming manual workflows.
I could either use the inadequate tooling on the market or build my own solution to these problems. Panther was born from this struggle and takes a different approach: detection is a data problem, and we need to treat it like one. High-scale threat monitoring is a matter of structuring all the data you need and then ingesting it into a tool that can scale as your needs grow.
This article will discuss the top mistakes security teams make while implementing high-scale threat monitoring and how you can avoid them.
The Most Common Mistakes
Mistake 1: Onboarding too much data too quickly
The best onboarding practice is to begin with your highest-value security data from the areas of your environment that would have the highest impact if breached. Start small. Extract quality signals from this critical data, and then move on to the next set. For cost, performance, and manageability reasons, be intentional about which data you collect.
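As a rough illustration of starting with the highest-value data, the sketch below ranks candidate log sources by criticality and selects only those that fit within a daily ingestion budget. The source names, the 1-to-5 criticality scale, the volumes, and the budget are all hypothetical values chosen for the example, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSource:
    name: str
    criticality: int   # hypothetical scale: 1 (low) to 5 (highest impact if breached)
    gb_per_day: float  # estimated daily volume


def plan_onboarding(sources, daily_budget_gb):
    """Greedily pick the most critical sources that fit in the ingestion budget."""
    selected, used = [], 0.0
    # Highest criticality first; among equals, prefer the smaller source.
    for src in sorted(sources, key=lambda s: (-s.criticality, s.gb_per_day)):
        if used + src.gb_per_day <= daily_budget_gb:
            selected.append(src)
            used += src.gb_per_day
    return selected


# Illustrative inventory; names and numbers are assumptions.
SOURCES = [
    LogSource("cloud_audit_logs", 5, 20.0),
    LogSource("identity_provider", 5, 5.0),
    LogSource("vpc_flow_logs", 3, 400.0),
    LogSource("endpoint_telemetry", 4, 120.0),
    LogSource("cdn_access_logs", 2, 300.0),
]

first_wave = plan_onboarding(SOURCES, daily_budget_gb=200.0)
```

With a 200 GB/day budget, the first wave contains the identity provider, cloud audit logs, and endpoint telemetry; the high-volume, lower-criticality sources wait until capacity grows.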
Mistake 2: Not automating repetitive tasks
To operate at scale successfully, you must minimize manual intervention and, therefore, human error. Less manual handling results in fewer mistakes and fewer opportunities for security breaches or insider threats. Use a SOAR platform to take the repetition out of everyday tasks, such as verifying user activity over Slack or enriching alerts that are clearly false positives. Instead of dedicating people to those tasks, use that time to develop better detections, proactively hunt and investigate, and build tighter security controls. Security teams are understaffed and under-resourced, so be smart with your time.
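One way to picture this kind of automation is a small triage function that decides the next step for each alert so that people only see what needs judgment. This is a minimal sketch: the alert fields, the allowlist, and the action names are hypothetical and do not correspond to any particular SOAR product's API.

```python
def triage(alert, known_benign_ips):
    """Return the next action for an alert: 'auto_close', 'ask_user', or 'escalate'."""
    # Traffic from an allowlisted scanner or health check is a known false positive.
    if alert.get("source_ip") in known_benign_ips:
        return "auto_close"
    # Lower-severity alerts tied to a user can be confirmed by that user,
    # for example through a chat message sent by the automation platform.
    if alert.get("severity") in ("low", "medium") and alert.get("user"):
        return "ask_user"
    # Everything else goes to an analyst.
    return "escalate"


# Hypothetical allowlist of internal scanner addresses.
KNOWN_BENIGN_IPS = {"10.0.0.5"}

actions = [
    triage({"source_ip": "10.0.0.5", "severity": "high"}, KNOWN_BENIGN_IPS),
    triage({"source_ip": "203.0.113.7", "severity": "low", "user": "alice"}, KNOWN_BENIGN_IPS),
    triage({"source_ip": "203.0.113.7", "severity": "high", "user": "alice"}, KNOWN_BENIGN_IPS),
]
```

Keeping the decision logic in a plain function like this also makes it easy to test before it is wired into a real workflow.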
Mistake 3: Not testing code
Use unit testing frameworks and tools to ensure that your detections-as-code are accurate and will work under multiple scenarios. The best practice is to test both the positive (alert would fire) and negative (alert would not fire) use cases to build high reliability. Another form of testing is the use of staging environments to try out detections in a “production-like” environment before they reach production. This can identify errors or noisy alerts before they land in ticketing queues or other systems that can be difficult to clean up or remediate.
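The positive and negative cases described above might look like the following with Python's built-in unittest module. The detection itself and the event fields (event_name, result, mfa_used) are a hypothetical schema made up for the example.

```python
import unittest


def rule(event):
    """Fire when a console login succeeds without MFA (hypothetical event schema)."""
    return (
        event.get("event_name") == "ConsoleLogin"
        and event.get("result") == "Success"
        and event.get("mfa_used") is False
    )


class TestConsoleLoginWithoutMfa(unittest.TestCase):
    def test_fires_on_successful_login_without_mfa(self):
        # Positive case: the alert should fire.
        event = {"event_name": "ConsoleLogin", "result": "Success", "mfa_used": False}
        self.assertTrue(rule(event))

    def test_silent_on_login_with_mfa(self):
        # Negative case: MFA was used, so no alert.
        event = {"event_name": "ConsoleLogin", "result": "Success", "mfa_used": True}
        self.assertFalse(rule(event))

    def test_silent_on_failed_login(self):
        # Negative case: the login did not succeed.
        event = {"event_name": "ConsoleLogin", "result": "Failure", "mfa_used": False}
        self.assertFalse(rule(event))

    def test_silent_when_fields_are_missing(self):
        # Negative case: malformed events must not raise or fire.
        self.assertFalse(rule({}))
```

Saved as a test module, this can be run with `python -m unittest`. Note that there are more negative cases than positive ones, since noisy alerts usually come from conditions the author did not think to exclude.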
Mistake 4: Manual deployment and management of detections
Use virtual compute services coupled with CI/CD systems to deploy changes to your SIEM automatically. Avoid manual workflows unless they are specifically for the development and testing of new detections. CI/CD is repeatable and keeps the state of version control and production environments as consistent as possible.
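To make the idea of keeping version control and production consistent concrete, here is a minimal sketch of the comparison a CI/CD job could run before deploying. It assumes, hypothetically, that the pipeline can read each rule's source from the repository and a stored SHA-256 fingerprint for each deployed rule; how those are fetched depends entirely on your SIEM and is not shown.

```python
import hashlib


def fingerprint(rule_source):
    """Stable content hash of a rule's source text."""
    return hashlib.sha256(rule_source.encode("utf-8")).hexdigest()


def plan_deployment(repo_rules, deployed_fingerprints):
    """Compare rules in version control with what is deployed.

    repo_rules: dict of rule_id -> source text (from the repository)
    deployed_fingerprints: dict of rule_id -> sha256 hex digest (from production)
    Returns the rule ids to create, update, and delete.
    """
    repo_fp = {rule_id: fingerprint(src) for rule_id, src in repo_rules.items()}
    return {
        "create": sorted(r for r in repo_fp if r not in deployed_fingerprints),
        "update": sorted(
            r for r in repo_fp
            if r in deployed_fingerprints and repo_fp[r] != deployed_fingerprints[r]
        ),
        "delete": sorted(r for r in deployed_fingerprints if r not in repo_fp),
    }


# Illustrative state: one unchanged rule, one edited, one new, one removed.
repo = {"rule_a": "v1", "rule_b": "v2", "rule_c": "v1"}
deployed = {
    "rule_a": fingerprint("v1"),
    "rule_b": fingerprint("v1"),
    "rule_d": fingerprint("v1"),
}
plan = plan_deployment(repo, deployed)
```

Because the plan is derived only from the two states, running the job twice in a row produces an empty plan the second time, which is the repeatability that manual deployment lacks.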
Mistake 5: Not consolidating simple detections
Instead of writing a library full of simple detections that differ only slightly, consolidate them into a single detection that you can easily test for accuracy and correctness. Writing detections in a programming language makes more complex detections easier to support and maintain, especially as teams grow and the number of collaborators increases.
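For instance, several near-identical "too many failed logins" rules that differ only in role and threshold could collapse into one detection driven by a lookup table. The roles, thresholds, and event fields in this sketch are hypothetical.

```python
# One table replaces several copy-pasted rules; the values are illustrative.
FAILED_LOGIN_THRESHOLDS = {
    "admin": 3,
    "service_account": 1,
    "default": 10,
}


def excessive_failed_logins(event):
    """Fire when failed logins meet or exceed the threshold for the account's role."""
    role = event.get("role", "default")
    # Unknown roles fall back to the default threshold.
    threshold = FAILED_LOGIN_THRESHOLDS.get(role, FAILED_LOGIN_THRESHOLDS["default"])
    return event.get("failed_logins", 0) >= threshold
```

Adding a new role then becomes a one-line change to the table, and a single test suite covers every variant.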
Conclusion
Many companies that need high-scale threat detection end up with security teams building their own solution out of necessity. These homegrown solutions are either built on internal systems and pipelines (an augmented version of their log monitoring tool) or written fully from scratch. The problem is that, even if they have the required skills, these teams cannot spend adequate time building tools because their primary job is to respond to breaches and keep their companies safe. The worst-case scenario is that the team members who built the critical tooling leave the company, resulting in an orphaned solution that rots away without the care it needs.
In addition to avoiding the mistakes discussed here, the obvious solution to this problem is to invest in a next-generation SIEM platform, like Panther, which is purpose-built for high-scale threat monitoring.
About the Author
Jack Naglieri is the Founder and CEO of Panther Labs. His exposure to information security began as an incident responder for Verisign. After graduating from George Mason University, he moved to the San Francisco Bay Area and spent two years at Yahoo as an incident responder. He later transitioned into a security engineering role, with the challenge of deploying security monitoring tools at a massive scale. In 2016, he joined Airbnb and open sourced StreamAlert, a framework that enables real-time data analysis and alerting at scale. He then managed a team of engineers further developing detection and response infrastructure at Airbnb. He has since founded his venture-backed startup, Panther Labs, to help companies detect and prevent security breaches in the cloud-first world.
Featured image: ©Mihail