Security Data Lakes and Modern Incident Response

The following article is an edited version of an interview between Omer Singer, head of cyber security strategy for Snowflake, and Steve Towns, deputy chief content officer for e.Republic, Inc.

by Snowflake / November 25, 2020

Cloud environments create new challenges and new opportunities for security incident response teams. In this Government Technology Q&A, Omer Singer, head of cyber security strategy for Snowflake, discusses how security data lakes can improve cyber incident response in state and local government.

Why is rapid and effective incident response becoming more critical for state and local governments?

Identifying and stopping threat actors before they steal sensitive data or impact systems has always been a challenge, but as more infrastructure moves into the cloud, incident response has become even harder. We don’t have some of the protections we had in traditional environments, where critical systems and data were behind a firewall in a data center. That perimeter no longer exists. In addition, today’s high-bandwidth connections enable threat actors to pull out gigabytes or terabytes of data very quickly if they do gain access, which makes effective and efficient incident response increasingly important.

What are some shortcomings of traditional security information and event management (SIEM) systems?

Log data has traditionally lived in SIEM systems. These systems were designed before the cloud revolution, so they don’t take advantage of cloud benefits such as cost-effective storage and scalable compute. Their architecture limits how much data can be collected and how long it can be kept where security teams can actually access it. For example, an organization may not be able to collect all data sets; it may have to turn down the verbosity of high-volume data sets generated in heavily instrumented cloud environments, or it may have to leave log data siloed in individual logs. The SIEM of the past can’t be a single source of truth.

What other incident response challenges do IT and security teams face today?

There was a lot of promise around security orchestration, automation and response (SOAR) technology, where machines do some of the manual work and accelerate incident response. However, automation has been hindered by traditional search-based SIEM technologies. SIEMs make it easy to do a Google-like search for logs, but they also generate a high volume of results. Oftentimes, analysts have to parse through those results or go to additional systems to put together all the data points they need before they can decide how to respond. From an automation standpoint, that’s really held organizations back.

What is a security data lake and how does this approach differ from typical incident response?

The security data lake model is different from SIEM solutions in terms of its architecture and search orientation. First, it implies the use of cloud-native storage, which allows the cost-effective storage of petabytes of data. Second, the storage is separate from compute. Organizations don’t know in advance what security data sets will be most important. They must be able to collect all the data points into a security data lake, and then have sufficient compute power to crunch through those data sets when they need to answer a question. Third, it enables organizations to move beyond just search capabilities to true data analytics where they can write very complex logic to get the answers they need.
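To make the difference between search and analytics concrete, here is a minimal sketch rather than a description of any particular product. It uses Python's built-in sqlite3 module as a stand-in for a cloud data platform, and the table names (auth_logs, network_flows), columns, thresholds and sample rows are all hypothetical. The query joins two data sets to answer one question: which accounts had repeated failed logins and then sent out a large volume of data from the same address?

```python
import sqlite3

def build_lake():
    """Create an in-memory database with two hypothetical log tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE auth_logs (ts INTEGER, username TEXT, src_ip TEXT, outcome TEXT);
        CREATE TABLE network_flows (ts INTEGER, src_ip TEXT, bytes_out INTEGER);
    """)
    # Illustrative sample rows only; not real data.
    conn.executemany("INSERT INTO auth_logs VALUES (?, ?, ?, ?)", [
        (100, "alice", "10.0.0.5", "failure"),
        (110, "alice", "10.0.0.5", "failure"),
        (120, "alice", "10.0.0.5", "failure"),
        (130, "alice", "10.0.0.5", "success"),
        (140, "bob", "10.0.0.9", "success"),
    ])
    conn.executemany("INSERT INTO network_flows VALUES (?, ?, ?)", [
        (200, "10.0.0.5", 5_000_000_000),
        (210, "10.0.0.9", 20_000),
    ])
    return conn

# Accounts with at least 3 failed logins, followed by more than 1 GB of
# outbound traffic from the same source IP. Both thresholds are assumptions.
SUSPICIOUS_SQL = """
    SELECT a.username, a.src_ip, SUM(f.bytes_out) AS total_bytes_out
    FROM (
        SELECT username, src_ip, MAX(ts) AS last_failure
        FROM auth_logs
        WHERE outcome = 'failure'
        GROUP BY username, src_ip
        HAVING COUNT(*) >= 3
    ) AS a
    JOIN network_flows AS f
      ON f.src_ip = a.src_ip AND f.ts > a.last_failure
    GROUP BY a.username, a.src_ip
    HAVING SUM(f.bytes_out) > 1000000000
"""

def find_suspicious(conn):
    """Return (username, src_ip, total_bytes_out) rows that match the logic."""
    return conn.execute(SUSPICIOUS_SQL).fetchall()

if __name__ == "__main__":
    for row in find_suspicious(build_lake()):
        print(row)
```

A keyword search would return each of these log lines separately and leave the analyst to connect them by hand. Expressed as a query, the same logic yields one row per suspicious account, which is easier to hand to an automated response.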

How does a security data lake strengthen security posture?

It allows organizations to eliminate some of the trade-offs they’ve been making. For example, because cloud-native storage is so cost-effective, they no longer have to drop data sets to make room for other data sets they need. They also no longer have to tier data into hot, warm, cold or frozen storage. That frees up the security team to focus on improving security posture. Beyond that, being able to store data for a longer period improves incident response. You can run a single query against years of data or across multiple organizations’ data. Finally, a data-driven analytics approach lets organizations define their security policy as code and then automatically apply it to all the different data sets they’ve collected. With those data sets centralized, the data analytics system can automatically identify gaps in visibility. That enables cybersecurity teams to proactively take steps and be truly prepared for an incident.
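As a rough illustration of security policy as code, the sketch below defines two policies as plain Python objects and applies them to whatever data sets have been collected. Everything in it is hypothetical: the policy names, the data set names (auth_logs, cloud_config), the event fields and the sample records are assumptions made for the example, not part of any real product. A policy whose required data set is missing or empty is reported as a visibility gap instead of being silently skipped.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]

@dataclass(frozen=True)
class Policy:
    name: str                              # human-readable policy name
    source: str                            # data set the policy needs
    is_violation: Callable[[Event], bool]  # rule applied to each event

# Hypothetical policies, written as code so they can be reviewed and versioned.
POLICIES: List[Policy] = [
    Policy("mfa-required-for-admins", "auth_logs",
           lambda e: e.get("role") == "admin" and not e.get("mfa")),
    Policy("no-public-storage-buckets", "cloud_config",
           lambda e: e.get("public") is True),
]

def evaluate(
    policies: List[Policy], lake: Dict[str, List[Event]]
) -> Tuple[List[Tuple[str, Event]], List[str]]:
    """Apply every policy to the collected data.

    Returns (violations, gaps). A gap is a data set that a policy needs
    but that is absent or empty, so the policy cannot be checked at all.
    """
    violations: List[Tuple[str, Event]] = []
    gaps: List[str] = []
    for policy in policies:
        events = lake.get(policy.source)
        if not events:
            gaps.append(policy.source)
            continue
        for event in events:
            if policy.is_violation(event):
                violations.append((policy.name, event))
    return violations, gaps

if __name__ == "__main__":
    # Only auth_logs has been collected; cloud_config is missing.
    lake = {
        "auth_logs": [
            {"user": "carol", "role": "admin", "mfa": False},
            {"user": "dave", "role": "admin", "mfa": True},
        ],
    }
    found, missing = evaluate(POLICIES, lake)
    print("violations:", found)
    print("visibility gaps:", missing)
```

With the data sets centralized, the same loop covers every policy, and the list of gaps tells the team which sources to start collecting before an incident occurs.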
