Overcoming Common Causes for SIEM Solution Deployment Failures
Implementing SIEM solutions continues to be fraught with difficulties: failed and stalled deployments are common, as are solutions that still do not meet their goals a year or more after deployment. Security and risk management leaders can avoid the six most common SIEM failures by following these best practices.
- Avoiding stalled or failed SIEM solution projects requires careful planning where there is a clear understanding of the scope, objectives and associated use cases. Many security organizations underestimate the amount of planning required before purchasing, implementing and operating a SIEM solution, and hit a hard stop once this becomes clear.
- Many security organizations do not realize that enabling security logs can materially increase resource utilization (CPU and I/O) on the monitored server.
- The necessary resources for effective implementation and operations are routinely underestimated, and SIEM solutions are often purchased without assessing the feasibility of running them in-house.
To avoid the common causes for SIEM deployment failures, security and risk management leaders focused on security monitoring and operations should:
- Define clear goals, use cases and requirements including plans for how to run, administer and use the SIEM solution before selecting a SIEM solution.
- Develop an initial six- to 12-month roadmap encompassing the deployment of the SIEM solution and the phased implementation of five to seven use cases.
- Follow an output-driven model when planning SIEM tool acquisition, implementation and expansion, and use this to drive the selection of specific data sources needed for use cases.
- Start with a CLM solution, if resources are limited, to gain experience with the intricacies of implementing a SIEM tool, then gradually
- Use a co-managed SIEM service provider to rescue failed SIEM tool deployments, and to address resource or expertise constraints for new implementations.
Due to their complexity and demanding requirements, security information and event management (SIEM) solution projects often do not live up to leaders' (and users') expectations, and failed or abandoned deployments are not uncommon. Moreover, Gartner frequently speaks to clients who are purchasing their second or third SIEM solution after finding that their incumbent solution does not meet their expectations. Lack of planning and an underappreciation of the ongoing budget, operational and infrastructure requirements, as well as underestimating the required resources and skills, are the most common causes for failed SIEM projects.
Even a successful SIEM solution deployment can be an expensive and resource-intensive proposition. Approaching a SIEM project without sufficient planning and without following a formalized process will make it even costlier, with the risk of achieving few, if any, benefits.
The sections below show how to overcome the six most common pitfalls that Gartner encounters as root causes of failed and stalled SIEM solution deployments. Avoiding these pitfalls and following the best practices that accompany them will help ensure an effective and successful deployment.
Undertake Careful Planning to Avoid Stalled or Failed SIEM Deployments
Pitfall No. 1: Failure to Perform Detailed Planning Before Buying
Despite the common perception that SIEM solutions are complex and expensive, many organizations buy one without following best practices, such as first defining goals and requirements, and then evaluating and scoping the project to determine that a given solution will meet all their requirements. The chance of successfully implementing a SIEM project without prior planning is slight; the necessary investment in time, resources and potential additional costs will far outweigh the perceived benefit of moving forward without planning. Gartner commonly encounters organizations where a SIEM solution was acquired and has been quietly gathering dust ever since the initial deployment, and not because the technology was inadequate.
The majority of SIEM solutions provide solid security information management (SIM) and security event management (SEM) capabilities. However, there is wide variation in out-of-the-box, third-party integrations; integrated workflows and ancillary capabilities, such as network flow (NetFlow), network monitoring and endpoint detection and response (EDR). More importantly, without sufficient planning, correct scoping becomes guesswork.
Planning is also important to prove feasibility. In the course of planning, you may come to the conclusion that the inherent requirements of a SIEM solution mean that this is not a realistic option for your organization. SIEM tools are not suitable for every organization. They require expertise and dedicated resources; rely on a sane, well-formed operational environment; and will not compensate for shortcomings in investment, operational execution or skills. If that is the case in your organization, then there are better options available, ranging from managed security services or managed detection and response (MDR) services, to co-managed SIEM, to investing in something else first for lower effort and reduced cost-risk.
Best Practice: Use a Formalized Planning Approach
Gartner recommends the following approach to planning and implementing a SIEM tool:
- Form a core project team whose primary responsibilities include defining the goals, scope and deployment phases of the project, and identifying initial stakeholders.
- Define security event monitoring objectives and the initial scope of deployment.
- Determine the initial use cases that will be covered.
- Define the data collection, retention, reporting and security event monitoring requirements.
- Assess how many and which data sources will be included to determine the required scale of the solution (whether in terms of events per second, back-end storage or processing power), and then check whether you can get access to those data sources.
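As a rough illustration of the sizing step above, the following Python sketch estimates sustained events per second and raw storage from a list of data sources. The source names, per-device event rates, average event size and retention period are all hypothetical placeholders, not Gartner figures; real sizing should use measured rates and account for the chosen product's compression and indexing overhead.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    count: int             # number of devices or hosts of this type
    eps_per_device: float  # assumed average events per second per device


def total_eps(sources):
    """Sum the sustained events per second across all in-scope sources."""
    return sum(s.count * s.eps_per_device for s in sources)


def raw_storage_gb(eps, avg_event_bytes, retention_days):
    """Raw (uncompressed, unindexed) storage for the retention period, in GB."""
    seconds = retention_days * 24 * 60 * 60
    return eps * avg_event_bytes * seconds / 1e9


# Hypothetical inventory; replace with measured values.
sources = [
    DataSource("firewall", count=4, eps_per_device=250.0),
    DataSource("domain_controller", count=2, eps_per_device=100.0),
    DataSource("web_proxy", count=2, eps_per_device=150.0),
]

eps = total_eps(sources)
storage = raw_storage_gb(eps, avg_event_bytes=500, retention_days=90)
print(f"Sustained EPS: {eps:.0f}")
print(f"Raw storage for 90 days: {storage:.0f} GB")
```

Keeping the inventory as data makes it easy to rerun the estimate as sources are added or dropped during scoping.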
Security and risk management leaders should then use this foundation to:
- Conduct an environmental assessment.
- Assess the architectural requirements and collection methods to determine what effort will be required to integrate data sources, whether those data sources support enabling security logs without affecting workload performance, how many collector instances will be needed to accommodate segmented networks or geographically dispersed locations, and whether other organizational groups (such as networking or application support) should be included in the planning.
- Determine the ability of the log sources to generate the expected events. There may be limitations on some of the log sources due to performance, software versions, etc.
- Define security event monitoring and incident response (IR) processes/playbooks. Place special importance on the development of your IR processes and playbooks, as your SIEM solution will reveal the incidents.
- Create the relevant processes and policies in order to assess how many resources will be required to actually run the solution.
- Select the technology (see "Magic Quadrant for Security Information and Event Management" and "Critical Capabilities for Security Information and Event Management," as well as "Toolkit: Security Information and Event Management RFP").
- Deploy the technology (see "How to Deploy SIEM Technology").
Pitfall No. 2: Failure to Define Scope
Attempting to deploy SIEM without a predefined scope can be realistically expressed as: "no scope, no hope." The scope provides the basis for all that will follow: planning, deploying, implementing and maturing the SIEM solution and related capabilities. It will determine the choice of solution, the architectural requirements, the necessary staffing, and the processes and procedures. Deployment without a defined scope and set of use cases is like building a house without a foundation. Not only is the process of building fraught with danger, but also the house will eventually crumble. For example, a SIEM deployed to monitor 10 log sources for Payment Card Industry Data Security Standard (PCI DSS) violations via reports is a very different project scope from a global bank deploying a SIEM solution in a security operations center (SOC) spanning three locations, with a team of 30 people and 10 terabytes of log data a day.
Attempting to define the drivers after implementation will be costly, and could lead to a failed implementation, wasting time and resources better spent on other projects. Technological debt could also be incurred due to the wrong SIEM technology selection. Gartner frequently encounters clients that have had to recover from these failed SIEM deployments.
Deploying SIEM is a marathon, not a sprint. Expect it to take up to a year to have a library of well-executed use cases and internal skills built up that allow you to effectively implement and evolve your own SIEM solution. Gartner advises security and risk management leaders to take a multistage approach that covers not just the initial deployment, but also follow-up stages that continue to evolve additional use cases and ingest other data sources to support these use cases.
Best Practice: Define Scope and Objectives Based on Digital Business Outcomes
SIEM scope and associated objectives will be determined by the drivers for deploying SIEM and will facilitate the creation and design of use cases. Driverless SIEM will never provide a return on investment.
Alongside a variety of less common niche drivers, there are two primary drivers for SIEM: compliance and threat management. Although compliance is still a possible driver and key requirement for SIEM tool implementations, the focus has shifted for some organizations toward threat management (see Figure 1).
Figure 1. SIEM Implementation Drivers
Source: Gartner (May 2017)
Threat Management
Threat management scope can span many different objectives and use cases (for example, detecting brute force and malware attacks, or monitoring third-party or privileged users for any kind of administrative policy changes). Depending on the specific objective, the scope will typically include network security devices, such as firewalls, intrusion detection and prevention systems (IDPS), secure web gateways (SWGs), web application firewalls (WAFs), and security applications like vulnerability assessment and data loss prevention (DLP), as well as various security, audit and application logs from systems or database servers.
Examples for threat management use cases include real-time monitoring for external network threats, authentication failures, threat intelligence feed fusion, and user activity and behavior profiling and monitoring. These will be accompanied by incident response processes and automated cross-data correlation and alerting, among others.
The addition of threat intelligence has added significantly to a SIEM solution's ability to ingest external threat context and overlay it on existing data. It is highly recommended that all SIEM deployments, at a minimum, enable the vendor-provided threat intelligence add-on or feed. In addition, you should look to use other sources, often via open standards such as STIX/TAXII. For more advanced use cases, a threat intelligence platform (TIP) can be used to do the collection, curation and deduplication of large amounts of threat intelligence, which can then be deposited into the SIEM tool via native integrations.
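The overlay of threat intelligence on existing event data can be pictured with a small Python sketch. It assumes the indicators have already been pulled from a feed (for example, via STIX/TAXII) and reduced to a plain set of IP addresses. The indicator values, event fields and function name are hypothetical, and no real SIEM or TAXII client API is shown.

```python
def enrich_with_threat_intel(events, bad_ips):
    """Return events whose destination IP matches a known indicator,
    each tagged with the matching indicator for analyst triage."""
    matches = []
    for event in events:
        if event["dst_ip"] in bad_ips:
            matches.append({**event, "ti_match": event["dst_ip"]})
    return matches


# Hypothetical indicator set (documentation-range addresses).
bad_ips = {"203.0.113.7", "198.51.100.23"}

# Hypothetical normalized firewall events.
events = [
    {"src_ip": "10.0.0.5", "dst_ip": "203.0.113.7", "action": "allow"},
    {"src_ip": "10.0.0.9", "dst_ip": "192.0.2.10", "action": "allow"},
]

hits = enrich_with_threat_intel(events, bad_ips)
for hit in hits:
    print(hit["src_ip"], "->", hit["dst_ip"])
```

In practice this matching is done by the SIEM tool itself; the sketch only shows why curated, deduplicated indicators matter, since every indicator in the set is checked against every in-scope event.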
Compliance
When compliance is a driver, the scope will be determined by the compliance requirements and regulatory mandates, such as PCI DSS Requirement 10 and the Sarbanes-Oxley Act (SOX) of 2002. Also, the General Data Protection Regulation (GDPR) is expected to quickly become another compliance mandate that drives investment in monitoring technologies like SIEM. Typical mandates include log management or user activity monitoring by regular inspection of reports and monitoring for policy violations. The main objective is to comply with regulations, which drives security monitoring; however, this does not dictate that the organization should monitor specific infrastructure and applications.
Basic use cases include the collection and retention of log data, as specified by a regulatory mandate, and the occasional generation and review of user activity reports, with advanced use cases ranging from the automated auditing of policy violations to daily log reviews. SIEM solutions that have good coverage of compliance use cases can bundle compliance reports that were created with the express purpose of meeting a given compliance regime. Security and risk management leaders are strongly advised to investigate these options when considering SIEM for compliance use cases.
Niche
There are also a number of niche use cases with their own scope, for example, retail point of sale (POS) monitoring, operational technology (OT) monitoring or honeypot monitoring. A "honeypot" is a deception technique that has been used with SIEM solutions to expand security event monitoring capabilities.
Pitfall No. 3: Overly Optimistic Scoping
To obtain the maximum value out of a SIEM tool, it may seem tempting to do everything with it at once. SIEM solutions can ingest and process large amounts of event data and provide capabilities to manage these accordingly. Critically, however, the only effective way to scale the SIEM tool into a platform for organizationwide security event monitoring and incident management is to do so in stages. Every use case has distinct required data sources and subsequent correlation rules, alerts, reports and dashboards. SIEM use cases should be implemented in staged cycles so that the organization is "chipping away" at constant improvements, rather than taking a "boil the ocean" approach that carries a high risk of not getting adequate value from the SIEM tool.
Attempting to throw everything at the SIEM solution at once and hoping to be able to clean up later is one of the most common causes for stalled SIEM deployments. Sending a SIEM tool too many logs is just as bad as not sending enough. Also, some organizations fall into a "collection first, usage never" trap. Collecting everything takes years, and by that time, they are displeased with their SIEM solution. In many cases, an environmental assessment of that scale is a daunting and costly task that inevitably involves many different organizational units and stakeholders. Architectural and operational changes that may result from the use-case requirements will be complex and difficult to implement across the organization.
Best Practice: Construct an Initial Roadmap of Five to Seven Use Cases
Gartner recommends that security and risk management leaders who are planning a SIEM implementation should identify between five and seven use cases that will be used to construct an initial roadmap. A realistic time frame for this sort of implementation is six to 12 months, depending on use-case complexity, the IT security team's experience, and whether additional technologies or stakeholders are required. Then, further use cases can be added going forward. Once the process has become sufficiently formalized and practiced, many organizations find their ability to implement new use cases becomes more efficient and effective.
It is, however, not uncommon to have a combination of compliance, threat management or niche drivers. In this situation, the use cases should be prioritized according to business needs. Advanced users can also attempt to logically group use cases according to shared requirements, such as needing the same data sources.
Common use cases include authentication tracking and compromised account detection, tracking compromised and infected systems, malware detection by using network connectivity logs to identify outbound connections, and tracking system changes and other administrative actions.
It is also important to note that, depending on the scope, organizations need to verify the availability of the required data sources to validate feasibility of the use cases. With everything else going on within an organization, it is easy to forget to check whether it is actually possible to generate logs from the servers, security appliances, etc. For example, an organization may want to monitor logs from its database server from a security perspective. In order to do this, it would need to enable certain types of auditing logs from the database management system (DBMS), which, in turn, would increase the CPU usage. Some organizations have their database server working close to its capacity, and enabling auditing logs would create performance issues. If that is the case, adding a database firewall to generate the auditing logs will not compromise performance, nor will it require a server upgrade.
If it is the first time implementing a SIEM solution in your organization, do not buy excessive quantities of capacity (e.g., storage and computing power). It is much better to start with a realistic (more conservative) approach, and allow the architecture to grow with the SIEM tool. Growing the architecture includes various components of the entire SIEM system (for example, the back-end storage and correlation/analysis systems). An organization can grow the system by replacing hardware with bigger hardware; however, most deployments grow by becoming more distributed, adding more components or separating functions (such as correlation and storage) into distinct systems. For a detailed description of SIEM expansion, see "Security Information and Event Management Architecture and Operational Process."
Pitfall No. 4: Monitoring Noise
SIEM is not log collection, where the goal is to capture and store all logs from all devices and applications without discrimination. Yet a common mistake is to approach it this way, thinking it will be easy to make sense of all of this data once it is in the SIEM system. The predictable result is that what should be an exercise in reducing noise actually amplifies it and generates more of it. Finding a needle in the haystack does not benefit from increasing the amount of hay.
In addition, beyond out-of-the-box correlation rules, reports and dashboards that cover common basic use cases, SIEM tools have to be configured to look for and recognize the activity or events you may be seeking. It is not a magic bullet: throwing data at it and hoping it will automatically illuminate every security problem in your environment will yield nothing but disappointment and disillusionment. Certain anomaly detection approaches (such as statistical analysis, deviation from baseline and outlier identification) can potentially benefit from larger datasets, but even then the specific type of data is relevant. Despite some basic overlap in functionality and approach, SIEM tools are not big data analytics platforms.
Best Practice: Employ Output-Driven SIEM
To facilitate the targeted and focused collection and analysis of only relevant events and data, SIEM should be output-driven, which means that event data is captured only if it is required for a predetermined output or result. Log and event sources must be admitted based only on specific use cases, correlation rules, reports and dashboards. For example, for the typical use case of monitoring for suspicious outbound connectivity and data transfers, firewall and web proxy logs and network flow data are required. The use case would not, however, benefit from the inclusion of web application access or DHCP logs.
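The output-driven principle can be sketched in a few lines of Python: the set of admitted log sources is derived from the use cases, never the other way around. The outbound connectivity entry mirrors the example in the text; the second use case, its source names and the event layout are hypothetical illustrations.

```python
# Hypothetical use-case catalog: each use case lists the log sources it needs.
USE_CASES = {
    "suspicious_outbound_connectivity": {"firewall", "web_proxy", "netflow"},
    "authentication_tracking": {"domain_controller", "vpn"},
}


def required_sources(use_cases):
    """Union of the log sources needed by the selected use cases."""
    needed = set()
    for sources in use_cases.values():
        needed |= sources
    return needed


def should_forward(event, allowed_sources):
    """Forward an event to the SIEM only if its source backs a use case."""
    return event["source"] in allowed_sources


allowed = required_sources(USE_CASES)

events = [
    {"source": "firewall", "msg": "outbound connection allowed"},
    {"source": "dhcp", "msg": "lease granted"},
    {"source": "web_proxy", "msg": "large upload"},
]
forwarded = [e for e in events if should_forward(e, allowed)]
print([e["source"] for e in forwarded])
```

Here the DHCP event is dropped because no selected use case requires it; adding a new source means first adding the use case that justifies it.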
Gartner recommends that a central log management (CLM) solution should be implemented first in front of the SIEM solution. CLM is a critical IT capability that has value for all parts of IT operations, including security. If an organization has CLM, this will allow the controlled selective filtering and forwarding of relevant and in-scope event data to the SIEM tool, permitting an output-driven approach. In addition, CLM allows for later forensic usage or to fulfill nonsecurity requirements, which can be done on the log collection and management tier without polluting the SIEM tool. Another advantage to having a log collection and management tier is that it significantly reduces the time to search a collection of logs from days or weeks to minutes or hours should an organization need to perform this task during an incident response activity. Without it, the task could become a multiweek effort, as logs are manually reviewed by administrators and analysts, and manual correlation of events is attempted. During this time, attackers might still be operating inside the IT environment of the organization and/or could be long gone with the stolen critical information.
Output-driven SIEM requires careful forward planning, know-how around real-world threats (or regulatory requirements and controls), and expertise around the processes and technologies to be monitored. Often this requires the involvement of stakeholders outside of the security organization. However, a case can be made that these are general requirements for an effective SIEM deployment.
This investment in planning will yield a far more effective SIEM deployment compared to doing this haphazardly. Output-driven SIEM has several benefits and advantages beyond preventing the SIEM tool from being inundated with useless data.
More-focused use cases with selective data ingestion reduce the number of personnel required to watch and manage the SIEM tool and also minimize the required data collection. This allows for more cost-effective scaling of the SIEM solution. If you do not know what you are looking for, SIEM will not provide a lot of additional value, especially given the potential costs involved.
On the other hand, user and entity behavior analytics (UEBA) tools, whether integrated with the SIEM solution or provided natively by the SIEM vendor, can pinpoint threats and improve threat detection capabilities across multiple monitoring systems or other information sources that feed into their platforms. Since UEBA products typically need a data source, and SIEM products are commonly the central aggregation point for security logs for an organization, the tools complement each other. It can almost be a two-way street in the sense that the SIEM forwards applicable log events to the UEBA for user profiling, while the UEBA tool generates alerts that can be sent into the SIEM tool for enrichment with events from other sources, in turn presenting the alert to a security analyst for triage. Also, a SIEM solution in place with the necessary data makes the UEBA deployment far easier, as data collection becomes straightforward. However, Gartner has recently seen some of the UEBA vendors building their own SIEM solutions and competing in that market as well.
Pitfall No. 5: Lack of Sufficient Context
Monitoring an intrusion detection and prevention system (IDPS) via SIEM will add some powerful capabilities: It will correlate the operating system, user logins and network telemetry from the IDPS to significantly enhance behavior monitoring and incident response activities. However, without further context and data sources for correlation, such as access to the application server logs to verify if the attack seen on the IDPS has been successful, the SIEM will not be utilized to its full potential. SIEM does not see anything that you do not provide.
Best Practice: Follow a Formal Use-Case Implementation Process
Generally, these pitfalls can be avoided by following a formal process for use-case implementation that includes the following (see "How to Develop and Maintain Security Monitoring Use Cases" for more detailed information on security monitoring use-case implementation):
- Use-case selection: Determine and select the initial use case(s), which identify what you are trying to monitor or achieve.
- Data collection needed: Identify the scope of required data.
- Log source configuration needed: Configure necessary log and data sources to send the data to, or fetch it from, the SIEM system.
- SIEM content creation, preparation and selection: Construct the SIEM content (correlation rules, reports, dashboards, etc.).
- Definition of operational processes required: Define the operational processes to manage this specific use case if required.
- Test the use case: Generate the anticipated behavior or activity, and verify that everything is configured correctly (see Note 1).
- Refine the content and processes loop: Remediate and refine any issues and SIEM content.
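The content creation and testing steps above can be sketched as follows. This is a minimal Python illustration of a correlation rule for repeated authentication failures, together with a test that generates the anticipated behavior and checks that the rule fires. The threshold, time window, event fields and function names are assumptions chosen for the example, not the rule syntax of any particular SIEM product.

```python
from collections import defaultdict


def detect_bruteforce(events, threshold=5, window_seconds=60):
    """Return users with at least `threshold` failed logins inside any
    sliding window of `window_seconds`."""
    failures = defaultdict(list)
    for event in events:
        if event["outcome"] == "failure":
            failures[event["user"]].append(event["ts"])

    flagged = set()
    for user, times in failures.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Shrink the window until it spans at most window_seconds.
            while times[end] - times[start] > window_seconds:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(user)
                break
    return flagged


def synthetic_failures(user, count, start_ts=0, spacing=5):
    """Generate `count` failed-login events for `user`, `spacing` seconds apart."""
    return [
        {"user": user, "outcome": "failure", "ts": start_ts + i * spacing}
        for i in range(count)
    ]


# Test step: generate the anticipated behavior and verify the rule fires.
test_events = synthetic_failures("alice", count=6)
test_events += synthetic_failures("bob", count=2)
test_events.append({"user": "carol", "outcome": "success", "ts": 3})

flagged = detect_bruteforce(test_events)
print(sorted(flagged))
```

Running the same rule against slow, widely spaced failures is a useful negative test during the refinement step, since it confirms that the threshold and window do not flag ordinary mistyped passwords.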
Pitfall No. 6: Insufficient Resources
A successful SIEM deployment requires skilled people. Once you begin looking, you will inevitably find things; and these findings will require a response. Additionally, a SIEM solution does not run itself. At a bare minimum, it requires ongoing tuning and maintenance to reflect changes in the environment, "threatscapes," compliance mandates or the gathered data itself. There are three main duties associated with the operation of SIEM:
- Run: This entails managing and maintaining the underlying infrastructure for the SIEM, ensuring that patches have been applied, that there is sufficient storage, or that users are added or deleted. Typically, this is an engineering task, especially when following segregation of duties per ITIL, for example.
- Watch: This encompasses real-time event monitoring of alerts and events, and responding, investigating and escalating incidents. A typical role title would be security analyst.
- Tune: This aspect focuses on the ongoing optimization and tuning of correlation rules, reports and signatures, and even processes involved with the watch duties described above. Often this is done by a senior security analyst or third line.
Most SIEM deployments need all of the above duties performed on a continual basis, and each requires time and a different skill set (see Figure 2).
Figure 2. SIEM Duties
From "Security Information and Event Management Architecture and Operational Processes"
Source: Gartner (May 2017)
While some airplane mechanics may be able to fly a plane, you would probably prefer a fighter pilot to fly it into combat if the need ever arose. The same is the case with SIEM. The maintenance and administration of the architecture remains under the engineering umbrella and will feel familiar to anyone managing other types of systems or network infrastructure. However, using SIEM (making sense of the data that is coming in, analyzing and responding to incidents, and constructing use cases and correlation rules) requires a security analyst skill set. Many organizations state that there are too many unfilled security-related positions, and not enough security analysts, within their organizations; at the same time, the salaries for these security professionals have increased substantially.
Not only must there be sufficient knowledgeable staff to manage and maintain the SIEM, but they must also be prepared for the additional work that results from the SIEM. Incidents must be investigated and issues remediated, and these tasks are seldom the responsibility of the security organization; they require other departments and teams.
For a typical midsize bank, a minimum staff of eight to 10 is required to run a dedicated 24/7 security event monitoring operation, with two analysts per shift (not necessarily dedicated full-time equivalents) working three days and having four days off, reversing the week after. In total, there are four 12-hour shifts per week, without taking into consideration vacation, sickness or staff turnover. In addition, this does not include any managers, and it also does not account for how much work is actually there. Rather, it is the minimum to allow real-time, 24/7 monitoring.
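The arithmetic behind the lower end of that staffing range can be sketched in Python. The calculation reads "three days on, four days off, reversing the week after" as an average of 3.5 twelve-hour shifts per analyst per week; that reading, and the function and parameter names, are assumptions made for illustration.

```python
import math


def minimum_analysts(analysts_per_shift, shift_hours, avg_shifts_per_analyst_per_week):
    """Minimum headcount for 24/7 coverage, ignoring leave and turnover."""
    shifts_per_week = (24 * 7) / shift_hours
    analyst_shifts_needed = shifts_per_week * analysts_per_shift
    return math.ceil(analyst_shifts_needed / avg_shifts_per_analyst_per_week)


# Three days on one week and four the next averages 3.5 shifts per week.
print(minimum_analysts(analysts_per_shift=2, shift_hours=12,
                       avg_shifts_per_analyst_per_week=3.5))
```

With two analysts on each of the 14 weekly 12-hour shifts, this yields eight analysts, which matches the low end of the range; vacation, sickness and turnover push the practical figure toward 10.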
Best Practice: Limit the Scope and Engage an External Service Provider
SIEM is not a deploy-and-forget technology. Instead, SIEM is a force multiplier: it will allow certain tasks and use cases to be done more efficiently and effectively, but it will not run by itself. There are, however, a few strategies that can be employed to operate SIEM with a low staff contingent:
- Limit the scope: Adapting the scale of what is being monitored to align with the available resources is a viable option. This applies to the number of monitored data sources, as well as the scope of the use cases. Use cases should be concise and focused so they can be automated via correlation rules as much as possible, and there should be a realistic estimate of how many use cases in total can be managed by the available staff.
Successfully implementing at least limited use cases can provide risk reduction where it is most needed and can also be used to build a business case for expanding the monitoring scope.
- Engage an external security service provider: Consider an external service provider if a lack of internal resources and expertise leaves your organization facing one or more of the following concerns: managing a SIEM deployment, performing real-time alert monitoring and expanding the deployment to include new use cases. Gartner sees an interest in services to support existing or planned SIEM deployments.
Co-managed SIEM service providers remotely manage, operate and use the customer's own SIEM solution. Basic services include management, configuration, rule- or report-writing, and tuning of the SIEM tool. They also offer more-comprehensive capabilities, such as 24/7 security event monitoring and alerting, as well as investigation of security incidents detected via the SIEM tool.
Managed security service providers (MSSPs) offer real-time monitoring and analysis of security events, and provide log collection via their own SIEM solution for reporting and investigation purposes.
MDR service providers are an alternative to deploying a SIEM solution, and do not fit the traditional managed security service model. These services typically monitor specific elements of the customer environment and use advanced analytics techniques to identify threats. MDR providers focus primarily on threat detection use cases, and rarely on compliance use cases. MDR vendors may deploy their own technology in the customer's environment and deliver it as a service. There is no need for the customer to purchase a commercial SIEM, as the security functions are delivered via shared services from the MDR service provider's remote SOC.
Another viable option is on-site staff augmentation (see Note 2), which is also offered by many providers; however, it can be prohibitively expensive, with prices ranging from $150,000 to $350,000 per year per analyst. There are also security consulting services available, but their prices depend on the services provided. For instance, general consulting can range from $170 to $200 per hour, increasing for incident response or digital forensics to between $300 and $450 per hour [1].
[1] Based on pricing reviewed by Gartner during client inquiries.
Note 1
While testing may seem daunting to some, without it, it is impossible to know whether what you intend to monitor will actually be feasible. Similarly, simulated attacks should be used to verify that sufficient forensic data is being gathered to allow effective incident or breach investigation, and especially that it is sufficient to be used in legal proceedings.
External penetration tests provide an ideal test bed to ensure that attacks are detectable with your current deployment. This has to be verified before a breach occurs; afterward is too late.
Note 2
Staff augmentation is the use of technically skilled personnel, usually employees of the staff augmentation provider, needed by the client organization on a contingent assignment basis to perform certain responsibilities. Due to resource restrictions, such as lacking skilled personnel, an organization can utilize staff augmentation to help with implementing SIEM. Some functions for augmented staff can include, but are not limited to, compliance expertise, performance tuning, deployment and threat analysis.