Secure Web Gateway Malware Detection Techniques

Enhanced security is the key functionality in secure Web gateways, yet few buyers understand the different techniques used and their limitations.

Impacts

  • Ubiquitous Internet access for endpoints is the primary channel for attacks on organizations; consequently, malware filtering is the most important capability of secure Web gateway (SWG) solutions.
  • There are numerous techniques for detecting malware in Web traffic; however, they all have limitations and trade-offs.
  • SWGs are incapable of protecting endpoints unless they are in-path of the endpoint traffic.

Recommendations

  • IT organizations should review Web gateway solutions every two to three years to ensure that they are still up to the task of defending endpoints from modern malware.
  • Buyers of new solutions should pay careful attention to the bidirectional malware inspection capabilities of prospective solutions.

Analysis

Bidirectional malware detection in SWGs is critical as malware continues to exploit Web distribution and control methods, and as endpoint protection struggles to keep up with the volume of threats. Organizations must carefully evaluate the malware detection capabilities of existing and prospective solutions to ensure that they are capable of stopping modern targeted attacks.

Use this guide to understand the limitations of each type of malware detection, and to compare prospective vendors' capabilities using standard terminology. Look for solutions that use multiple techniques, particularly those that use dynamic and static code analysis. Test vendor claims with live traffic whenever possible. Also, be sure to test outbound traffic for signs of infection or malware propagation from inside the network.

Look for forensic information about potential targeted inbound threats (that is, new and low-volume) and internal infections. Look for deployment options that protect all endpoints, regardless of network location — for example, mobile endpoints off LAN and small office/home office (SOHO)/branch offices that do not merit infrastructure deployments. In most cases, this will require a cloud-based solution.

Figure 1. Impacts and Top Recommendations for Evaluating Anti-malware Effectiveness

Source: Gartner (June 2012)

Ubiquitous Internet access for endpoints is the primary channel for attacks on organizations; consequently, malware filtering is the most important capability of SWG solutions

Malware threats continue to become more sophisticated, and the quantity and motivation of attackers continue to multiply. The volume and quality of attack kits have increased, enabling less skilled attackers to lease attack code. These kits produce unique, quality malware at low cost. Two-thirds of all Web-based threat activity is attributable to attack kits.1 This indicates that there is a growing population of hackers that is able to leverage the work of more sophisticated programmers and set up its own franchise hacking business (see Note 1).

At the same time, we are seeing a significant increase in more targeted attacks. Large-scale attacks, such as Hydraq/Aurora in January 2010, Stuxnet in June 2010 and Night Dragon in early 2011, have made the news after months or years of stealth residence in sophisticated organizations. Mandiant reported that the median dwell time is greater than 460 days (in "M-Trends 2012: An Evolving Threat"). It has been reported that the recently discovered Flame malware3 may have been introduced as early as 2007, giving it a possible dwell time of more than five years. Targeted organizations have reported detecting six to 10 advanced targeted attacks per day. So, while the attack industry is busy expanding its capabilities, most organizations have not changed their defenses considerably. Consequently, most large organizations have already been breached. SWGs not only help prevent breaches from inbound traffic; the better ones can also detect existing breaches based on outbound traffic from compromised endpoints.

Recommendations:

  • IT organizations should review Web gateway solutions every two to three years to ensure that they are still up to the task of defending endpoints from modern malware.
  • Buyers of new solutions should pay careful attention to the bidirectional malware inspection capabilities of prospective solutions.

There are numerous techniques for detecting malware in Web traffic; however, they all have limitations and trade-offs

There is no standard terminology to describe the techniques used; thus, it is difficult to compare the solutions. Moreover, organizations must understand the limitations and trade-offs of each approach.

There are essentially 10 different techniques for malware detection:

  • 1 Block lists
  • 2 Signature detection
  • 3 Domain or IP reputation
  • 4 Static code analysis (heuristics)
  • 5 Vulnerability shields
  • 6 Dynamic code analysis (behavioral monitoring)
  • 7 Network traffic analysis
  • 8 Content analysis
  • 9 Custom rules
  • 10 Policy-based controls

Block lists are simply lists of known "malicious" URLs or IP addresses. They are typically created by crawling the Web in search of malware, or by forensic analysis of malware attacks. To be useful, these lists need to be real-time look-ups of a master list to get the latest information, and they need to be purged of sites that have been cleaned up. One major problem is that an increasing percentage of sites require some level of user authentication; thus, they are invisible to Web crawlers. The other major problem is that sites can switch from good to bad in seconds; however, it takes significant time to refresh verdicts using Web crawlers.
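
The look-up and purge behavior described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any vendor's product: the `BlockList` class, its `ttl_seconds` parameter and the in-memory dictionary are hypothetical, and a real deployment would query a continuously updated master list rather than a local table.

```python
import time
from urllib.parse import urlsplit


class BlockList:
    """Hypothetical in-memory block list keyed by host name."""

    def __init__(self, ttl_seconds=3600.0):
        # Assumed policy: a verdict expires unless it is re-confirmed.
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # host -> time the host was last confirmed malicious

    def add(self, host, now=None):
        self._entries[host.lower()] = time.time() if now is None else now

    def is_blocked(self, url, now=None):
        now = time.time() if now is None else now
        host = (urlsplit(url).hostname or "").lower()
        confirmed = self._entries.get(host)
        if confirmed is None:
            return False
        if now - confirmed > self.ttl_seconds:
            # Stale verdict: purge it so a cleaned-up site is not blocked forever.
            del self._entries[host]
            return False
        return True
```

The expiry step reflects the purge requirement; it does nothing for the two problems noted above, since a crawler still cannot see authenticated pages and still needs time to refresh a verdict.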

Signature detection leverages a signature database of known malware to inspect files as they cross the gateway. This provides a fast, accurate method to catch known malware. However, latency is a major issue with signatures, especially for very large files. The other major issue with this approach is that signatures are only 95% effective with known malware, and only about 20% to 50% effective with new variants. Custom malware will likely pass signature scans, and it is easy for hackers to test their code against well-known signatures. Vendors that license malware detection engines will likely have less capability to seamlessly integrate these engines, and they might be several versions behind the most recent engine version.
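
To illustrate why signatures are strong on known malware and weak on new variants, the sketch below uses the simplest possible signature: a SHA-256 hash of the whole file. This is an assumption made for brevity; commercial engines use richer signatures than whole-file hashes. The sample bytes and the `KNOWN_BAD` set are placeholders, not real malware data.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Placeholder signature database built from an invented sample.
KNOWN_BAD = {sha256_hex(b"placeholder-malware-sample")}


def matches_signature(payload: bytes, signatures=KNOWN_BAD) -> bool:
    """True only when the payload is byte-for-byte identical to a known sample."""
    return sha256_hex(payload) in signatures
```

A single changed byte produces a different digest, so a trivially repacked variant passes this check. The whole payload must also be read before a verdict is possible, which is one source of the latency on very large files.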

Block lists and signatures are common, but they are not enough. More advanced solutions should use the real-time in-line analysis described below, which can identify and block unknown threats.

Domain or IP reputation is a slight twist on the block list. Sites with a bad "reputation" are not yet known to be malicious, but based on historical behavior or site characteristics, there is a good chance that they will be malicious in the future. Reputation analysis uses a number of static characteristics of the site to determine a reputation score. Examples of characteristics would include such indicators as the age of the site, the historical malware prevalence of the host or the network, the registration details, the size of the site, the reputation of the sites that link to/from the site and the type of Web server. The accuracy of the reputation score is dependent on the number of parameters used and the algorithm for scoring the site. SWGs should allow some tuning of the reputation score that would convict a website.
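
A reputation score of this kind can be sketched as a weighted sum over static site characteristics with an administrator-tunable conviction threshold. The field names, weights and default threshold below are illustrative assumptions only; they are not drawn from any vendor's algorithm.

```python
from dataclasses import dataclass


@dataclass
class SiteFacts:
    """Hypothetical static characteristics gathered for a site."""
    domain_age_days: int
    prior_malware_on_network: bool
    registration_hidden: bool
    inbound_links_from_bad_sites: int


def reputation_risk(site: SiteFacts) -> int:
    """Return a risk score from 0 (good) to 100 (bad); weights are invented."""
    score = 0
    if site.domain_age_days < 30:
        score += 40  # very young sites are treated as riskier
    if site.prior_malware_on_network:
        score += 30
    if site.registration_hidden:
        score += 10
    score += min(site.inbound_links_from_bad_sites, 4) * 5
    return min(score, 100)


def convict(site: SiteFacts, threshold: int = 60) -> bool:
    """Convict the site when its risk score reaches the tunable threshold."""
    return reputation_risk(site) >= threshold
```

Raising `threshold` trades fewer false positives for more missed sites, which is the tuning decision an SWG should expose.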

Advanced solutions will use static code analysis (or heuristic analysis) to inspect Web code as it crosses the gateway looking for signs of potential infection. Hackers often leave telltale signs of infection by using common techniques and reusing code snippets (for example, JavaScript for a particularly malicious function). Detection of these "tells" can convict a Web page. Even careful hackers can get caught by a rule-based approach to static code analysis that simply prohibits certain types of commands and operations. The challenge with static code analysis is that almost every vendor claims to do some, but it is impossible to verify how extensive their analysis and rule base are. A revealing metric is often the size and quality of the research team working on rules for this engine. Another would be the frequency of updates.
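
A rule-based approach to static code analysis can be sketched as a set of patterns applied to page source without executing it. The three rules below are simplified examples chosen for illustration; the rule names and patterns are assumptions, and a production rule base would be far larger and updated frequently.

```python
import re

# Hypothetical rule base: each rule name maps to a pattern for a common "tell".
RULES = {
    "eval-of-decoded-string": re.compile(r"eval\s*\(\s*(unescape|atob)\s*\(", re.I),
    "hidden-iframe": re.compile(r"<iframe[^>]*(width|height)\s*=\s*[\"']?0", re.I),
    "long-escaped-blob": re.compile(r"(?:%u[0-9a-f]{4}){16,}", re.I),
}


def scan(page: str) -> list:
    """Return the sorted names of all rules that match the page source."""
    return sorted(name for name, pattern in RULES.items() if pattern.search(page))
```

Any non-empty result could convict the page or feed a score into a wider verdict. Simple patterns like these are easy for an attacker to evade with further obfuscation, which is why the depth of the rule base and the frequency of updates matter.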

Vulnerability shielding is a particular type of static code analysis that focuses on detecting exploits aimed at a specific code weakness in unpatched software.

Dynamic code analysis (or behavior analysis) is a method of actually running suspect code in a virtual machine and observing its behavior. If the behavior violates predefined rules, then the suspect code is blocked. This approach can be slow; consequently, current solutions only use this technique on very suspicious code, and they tend to deliver the code to the endpoint before the test is complete. Although dynamic testing is very accurate at detecting malicious code, it cannot determine whether the exploit has succeeded at the endpoint, and can cause false-positive alerts. Dynamic code analysis is often used by malware labs, so attackers often use tricks like long sleep delays and virtual machine detection to thwart this type of analysis. The effectiveness of dynamic detection depends on the similarity of the virtual machine configuration to the deployed endpoints (including virtual desktop infrastructure and mobile devices) and the type of behavior the detection engine is looking for. Malware can then simply emulate good application behavior.

Network traffic analysis can detect the communication of malicious code, such as remote access trojan (RAT) and bot command-and-control traffic, between attackers and their victims. This type of detection indicates that an endpoint is already compromised; as such, this is the most important malware detection technique. To be effective, network traffic monitoring must have visibility across all ports and protocols. However, proxies are often limited to HTTP traffic only. Proxies must have other means to inspect non-HTTP/HTTPS traffic. In-line devices have more visibility into non-HTTP traffic. Secure Sockets Layer (SSL) inspection is critical to detection in encrypted SSL traffic, but this requires the ability to selectively terminate and inspect SSL traffic. We anticipate that traffic analysis will become more difficult as attackers begin to hide instructions in HTTP GET and POST requests, which will be difficult to distinguish from typical Web traffic. Network traffic analysis must have seen and analyzed the threat traffic before it can spot it on the network, so it will be less effective on new or unknown threats. Traffic monitoring can also incorporate anomaly detection, which detects deviations in traffic patterns, such as surges of traffic, changes in destinations or pulses of traffic that are too standardized.
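
One form of anomaly detection mentioned above, spotting pulses of traffic that are too regular, can be sketched by measuring how much the gaps between requests to a single destination vary. The function name, the coefficient-of-variation cutoff of 0.1 and the minimum of five events are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def looks_like_beacon(timestamps, max_cv=0.1, min_events=5):
    """Flag request times (in seconds) whose spacing is suspiciously uniform."""
    if len(timestamps) < min_events:
        return False  # too little evidence to judge
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False
    # Coefficient of variation: low values mean machine-like regularity.
    return pstdev(gaps) / average_gap <= max_cv
```

An attacker can defeat this particular check by adding random jitter to the call-home interval, so it is best treated as one signal among several.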

Content analysis in traffic can reveal the exportation of sensitive information, such as password hashes or documents, which may indicate exfiltration of content; however, the content that this technique is looking for needs to be narrowly defined to reduce the false-positive rate. Content inspection can also detect steganography, where content is hidden inside other content types.
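
The need for narrowly defined content patterns can be illustrated with a sketch that looks only for lines in the common "pwdump" export layout (account name, numeric ID, two 32-character hex hashes) rather than for any hex string. The layout choice, the function names and the `min_hits` threshold are assumptions for this example.

```python
import re

# One line of a pwdump-style export: name:id:32 hex chars:32 hex chars:::
PWDUMP_LINE = re.compile(
    r"^[^:\s]+:\d+:[0-9a-fA-F]{32}:[0-9a-fA-F]{32}:::[ \t]*$", re.M
)


def count_hash_lines(body: str) -> int:
    """Count lines in an outbound body that match the pwdump layout."""
    return len(PWDUMP_LINE.findall(body))


def flag_outbound(body: str, min_hits: int = 2) -> bool:
    """Flag the body only when several hash lines appear together."""
    return count_hash_lines(body) >= min_hits
```

Matching the full line layout, and requiring more than one hit, avoids flagging ordinary traffic that happens to carry a single 32-character hex value such as a checksum.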

Malware research labs will use numerous custom rules and techniques that are effective for particular threats or threat types — for example, injecting custom cookies to detect cross-site scripting. Evaluating the quality and quantity of these rules is very difficult. Here, again, the size and reputation of the malware research organization will be the best indicators.

Policy-based controls are predefined configuration rules that reduce the potential attack surface of a browsing session. Examples of policy-based controls would be restricting traffic to specific browser types (for example, block Web traffic except from Internet Explorer 8 or higher) or applications (for example, block Skype and LogMeIn). The majority of exploits target vulnerabilities in Internet-facing applications, so restricting applications to up-to-date versions would be another policy-based approach.
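
The two example policies above can be sketched as a single allow/deny check. The sketch assumes the gateway sees the browser's User-Agent header and that some other component has already labeled the application; the `app_label` parameter, the `BLOCKED_APPS` set and the function name are hypothetical.

```python
import re

MSIE_VERSION = re.compile(r"MSIE (\d+)\.")
BLOCKED_APPS = {"skype", "logmein"}  # example applications named in the text


def allow_request(user_agent, app_label=None, min_ie_major=8):
    """Allow only Internet Explorer at or above the minimum major version."""
    if app_label is not None and app_label.lower() in BLOCKED_APPS:
        return False
    match = MSIE_VERSION.search(user_agent)
    return match is not None and int(match.group(1)) >= min_ie_major
```

Because the User-Agent header is supplied by the client and can be forged, a check like this reduces the attack surface from legitimate software but is not a defense against code that is already malicious.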

Better solutions will use a multipath scoring mechanism, which intelligently routes traffic to multiple simultaneous verdict engines, each of which results in a score. High scores in any individual engine are likely enough to identify malicious content and exit the routine, but medium scores across a number of verdict engines should also be sufficient to convict content or at least require further analysis or restrictions. These thresholds should be administrator-configurable so that organizations with differing security needs can set their own tolerance thresholds.
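
The scoring logic described above can be sketched as follows. This is a minimal illustration, not any vendor's implementation: the engine names, the 0 to 100 scale, the thresholds and the quorum of three engines are all assumptions, and the engines are represented only by the scores they return.

```python
def verdict(scores, high=90, medium=50, medium_quorum=3):
    """Combine per-engine scores (0 to 100) into block, analyze or allow."""
    # A high score from any single engine is enough to convict.
    if any(score >= high for score in scores.values()):
        return "block"
    # Otherwise count how many engines are at least moderately suspicious.
    medium_count = sum(1 for score in scores.values() if score >= medium)
    if medium_count >= medium_quorum:
        return "block"
    if medium_count > 0:
        return "analyze"  # hold for further analysis or apply restrictions
    return "allow"
```

For example, `verdict({"signature": 10, "reputation": 60, "static": 55, "traffic": 70})` returns `"block"` because three engines report medium scores, even though none is high on its own. Exposing `high`, `medium` and `medium_quorum` as settings corresponds to the administrator-configurable thresholds recommended above.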

Vendor claims of effectiveness should be taken with due skepticism and validated with testing in production traffic (see Note 2). Severity, context and forensic analysis of threats and trending information are features that can significantly enhance the value of an SWG (see Note 3).

Recommendations:
  • Use this guide to understand the limitations of each type of malware detection, and to compare prospective vendors' capabilities. Look for solutions that use multiple techniques, particularly those that use dynamic and static code analysis.
  • Test vendor claims with live traffic whenever possible. Also, be sure to test outbound traffic for signs of infection or malware propagation from inside the network. Look for forensic information about potential targeted inbound threats (that is, new and low-volume) and internal infections.

SWGs are incapable of protecting endpoints unless they are in-path of the endpoint traffic

Network-based SWGs offer an opportunity to protect an increasingly diverse array of endpoint types from Internet-borne malware, and provide an early warning system when breaches occur; however, they are incapable of protecting endpoints unless endpoint traffic is somehow redirected to the SWG.

Most IT organizations focus protection on Windows PCs, but the recent Flashback Trojan illustrated that Macs are also vulnerable to exploits. In the future, we fully expect to see more mobile threats on iOS, Android and Windows 8 tablets and phones. One of the distinct advantages of SWGs is that they are capable of protecting all endpoints in a standard way, without a heavy client footprint. However, SWGs can only do so if they are in-path. Today, fewer than 30% of organizations force mobile clients and smaller branch offices back to an SWG on corporate networks when they are roaming,4 and few (if any) mobile phones on cellular networks are in-path of corporate SWGs; consequently, these endpoints do not benefit from network-level malware protection. Laptops will represent 72% of PC shipments,5 and more than 300 million tablets will ship in 2015.6 Clearly, an increasing percentage of endpoints will be mobile and not tethered to a specific network. Cloud-based SWGs are more capable of inspecting endpoint traffic, regardless of its location, but redirecting traffic from tablets and phones, especially employee-owned ones, is problematic.

Recommendations:
  • Look for deployment options that protect all endpoints, regardless of network location — for example, mobile endpoints off LAN and SOHO/branch offices that do not merit infrastructure deployments. In most cases, this will require a cloud or hybrid-type solution.
  • Employee-owned devices will be difficult to force to any SWG concentration point without some level of user cooperation. Tablets and mobile phones will be difficult to integrate into an SWG solution. Test these platforms and the redirection methods to validate vendor claims.

Evidence

1 Symantec "Internet Security Threat Report," Volume 16, published April 2011

2 From Blue Coat Systems' "2011 Mid-Year Web Security Report"

3 See http://en.wikipedia.org/wiki/Flame_(malware)

4 Gartner SWG reference customer survey of 75 organizations conducted in 1Q12

5 "Market Trends: Worldwide, Desk-Based PCs Are Battling On, 2012"

6 "Market Trends: Low-Cost Application Processors Will Drive the Growth of Media Tablets and Derivatives, Worldwide, 2012"

Source: Gartner RAS Core Research Note G00232493, P. Firstbrook, 20 June 2012

Note 1

Malnets

To get a sense of the organization and the scale of modern malware, consider the growing size of modern malnets. Malnets are networks of compromised hosts (that is, websites) that are under the control of a single attacker. Like a hunter's trap line, they are used to lure victims and distribute malware to unsuspecting endpoints. For the first half of 2011, the Shnakule malnet had an average of 2,000 unique malicious hosts per day available to lure and infect users.2 It is likely that Shnakule had an average of more than 80,000 requests per day, or more than 2.5 million visits per month.

Note 2

Testing SWGs

Some solutions turn off advanced malware detections by default due to impacts on performance. Some vendors selectively deploy more advanced techniques only if other indicators raise the level of suspicion. This is acceptable, but it is important to know what techniques are being used by default and which ones are being used selectively. Ideal solutions allow for in-path or daisy chained proxy-monitor-mode-only deployments, which ease testing. If production testing is not possible, then some organizations use attack kits, such as Metasploit, to create their own unique attacks to test solutions in a lab environment. As an alternative to testing, organizations must consider factors such as the size and depth of expertise of the malware research organization, as well as the amount and age of data that the research organization has access to mine to detect new trends and experiment with new detection techniques. In general, having a reasonable number of skilled technical researchers (for example, 10 to 20) with long tenure is more valuable than having a revolving cast of recent graduates. The percentage of threats that are detected using advanced techniques, as opposed to block lists and signatures, is also a good indicator of more advanced malware detection capabilities. Better SWGs catch more than 90% of threats using proprietary real-time techniques.

Note 3

Malware Information

The ability to advise incident response teams about the nature of the threat, remediation techniques, severity, prevalence, propagation and impact is hugely valuable. Threat information, which indicates compromised devices on the network, is crucial to the role of SWGs acting as a backstop to client-based protection. Trending information can help IT organizations understand and communicate the effectiveness of security controls. Forensics and incident response information are generally weak in current solutions in this market, but are starting to improve.