Analysts to Discuss the Impact of Big Data at the Gartner Business Intelligence & Analytics Summit 2014, October 21-22, Munich, Germany
With so much hype about big data, it's hard for IT leaders to know how to exploit its potential. Gartner, Inc. dispels five myths to help IT leaders evolve their information infrastructure strategies.
"Big data offers big opportunities, but poses even bigger challenges. Its sheer volume doesn't solve the problems inherent in all data," said Alexander Linden, research director at Gartner. "IT leaders need to cut through the hype and confusion, and base their actions on known facts and business-driven outcomes."
Myth No. 1: Everyone Is Ahead of Us in Adopting Big Data
Interest in big data technologies and services is at a record high, with 73 percent of the organizations Gartner surveyed in 2014 investing or planning to invest in them. But most organizations are still in the very early stages of adoption: only 13 percent of those surveyed had actually deployed these solutions (see Figure 1).
Figure 1. The Stages of Big Data Adoption, 2013 and 2014
Note: Gartner asked survey respondents "Which of the five stages best describes your organization’s stage of big data adoption?"
In 2014, n = 302. In 2013, n = 720. Source: Gartner (September 2014)
The biggest challenges organizations face are determining how to obtain value from big data and deciding where to start. Many organizations get stuck at the pilot stage because they don't tie the technology to business processes or concrete use cases.
Myth No. 2: We Have So Much Data, We Don't Need to Worry About Every Little Data Flaw
IT leaders believe that the huge volume of data organizations now manage makes individual data quality flaws insignificant, citing the "law of large numbers." In this view, individual flaws don't influence the overall outcome of an analysis because each flaw is only a tiny part of the organization's mass of data.
"In reality, although each individual flaw has a much smaller impact on the whole dataset than it did when there was less data, there are more flaws than before because there is more data," said Ted Friedman, vice president and distinguished analyst at Gartner. "Therefore, the overall impact of poor-quality data on the whole dataset remains the same. In addition, much of the data that organizations use in a big data context comes from outside, or is of unknown structure and origin. This means that the likelihood of data quality issues is even higher than before. So data quality is actually more important in the world of big data."
Myth No. 3: Big Data Technology Will Eliminate the Need for Data Integration
The general view is that big data technology, specifically the potential to process information via a "schema on read" approach, will enable organizations to read the same sources using multiple data models. Many people believe this flexibility will let end users determine how to interpret any data asset on demand and will provide data access tailored to individual users.
In reality, most information users rely significantly on "schema on write" scenarios in which data is described, content is prescribed, and there is agreement about the integrity of data and how it relates to the scenarios.
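The difference between the two approaches can be sketched roughly as follows. This is a minimal illustration, not a description of any particular product: the field names, the `REQUIRED_FIELDS` schema and both "view" functions are hypothetical.

```python
import json

# Hypothetical agreed schema used for the "schema on write" path.
REQUIRED_FIELDS = {"customer_id": int, "amount": float}


def write_with_schema(store, record):
    """Schema on write: reject records that break the agreed schema
    before they are stored, so every reader sees the same structure."""
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), expected_type):
            raise ValueError(f"{field!r} must be of type {expected_type.__name__}")
    store.append(record)


def read_with_schema(raw_lines, interpret):
    """Schema on read: raw text is stored as-is and each reader
    applies its own interpretation at query time."""
    return [interpret(json.loads(line)) for line in raw_lines]


# Raw records stored without validation; the second uses different field names.
RAW_LINES = [
    '{"customer_id": 7, "amount": "19.90"}',
    '{"cust": 8, "amt": 5}',
]


def finance_view(record):
    # This reader only recognizes the "amount" field.
    return float(record.get("amount", 0.0))


def ops_view(record):
    # This reader also accepts the alternative "amt" field.
    return float(record.get("amount", record.get("amt", 0.0)))


if __name__ == "__main__":
    print(sum(read_with_schema(RAW_LINES, finance_view)))
    print(sum(read_with_schema(RAW_LINES, ops_view)))
```

In this sketch the two readers arrive at different totals from the same raw data, which is the kind of disagreement that an agreed, validated schema is meant to prevent.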
Myth No. 4: It's Pointless Using a Data Warehouse for Advanced Analytics
Many information management (IM) leaders consider building a data warehouse to be a time-consuming and pointless exercise when advanced analytics use new types of data beyond the data warehouse.
The reality is that many advanced analytics projects use a data warehouse during the analysis. In other cases, IM leaders must refine new data types that are part of big data to make them suitable for analysis. They have to decide which data is relevant, how to aggregate it, and the level of data quality necessary — and this data refinement can happen in places other than the data warehouse.
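The three refinement decisions named above (relevance, aggregation and required data quality) could look something like the following sketch. The event fields, the `REQUIRED` completeness rule and the per-customer sum are assumptions made for illustration; real projects would choose their own rules.

```python
from collections import defaultdict

# Hypothetical quality rule: both fields must be present for a record to count.
REQUIRED = ("customer", "amount")


def refine(events, relevant_types):
    """Keep relevant, complete events and sum their amounts per customer."""
    totals = defaultdict(float)
    for event in events:
        # Relevance: ignore event types the analysis does not need.
        if event.get("type") not in relevant_types:
            continue
        # Quality: skip records with missing required fields.
        if any(event.get(field) is None for field in REQUIRED):
            continue
        # Aggregation: one total per customer.
        totals[event["customer"]] += float(event["amount"])
    return dict(totals)


EVENTS = [
    {"type": "purchase", "customer": "a", "amount": 10.0},
    {"type": "purchase", "customer": "a", "amount": 5.0},
    {"type": "page_view", "customer": "a", "amount": None},
    {"type": "purchase", "customer": None, "amount": 3.0},
    {"type": "purchase", "customer": "b", "amount": 2.5},
]

if __name__ == "__main__":
    print(refine(EVENTS, relevant_types={"purchase"}))
```

Nothing in this sketch depends on where it runs, which is consistent with the point that such refinement can happen outside the data warehouse as well as inside it.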
Myth No. 5: Data Lakes Will Replace the Data Warehouse
Vendors market data lakes as enterprisewide data management platforms for analyzing disparate sources of data in their native formats.
In reality, it's misleading for vendors to position data lakes as replacements for data warehouses or as critical elements of customers' analytical infrastructure. A data lake's foundational technologies lack the maturity and breadth of the features found in established data warehouse technologies. "Data warehouses already have the capabilities to support a broad variety of users throughout an organization. IM leaders don't have to wait for data lakes to catch up," said Nick Heudecker, research director at Gartner.
Additional information is available in the Gartner reports "Major Myths About Big Data's Impact on Information Infrastructure" and "Major Myths About Big Data's Impact on Analytics," which can be found at http://www.gartner.com/document/2846217 and http://www.gartner.com/document/2846318.
Gartner analysts will take a deeper look at the impact of big data at the Gartner Business Intelligence & Analytics Summit 2014, October 21-22 in Munich, Germany. More information on the Summit can be found at http://www.gartner.com/technology/summits/emea/business-intelligence-germany/agenda.jsp.
Information from the Summit will be shared on Twitter at http://twitter.com/Gartner_inc using #GartnerBI.
Gartner, Inc. (NYSE: IT) is the world's leading research and advisory company. The company helps business leaders across all major functions in every industry and enterprise size with the objective insights they need to make the right decisions. Gartner's comprehensive suite of services delivers strategic advice and proven best practices to help clients succeed in their mission-critical priorities. Gartner is headquartered in Stamford, Connecticut, U.S.A., and has more than 13,000 associates serving clients in 11,000 enterprises in 100 countries. For more information, visit www.gartner.com.