Published: 23 July 2014
Summary
The data lake concept promises to pool disparate data sources in one central location, and treats their alignment as a technical exercise. Information management leaders should understand the gaps in this concept, such as semantics, governance and security, and take the necessary precautions.
Included in Full Research
- Data lakes focus on storing data from disparate sources and ignore how or why data is used, governed, defined and secured by an organization's information management leaders.
- End users are misinformed about the skill level required to capitalize on the data lake concept. Vendors are exploiting the hype with no intent to address the shortage of programming, analytical and data manipulation skills needed to improve specific business outcomes.
- Data lakes typically begin as ungoverned data stores addressing a limited data science audience. Meeting the needs of wider audiences requires curated repositories with governance, semantic consistency and security — elements already found in data warehouses.