Gartner Research

Overcoming Data Quality Risks When Using Semistructured and Unstructured Data for AI/ML Models

Published: 06 December 2022

Summary

Ensuring the quality of semistructured and unstructured data for machine learning poses unique challenges for data and analytics technical professionals. This research investigates the data quality challenges and risks for AI and ML models, and how to overcome or mitigate them.

Included in Full Research

Overview

Key Findings
  • For semistructured and unstructured data, organizations have most thoroughly addressed one data quality dimension: timeliness. The other dimensions remain poorly addressed, both in technology and in the organizational practices needed to support them.

  • Most unstructured data will remain out of reach of data consumers, and unused or unusable, until accessibility issues are addressed.

  • The accuracy and relevance of unstructured data for AI and ML are driven entirely by the use case. Therefore, independent assessment and validation of unstructured data are impossible before the use case is defined.

  • Unstructured data management creates a paradox for technical

Analysts:

Jason Medd

©2022 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. and its affiliates. This publication may not be reproduced or distributed in any form without Gartner’s prior written permission. It consists of the opinions of Gartner’s research organization, which should not be construed as statements of fact. While the information contained in this publication has been obtained from sources believed to be reliable, Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner research may address legal and financial issues, Gartner does not provide legal or investment advice and its research should not be construed or used as such. Your access and use of this publication are governed by Gartner’s Usage Policy. Gartner prides itself on its reputation for independence and objectivity. Its research is produced independently by its research organization without input or influence from any third party. For further information, see Guiding Principles on Independence and Objectivity.