Are there any AI tools available to generate Data Models from a mapping document?
Mapping documents typically describe relationships between business concepts, source/target fields, or legacy-to-modern data mappings. AI tools can ingest such documents and automate:
1. Entity Recognition - Identifying tables, fields, relationships.
2. Schema Inference - Structuring logical or physical schemas (SQL, JSON Schema, etc.).
3. Relationship Mapping - Detecting foreign keys and hierarchical dependencies.
4. Model Generation - Outputting diagrams (ERD), SQL DDL, or conceptual data models.
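As a rough illustration of steps 2 through 4, here is a minimal Python sketch that turns a mapping document into SQL DDL without any LLM involved. The CSV layout (`target_table`, `target_field`, `data_type`, `is_key`, `references`) and the sample rows are assumptions made up for this example; a real mapping document will need its own parsing.

```python
import csv
import io
from collections import OrderedDict

# Hypothetical mapping document: one row per target field.
MAPPING_CSV = """target_table,target_field,data_type,is_key,references
customer,customer_id,INTEGER,Y,
customer,name,VARCHAR(100),N,
sales_order,order_id,INTEGER,Y,
sales_order,customer_id,INTEGER,N,customer.customer_id
"""

def build_ddl(mapping_text):
    """Group mapping rows by table and emit one CREATE TABLE per table."""
    tables = OrderedDict()
    for row in csv.DictReader(io.StringIO(mapping_text)):
        tables.setdefault(row["target_table"], []).append(row)

    statements = []
    for table, rows in tables.items():
        # Column definitions come first.
        lines = [f"  {r['target_field']} {r['data_type']}" for r in rows]
        # Fields flagged as keys form the primary key.
        keys = [r["target_field"] for r in rows if r["is_key"] == "Y"]
        if keys:
            lines.append(f"  PRIMARY KEY ({', '.join(keys)})")
        # A "table.field" entry in the references column becomes a foreign key.
        for r in rows:
            if r["references"]:
                ref_table, ref_field = r["references"].split(".")
                lines.append(
                    f"  FOREIGN KEY ({r['target_field']}) "
                    f"REFERENCES {ref_table} ({ref_field})"
                )
        statements.append(f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);")
    return "\n\n".join(statements)

ddl = build_ddl(MAPPING_CSV)
print(ddl)
```

An LLM is mostly useful for the step before this one: getting a messy, free-form mapping document into a regular layout like the one assumed here.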
I'm not aware of any off-the-shelf solutions, but it is a pretty straightforward custom buildout using LLMs. Example custom solutions:
1. Microsoft Power Platform with Copilot (Power Apps + Dataverse)
- Supports AI-assisted table creation from natural language and Excel-style inputs.
- Power Automate and Power BI integrate to reflect changes across data pipelines.
2. Snowflake Cortex + Streamlit
- Snowflake Cortex can be used to create custom agents and copilots that interpret mapping documents.
- Streamlit apps (inside Snowflake) can be built to visualize and refine schema.
Brad
Great information. I agree that it should be pretty straightforward to create a model from a map. I would want some upfront validation that the model reflects the current state of the systems.
Have you (or anyone else) worked on this from the other direction, where AI creates a mapping document from an existing schema? We spend a lot of time creating these as we deploy ERPs. Our legacy systems are maybe 80% common; the last 20% is unique, so for each deployment we have to create new maps.
Have you seen any use of AI to assist with data cleansing? Again, our legacy systems hold a lot of non-conforming data, and data cleansing seems like something AI could help with.
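For the schema-to-mapping direction, one possible starting point (a sketch, not something from this thread) is plain fuzzy name matching: auto-propose the common columns and flag the rest for manual review. The column names and the 0.6 cutoff below are hypothetical, and `difflib` from the Python standard library stands in for whatever matching an LLM would do.

```python
import difflib

# Hypothetical column lists; in practice these would be read from the two schemas.
legacy_columns = ["cust_no", "cust_name", "addr_line1", "zip", "terr_code"]
target_columns = ["customer_number", "customer_name", "address_line_1", "postal_code"]

def draft_mapping(source_cols, target_cols, cutoff=0.6):
    """Propose a target for each source column; None means 'needs manual review'."""
    rows = []
    for col in source_cols:
        # get_close_matches returns the best candidates scoring at or above the cutoff.
        match = difflib.get_close_matches(col, target_cols, n=1, cutoff=cutoff)
        rows.append((col, match[0] if match else None))
    return rows

mapping = draft_mapping(legacy_columns, target_columns)
for source, target in mapping:
    print(f"{source} -> {target if target else 'REVIEW'}")
```

Name similarity only catches the easy part. Columns whose names differ but whose meaning matches (`zip` and `postal_code`, for instance) are where an LLM or a human reviewer still has to step in.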
I regularly have to create schemas and map entity relationship diagrams from disparate systems, most of the time not knowing what the data is or even where it is coming from, so how I do it varies slightly from workload to workload.
With that said, have you heard of Mermaid.JS? It is a text-based diagramming language for creating ERDs, mind maps, and all kinds of great flows, and most of the large LLMs work great with it. Things I used to spend hours mapping in Visio or Lucidchart now take seconds.
So, sometimes I'll dump the various database schemas to a .sql file, upload it along with other things like CSVs or XML to ChatGPT 4o, Claude 3.7, Gemini 2.5, or Microsoft Copilot, and ask it to create an ERD and/or schema diagram for me, with an output of Mermaid.JS. Then I'll dump the Mermaid.JS code into mermaid.live or into a Mermaid /code component in a Microsoft Loop page.
Once I can visualize the schema and/or ERD, I can work on how I am going to transform that data in a cohesive manner. When cleaning data, the goal is to get a clean ERD as soon as possible; then you can automate a ton of things using that as the north star. Luckily, again, LLMs are very good at reading SQL, CSV, XML, etc. and output to things like Mermaid with ease.
I've done this successfully with massive databases and schemas, with thousands of tables and hundreds of columns. As you work with it more, you get more familiar with the quirks and workarounds. At some point someone may make an off-the-shelf solution to do this, but since the data is so specific to a use case, I'll likely keep taking this approach and getting my hands dirty.
In the end, I think it helps me understand the data more anyway.
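To give a feel for what the Mermaid output in this workflow looks like, here is a small Python sketch that builds `erDiagram` text from a hand-written schema. The tables, columns, and relationship label are made up for illustration; in the workflow above the LLM produces this text directly from the uploaded files.

```python
# Hypothetical schema; in practice this would come from a .sql dump.
# Each column is (type, name, key marker), where the marker is "PK", "FK", or "".
tables = {
    "CUSTOMER": [("int", "customer_id", "PK"), ("string", "name", "")],
    "SALES_ORDER": [("int", "order_id", "PK"), ("int", "customer_id", "FK")],
}
# (parent, child, label): one parent row relates to zero or more child rows.
relationships = [("CUSTOMER", "SALES_ORDER", "places")]

def to_mermaid(tables, relationships):
    """Render tables and relationships as Mermaid erDiagram text."""
    lines = ["erDiagram"]
    for parent, child, label in relationships:
        # "||--o{" reads as exactly one parent to zero-or-more children.
        lines.append(f"    {parent} ||--o{{ {child} : {label}")
    for name, columns in tables.items():
        lines.append(f"    {name} {{")
        for col_type, col_name, key in columns:
            lines.append(f"        {col_type} {col_name} {key}".rstrip())
        lines.append("    }")
    return "\n".join(lines)

diagram = to_mermaid(tables, relationships)
print(diagram)
```

The printed text can be pasted into mermaid.live as-is to render the diagram.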
Yes, several AI-powered and traditional tools can help generate data models from mapping documents, such as spreadsheets, CSVs, or business glossary files. These tools streamline the creation of entity-relationship diagrams (ERDs), database schemas, or data catalogs from your mappings. Here's a breakdown of categories and examples:
AI-Powered & Automation Tools to Generate Data Models
1. Microsoft Power BI + Fabric + Copilot
Use case: Auto-detects relationships and tables from Excel/CSV files.
AI boost: Power BI Copilot can suggest measures, relationships, and visual models.
Bonus: Integrates with Dataverse and Microsoft Fabric for full data pipelines.
2. ChatGPT + Python (Custom)
How: You upload your mapping document (e.g., in Excel), and ask ChatGPT to:
Parse tables and field mappings
Suggest an ERD or schema (can output SQL or diagram code like Mermaid or PlantUML)
Best for: Prototyping or reverse-engineering data models with context.
3. Erwin Data Modeler by Quest
Feature: Import from spreadsheets, reverse-engineer from databases.
AI/ML assist: Pattern recognition, naming conventions, and model validation.
Best for: Enterprise data modeling and governance.
4. Lucidchart or dbdiagram.io with AI Import Helpers
Lucidchart: Imports CSV or JSON to auto-generate ERDs.
dbdiagram.io: Supports DSL-based modeling from simple text input. AI plug-ins or ChatGPT can help convert mappings to this DSL.
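For the dbdiagram.io route, the conversion from mapping rows to its DSL (DBML) is simple enough to script. The sketch below assumes a hypothetical row layout of (table, field, type, is primary key, reference); only the `Table`, `[pk]`, and `Ref:` constructs are used.

```python
# Hypothetical mapping rows: (table, field, type, is_primary_key, references)
rows = [
    ("customer", "customer_id", "integer", True, None),
    ("customer", "name", "varchar", False, None),
    ("sales_order", "order_id", "integer", True, None),
    ("sales_order", "customer_id", "integer", False, "customer.customer_id"),
]

def to_dbml(rows):
    """Emit one DBML Table block per table, followed by many-to-one Ref lines."""
    tables, refs = {}, []
    for table, field, col_type, is_pk, ref in rows:
        suffix = " [pk]" if is_pk else ""
        tables.setdefault(table, []).append(f"  {field} {col_type}{suffix}")
        if ref:
            # ">" marks a many-to-one reference from this field to the target.
            refs.append(f"Ref: {table}.{field} > {ref}")
    blocks = [
        f"Table {name} {{\n" + "\n".join(cols) + "\n}"
        for name, cols in tables.items()
    ]
    return "\n\n".join(blocks + refs)

dbml = to_dbml(rows)
print(dbml)
```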
5. Databricks Unity Catalog + AI Assistants
For: Data lakes and structured data projects.
Benefit: AI assistant helps translate business mapping into metadata and table structures.