We are looking into Purview for its Data Catalog and Data Map capabilities. There were multiple threads about a year ago with less-than-positive feedback. Have things changed since then? Has anyone implemented it in recent months, and how was your experience? We are looking at data sources such as ADLS, Synapse, Power BI, and a number of SaaS applications.
Hey, I have implemented it lightly a couple of times and attended a Microsoft training. I can tell you that for years it was a headache because tools and permissions kept moving every week. It has stabilized considerably. It's more affordable if you are an enterprise that is 365-based. It connects to anything, but you may have to purchase some connectors. It forces a governance model that is simple, and they put a lot of work into the metamodel, which I appreciate. While they call it all Purview, a few things are still messy: the data lineage tools were developed separately from the data cataloging tools, so many artifacts remain, and people can get lost if they don't know how to use the top search bar well. It's wonderful if you are handling DLP, records retention, privacy, etc., as you can manage everything in one place.
Purview requires a team and training to implement. Because it's still fairly new, there are few experts. If you are not 365-based, it makes little sense. The world is your oyster and there are loads of options.
I presently implement Collibra. There is a large talent pool to draw from. It's data stewardship focused, less IT-focused, and super flexible in terms of defining the governance framework and metamodels. We can quickly set up automated governance workflows and reports to keep the catalog tidy. These aren't things Purview does easily or at all.
If you are only looking for the basics, there are lightweight solutions that cost pennies in comparison (DataGalaxy, Talend, etc.).
Whatever your tool of choice, be sure to think very hard about your governance operating model and how you will sustain these capabilities over the long term. Few orgs need everything these tools offer. If you don't operationally invest in the administration and change management, these initiatives take off strong and then die. Determine what you actually want to improve operationally and base your tool selection on that only.
My team tried Purview 2-3 years back and it was quite a challenge to use the tool, even within an Azure/Microsoft environment. In fact, we ended up providing a lot of product improvement suggestions to them. Since then, we selected and implemented Informatica's data catalog, which provides many connectors to bring in metadata from several systems in our ecosystem. The challenge we face is embedding these tools in the business's regular use. Developers still depend on direct access to the systems and do not necessarily rely on catalogs for information. Within individual platforms, I see Palantir having a very robust data lineage tool, and now Databricks is also making strides with Unity Catalog. In the meantime, we have also created a custom solution that brings in metadata from all tools, adds some using LLMs, and provides that to our users. Comments on the thread from Heather Fara are spot on...