How do you maintain your data catalogs? Do you have a dedicated team for this or do you encourage all users to contribute to maintenance?
In my case, we host business glossaries on Confluence, as we are transforming data into new data products and services on a global scale. The lead developer and the business must therefore keep them up to date in parallel with production releases. Business glossaries help IT teams understand the data through a business lens, and we include examples so the document is useful for new joiners, testers and business users. For a more technical lens we use data mapping documents, for which the lead developer is responsible. Hope this helps!
Yes, we have a dedicated Data Catalog team, responsible for platform operation, gatekeeping, governance and enablement. This team is also the point of contact for the data experts or ambassadors in the divisions, to promote the best use of the tool.
Same for us
We are a state health department with over 1k employees and a huge data warehouse. The agency is divided into domain-specific business units (called "divisions"), each focusing on a different aspect of public health. Each division generally uses a domain-specific data source (e.g. Cancer Registry is the main data source used by our Cancer division). Each division has a director, and directors are responsible for designating the data steward for their source system. So the work of writing human-friendly descriptions for technical assets (i.e. database objects in our data warehouse) is decentralized to the SMEs. Directors act as data owners and approve those descriptions after the data stewards create them.