Open Lineage for Data Trust and Understanding

One of the most requested metadata use cases is lineage: the ability to understand the origin of your data and the processing (reformatting, enrichment, merging, ...) it has gone through between that origin and your AI model. Lineage helps to build trust in your model because it shows that you have used appropriate data. Many individual technologies provide some lineage support, but each covers only its own processing. Some data catalogs provide proprietary ways to gather lineage from many sources. However, this is expensive to implement and makes the lineage information available only through the data catalog. Now three open source projects from LF AI and Data have come together to create a truly open ecosystem for lineage. Egeria provides open metadata that describes the data sources, data structures, data profiling results and the data pipelines. OpenLineage provides the event mechanism that records each time a data pipeline runs. Marquez provides visualization for lineage. In this talk you will learn about: