Key features
- OpenLineage-compatible API: receives lineage events from all OpenLineage integrations
- Lineage graph visualization: interactive job and dataset dependency graphs
- Dataset version tracking with schema history across APAC pipeline runs
- Job run history with input/output metrics and data quality facets
- REST API for programmatic APAC lineage queries and impact analysis
- Docker Compose deployment for rapid APAC self-hosted setup
Best for
- APAC data engineering teams that need a lightweight, self-hosted lineage backend for OpenLineage events without deploying a full enterprise data catalog.
Limitations to know
- Lineage-focused: lacks data catalog features like business glossary, stewardship, access control
- UI is functional but less polished than commercial APAC alternatives
- Scales to mid-size APAC deployments; very large pipeline volumes may require tuning
About Marquez
Marquez is an open-source metadata service, hosted as an LF AI & Data Foundation project, that serves as the reference implementation of the OpenLineage API. APAC data engineering teams deploy Marquez to receive OpenLineage events from their Spark, Airflow, and dbt pipelines. It stores job run history, dataset schemas, and data lineage graphs, and exposes them through a queryable API and a web-based visualization interface.
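To make the event flow concrete, here is a minimal sketch of the kind of OpenLineage run event an integration would send. The job name `load_payments`, the producer URI, and the pinned spec version are illustrative assumptions. For brevity the sketch reuses the job namespace for datasets; real integrations usually derive the dataset namespace from the data source.

```python
import json
import uuid
from datetime import datetime, timezone

# Placeholder producer URI identifying the emitting code (hypothetical).
PRODUCER = "https://example.com/apac-payments-pipeline"
# Spec version is an assumption; use the one your integration targets.
SCHEMA_URL = "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"


def build_run_event(event_type, namespace, job_name, inputs, outputs, run_id=None):
    """Build an OpenLineage-style run event as a plain dict."""

    def dataset(name):
        # Simplification: datasets share the job namespace in this sketch.
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [dataset(n) for n in inputs],
        "outputs": [dataset(n) for n in outputs],
        "producer": PRODUCER,
        "schemaURL": SCHEMA_URL,
    }


event = build_run_event(
    "COMPLETE",
    "apac-payments",
    "load_payments",  # hypothetical job name
    inputs=["raw_transactions"],
    outputs=["stg_payments"],
)
print(json.dumps(event, indent=2))
```

In practice the Spark, Airflow, and dbt integrations generate and send these events themselves. Building one by hand is mainly useful for smoke-testing a new Marquez instance.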
Marquez's data model organizes lineage around namespaces, jobs, and datasets, which mirrors how APAC data pipelines actually work. For example, a Spark job in the 'apac-payments' namespace reads from 'raw_transactions' and writes to 'stg_payments'; Marquez stores this relationship together with the schema of each dataset and the metadata of each job run. APAC data teams use Marquez to investigate data quality incidents by tracing backwards through the lineage graph: if the 'fct_apac_revenue' table has incorrect data, which upstream APAC jobs and sources could be responsible?
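The backward trace described above amounts to a graph walk from the affected dataset to everything that feeds it. The sketch below runs that walk over a small in-memory graph and does not call Marquez itself. The job names and the link from 'stg_payments' to 'fct_apac_revenue' are hypothetical, added only so the walk covers more than one hop.

```python
from collections import deque

# job name -> (input datasets, output datasets)
# Dataset names come from the text; job names and the second edge are hypothetical.
JOBS = {
    "load_payments": (["raw_transactions"], ["stg_payments"]),
    "build_revenue_fact": (["stg_payments"], ["fct_apac_revenue"]),
}


def upstream(dataset, jobs):
    """Return (jobs, datasets) that can influence `dataset`, nearest first."""
    # Index: dataset -> jobs that write to it.
    producers = {}
    for job, (_, outputs) in jobs.items():
        for out in outputs:
            producers.setdefault(out, []).append(job)

    found_jobs, found_datasets = [], []
    visited = {dataset}
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for job in producers.get(current, []):
            if job not in found_jobs:
                found_jobs.append(job)
            for source in jobs[job][0]:
                if source not in visited:
                    visited.add(source)
                    found_datasets.append(source)
                    queue.append(source)
    return found_jobs, found_datasets


suspect_jobs, suspect_datasets = upstream("fct_apac_revenue", JOBS)
print(suspect_jobs)      # ['build_revenue_fact', 'load_payments']
print(suspect_datasets)  # ['stg_payments', 'raw_transactions']
```

The breadth-first order lists the closest upstream jobs first, which is usually where an incident investigation starts.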
For APAC teams already using DataHub or OpenMetadata as their primary data catalog, Marquez serves as a lightweight lineage-specific backend focused on the OpenLineage API. For APAC teams starting their data governance journey without an existing catalog, Marquez provides an immediately operational lineage store with minimal infrastructure requirements (Docker Compose deployment, PostgreSQL backend).
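Once an instance is running, the REST API listed under key features can answer lineage questions programmatically. The sketch below only builds the request URL for a lineage query. It assumes a local instance at `http://localhost:5000`, the `/api/v1/lineage` endpoint, and a `nodeId` in the `dataset:<namespace>:<name>` form with an optional `depth` parameter. Check these against the API reference for your Marquez version.

```python
from urllib.parse import urlencode


def lineage_url(base_url, node_type, namespace, name, depth=None):
    """Build a lineage query URL for a dataset or job node (assumed URL shape)."""
    params = {"nodeId": f"{node_type}:{namespace}:{name}"}
    if depth is not None:
        params["depth"] = depth
    # Keep ':' unescaped so the node ID stays readable in logs.
    query = urlencode(params, safe=":")
    return f"{base_url.rstrip('/')}/api/v1/lineage?{query}"


url = lineage_url(
    "http://localhost:5000",  # assumed local instance
    "dataset",
    "apac-payments",
    "stg_payments",
    depth=3,
)
print(url)
```

The resulting URL can be fetched with any HTTP client. The network call is left out so the example runs without a live instance.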