We are seeing OpenLineage events that easily exceed 10 MB, resulting in out-of-memory (OOM) errors because facet queries load the raw event into memory and then filter for the relevant facets. To avoid querying the lineage_events table for facets, let's group facets into tables by how they will be accessed:
dataset_facets
job_facets
run_facets
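
To illustrate the grouping, here is a minimal sketch of how a single event's facets could be split into rows for the three tables. It is not the code in this change: the `split_facets` helper and the row keys (`run_id`, `job_name`, `dataset_name`, and so on) are hypothetical, and the only thing assumed about the input is the standard OpenLineage event layout (`run.facets`, `job.facets`, and `facets` on each entry of `inputs` and `outputs`).

```python
from typing import Any, Dict, List


def split_facets(event: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Group the facets of one OpenLineage event by target table (hypothetical helper)."""
    rows: Dict[str, List[Dict[str, Any]]] = {
        "dataset_facets": [],
        "job_facets": [],
        "run_facets": [],
    }

    # Run-level facets live under event["run"]["facets"].
    run = event.get("run") or {}
    for name, facet in (run.get("facets") or {}).items():
        rows["run_facets"].append(
            {"run_id": run.get("runId"), "name": name, "facet": facet}
        )

    # Job-level facets live under event["job"]["facets"].
    job = event.get("job") or {}
    for name, facet in (job.get("facets") or {}).items():
        rows["job_facets"].append(
            {
                "job_namespace": job.get("namespace"),
                "job_name": job.get("name"),
                "name": name,
                "facet": facet,
            }
        )

    # Dataset-level facets live on each input and output dataset.
    datasets = (event.get("inputs") or []) + (event.get("outputs") or [])
    for dataset in datasets:
        for name, facet in (dataset.get("facets") or {}).items():
            rows["dataset_facets"].append(
                {
                    "dataset_namespace": dataset.get("namespace"),
                    "dataset_name": dataset.get("name"),
                    "name": name,
                    "facet": facet,
                }
            )

    return rows
```

With rows shaped like this, a facet lookup would only need to read the small per-facet rows for a given run, job, or dataset rather than deserialize the whole raw event.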