Add facet tables to avoid querying lineage_events table #2078

@wslulciuc

We are seeing OpenLineage events that easily exceed 10 MB, resulting in out-of-memory (OOM) errors because facet queries must load the raw event into memory and then filter for the relevant facets. To avoid querying the lineage_events table for facets, let's group facets into tables by how they will be accessed:

  • dataset_facets
  • job_facets
  • run_facets
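
To make the grouping concrete, here is a minimal sketch (not the proposed implementation) of how a single OpenLineage event could be split into rows for the three tables at write time, so reads never touch the raw event. It relies only on the standard event layout (`run.facets`, `job.facets`, and `facets` on each entry of `inputs` / `outputs`). The function name `split_facets` and the row keys (`run_id`, `job_name`, `dataset_name`, and so on) are hypothetical and stand in for whatever columns the tables end up with.

```python
from typing import Any


def split_facets(event: dict[str, Any]) -> dict[str, list[dict[str, Any]]]:
    """Group the facets of one OpenLineage event by target table.

    Row keys are illustrative assumptions, not an agreed schema.
    """
    rows: dict[str, list[dict[str, Any]]] = {
        "dataset_facets": [],
        "job_facets": [],
        "run_facets": [],
    }
    run = event.get("run") or {}
    job = event.get("job") or {}

    # One row per run facet, keyed by the run ID.
    for name, facet in (run.get("facets") or {}).items():
        rows["run_facets"].append(
            {"run_id": run.get("runId"), "name": name, "facet": facet}
        )

    # One row per job facet, keyed by job namespace and name.
    for name, facet in (job.get("facets") or {}).items():
        rows["job_facets"].append(
            {
                "job_namespace": job.get("namespace"),
                "job_name": job.get("name"),
                "name": name,
                "facet": facet,
            }
        )

    # One row per dataset facet, across both inputs and outputs.
    datasets = (event.get("inputs") or []) + (event.get("outputs") or [])
    for dataset in datasets:
        for name, facet in (dataset.get("facets") or {}).items():
            rows["dataset_facets"].append(
                {
                    "dataset_namespace": dataset.get("namespace"),
                    "dataset_name": dataset.get("name"),
                    "name": name,
                    "facet": facet,
                }
            )
    return rows
```

With rows shaped like this, a facet lookup becomes a filter on a small table (for example by `run_id` and `name`) rather than a scan of multi-megabyte payloads in lineage_events.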
