Use io.openlineage.server.* pkg and add class Metadata #2853

Merged
wslulciuc merged 18 commits into main from feature/openlineage-server-models-pkg
Oct 9, 2024

@wslulciuc wslulciuc commented Jul 12, 2024

Member

This PR removes the OpenLineage models that were first introduced in the PoC in favor of the models in io.openlineage.server.*, which are defined specifically for consumers. This PR also introduces the classes Metadata and VersionId. Their purpose and usage are outlined below.

class Metadata

Wrapper class for parsing:

  • class OpenLineage.RunEvent
  • class OpenLineage.JobEvent
  • class OpenLineage.DatasetEvent

Methods:

  • Metadata.Run.forEvent(OpenLineage.RunEvent): Metadata.Run
  • Metadata.Job.forEvent(OpenLineage.JobEvent): Metadata.Job
  • Metadata.Dataset.forEvent(OpenLineage.DatasetEvent): Metadata.Dataset
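
To illustrate the static-factory wrapper pattern that the method list above suggests, here is a minimal sketch. It is not the `Metadata` class from this PR: `MetadataSketch`, `RunEventStub`, and every field and getter name are hypothetical stand-ins chosen so the example compiles without the OpenLineage dependency. The real `Metadata.Run.forEvent(...)` takes an `OpenLineage.RunEvent` and may extract far more than is shown here.

```java
import java.util.Objects;
import java.util.UUID;

/** Illustrative sketch only; not the Metadata class added in this PR. */
final class MetadataSketch {
  private MetadataSketch() {}

  /** Hypothetical stand-in for OpenLineage.RunEvent, reduced to a few fields. */
  static final class RunEventStub {
    final UUID runId;
    final String eventType;
    final String jobNamespace;
    final String jobName;

    RunEventStub(UUID runId, String eventType, String jobNamespace, String jobName) {
      this.runId = runId;
      this.eventType = eventType;
      this.jobNamespace = jobNamespace;
      this.jobName = jobName;
    }
  }

  /** Wrapper for run metadata; instances are created only through forEvent(). */
  static final class Run {
    private final UUID id;
    private final String state;
    private final String jobNamespace;
    private final String jobName;

    private Run(UUID id, String state, String jobNamespace, String jobName) {
      this.id = id;
      this.state = state;
      this.jobNamespace = jobNamespace;
      this.jobName = jobName;
    }

    /** Reads the event once and fails fast if the event or its run ID is missing. */
    static Run forEvent(RunEventStub event) {
      Objects.requireNonNull(event, "event must not be null");
      return new Run(
          Objects.requireNonNull(event.runId, "runId must not be null"),
          event.eventType,
          event.jobNamespace,
          event.jobName);
    }

    UUID getId() {
      return id;
    }

    String getState() {
      return state;
    }

    String getJobNamespace() {
      return jobNamespace;
    }

    String getJobName() {
      return jobName;
    }
  }
}
```

Under these assumptions, a caller would write `MetadataSketch.Run run = MetadataSketch.Run.forEvent(event);` and read values through the getters, so event parsing stays in one place instead of being repeated wherever an event is consumed.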

class VersionId

Factory class for VersionIds.

Methods:

  • VersionId.forJob(): VersionId
  • VersionId.forDataset(): VersionId
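
The description does not say how version IDs are computed, so the following is only a sketch of one possible approach: deriving a deterministic, name-based UUID from identifying fields. `VersionIdSketch`, the `namespace` and `name` parameters, and the `:`-joined key are all assumptions made for illustration. The PR lists `forJob()` and `forDataset()` without parameters, and the actual implementation may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Illustrative sketch only; not the VersionId class added in this PR. */
final class VersionIdSketch {
  private final UUID value;

  private VersionIdSketch(UUID value) {
    this.value = value;
  }

  /** Hypothetical factory: derives a job version ID from a namespace and name. */
  static VersionIdSketch forJob(String namespace, String name) {
    return new VersionIdSketch(nameBased("job", namespace, name));
  }

  /** Hypothetical factory: derives a dataset version ID from a namespace and name. */
  static VersionIdSketch forDataset(String namespace, String name) {
    return new VersionIdSketch(nameBased("dataset", namespace, name));
  }

  /** Joins the parts with ':' and hashes them into a name-based (type 3) UUID. */
  private static UUID nameBased(String... parts) {
    byte[] key = String.join(":", parts).getBytes(StandardCharsets.UTF_8);
    return UUID.nameUUIDFromBytes(key);
  }

  UUID getValue() {
    return value;
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof VersionIdSketch && value.equals(((VersionIdSketch) other).value);
  }

  @Override
  public int hashCode() {
    return value.hashCode();
  }
}
```

With this design, the same inputs always yield the same ID, while the `job` and `dataset` prefixes keep the two kinds of ID from colliding when the namespace and name match.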

Closes: #1650

Signed-off-by: wslulciuc <willy@datakin.com>
@boring-cyborg boring-cyborg Bot added the api API layer changes label Jul 12, 2024
netlify Bot commented Jul 12, 2024

Deploy Preview for peppy-sprite-186812 canceled.

Name Link
🔨 Latest commit 21e1ecb
🔍 Latest deploy log https://app.netlify.com/sites/peppy-sprite-186812/deploys/6706e95795ea9d0008303d1d

@Builder
@ToString
@EqualsAndHashCode
public static final class Dataset {

Member

I think it's worth including column lineage as well as the integration type.

Member Author

Given that this PR has been out for a while and the Metadata class isn't yet used within the codebase, I'll add column lineage facet extraction as a follow-up.

@wslulciuc wslulciuc added this to the 0.50.0 milestone Aug 23, 2024
codecov Bot commented Oct 3, 2024

Codecov Report

Attention: Patch coverage is 64.95957% with 130 lines in your changes missing coverage. Please review.

Project coverage is 82.15%. Comparing base (c7e0d63) to head (21e1ecb).
Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| api/src/main/java/marquez/api/models/Metadata.java | 77.16% | 64 Missing and 2 partials ⚠️ |
| ...ain/java/marquez/api/exceptions/FacetNotValid.java | 0.00% | 57 Missing ⚠️ |
| .../src/main/java/marquez/common/models/RunState.java | 40.00% | 5 Missing and 1 partial ⚠️ |
| ...pi/src/main/java/marquez/api/models/VersionId.java | 92.85% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2853      +/-   ##
============================================
- Coverage     83.08%   82.15%   -0.93%     
- Complexity     1500     1504       +4     
============================================
  Files           265      268       +3     
  Lines          6888     7258     +370     
  Branches        320      325       +5     
============================================
+ Hits           5723     5963     +240     
- Misses         1007     1134     +127     
- Partials        158      161       +3     


@wslulciuc wslulciuc marked this pull request as ready for review October 9, 2024 20:20

@phixMe phixMe left a comment

Member

Great work! This will improve the write path and make it easier to work with and enhance the core events as the OL spec is updated.

@wslulciuc wslulciuc enabled auto-merge (squash) October 9, 2024 20:37
@wslulciuc wslulciuc merged commit 423f0dd into main Oct 9, 2024
@wslulciuc wslulciuc deleted the feature/openlineage-server-models-pkg branch October 9, 2024 20:46
jonathanpmoraes referenced this pull request in nubank/NuMarquez Feb 6, 2025
* Use `io.openlineage.server.*` pkg and add class `Metadata`

Signed-off-by: wslulciuc <willy@datakin.com>

* Add back `models.BaseEvent`

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Add `MetadataTest.testNewRun()`

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Add `testNewRunWithParent()`, `testNewRunWithNominalStartAndEndTime()`, and more

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Add `Metadata.Run.ParentRun.Job.Id`

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Add `testNewRunWithIO()`

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Assert IO dataset name and source

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Add `testNewRunWithInputDatasetsOnly()`, `testNewRunWithOutputDatasetsOnly()`

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Define `Metadata.Job.location` as `URL`

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Add assertions for facets and improved readability

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Cleanup `Metadata.Job` instantiation

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Update test suite

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* continued: Update test suite

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* continued: Update test suite

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Clean up docs for class `Metadata`

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Apply spotless

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

* Add copyright header

Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

---------

Signed-off-by: wslulciuc <willy@datakin.com>
Signed-off-by: Willy Lulciuc <willy.lulciuc@gmail.com>

Labels

api API layer changes in progress

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Binary incompatibility after deleting LineageEvent classes

2 participants