improve dataset facets access #2407

Merged: wslulciuc merged 3 commits into main from ol-facets/improve-dataset-facets-access on Feb 22, 2023

Conversation

@pawel-big-lebowski

Collaborator

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>

Problem

The newly introduced dataset_facets table contains a dataset_version_uuid field, which allows dataset facets to be accessed directly by dataset version, without the chained join dataset -> run -> facet.

Closes: #2406

Solution

Rewrite SQL queries in DatasetDao and DatasetVersionDao.
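
To illustrate the idea, here is a minimal sketch of the two query shapes. It is not the actual SQL from DatasetDao or DatasetVersionDao: the schema is a hypothetical, heavily simplified stand-in (only dataset_facets and dataset_version_uuid come from this PR; the other tables, columns, and sample values are assumptions), and it uses Python's built-in sqlite3 instead of PostgreSQL so that it runs standalone.

```python
import sqlite3

# Hypothetical, heavily simplified schema for illustration only; the real
# Marquez tables run on PostgreSQL and have many more columns.
SCHEMA = """
CREATE TABLE runs (uuid TEXT PRIMARY KEY);
CREATE TABLE dataset_versions (uuid TEXT PRIMARY KEY, run_uuid TEXT);
CREATE TABLE dataset_facets (
    dataset_version_uuid TEXT,
    run_uuid TEXT,
    name TEXT,
    facet TEXT
);
INSERT INTO runs VALUES ('run-1');
INSERT INTO dataset_versions VALUES ('dv-1', 'run-1');
INSERT INTO dataset_facets VALUES ('dv-1', 'run-1', 'schema', '{"fields": []}');
"""

# Chained access (assumed shape): dataset version -> run -> facet, two joins.
CHAINED = """
SELECT f.name, f.facet
FROM dataset_versions dv
JOIN runs r ON r.uuid = dv.run_uuid
JOIN dataset_facets f ON f.run_uuid = r.uuid
WHERE dv.uuid = ?
ORDER BY f.name
"""

# Direct access: filter dataset_facets on dataset_version_uuid, no joins.
DIRECT = """
SELECT name, facet
FROM dataset_facets
WHERE dataset_version_uuid = ?
ORDER BY name
"""


def fetch_facets(conn, query, dataset_version_uuid):
    """Run one of the queries above for a single dataset version."""
    return conn.execute(query, (dataset_version_uuid,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
chained_rows = fetch_facets(conn, CHAINED, "dv-1")
direct_rows = fetch_facets(conn, DIRECT, "dv-1")
print(chained_rows == direct_rows)
```

On this toy data both queries return the same single facet row; the direct form simply reads one table instead of joining three.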

Note: All database schema changes require discussion. Please link the issue for context.

Checklist

  • You've signed-off your work
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • You've updated the CHANGELOG.md with details about your change under the "Unreleased" section (if relevant, depending on the change, this may not be necessary)
  • You've versioned your .sql database schema migration according to Flyway's naming convention (if relevant)
  • You've included a header in any source code files (if relevant)

@boring-cyborg boring-cyborg Bot added the api API layer changes label Feb 6, 2023
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
@codecov

codecov Bot commented Feb 7, 2023

Codecov Report

Merging #2407 (67babf4) into main (89bb97a) will increase coverage by 9.95%.
The diff coverage is n/a.

❗ Current head 67babf4 differs from pull request most recent head be4e649. Consider uploading reports for the commit be4e649 to get more accurate results

@@             Coverage Diff              @@
##               main    #2407      +/-   ##
============================================
+ Coverage     67.15%   77.11%   +9.95%     
- Complexity      231     1234    +1003     
============================================
  Files            40      228     +188     
  Lines           947     5572    +4625     
  Branches        101      447     +346     
============================================
+ Hits            636     4297    +3661     
- Misses          163      775     +612     
- Partials        148      500     +352     
| Impacted Files | Coverage Δ |
|---|---|
| api/src/main/java/marquez/db/DatasetDao.java | 98.64% <ø> (ø) |
| ...pi/src/main/java/marquez/db/DatasetVersionDao.java | 95.83% <ø> (ø) |
| .../src/main/java/marquez/service/models/JobMeta.java | 73.33% <0.00%> (ø) |
| .../main/java/marquez/service/OpenLineageService.java | 91.17% <0.00%> (ø) |
| ...i/src/main/java/marquez/service/models/NodeId.java | 72.13% <0.00%> (ø) |
| ...quez/api/exceptions/RunAlreadyExistsException.java | 100.00% <0.00%> (ø) |
| ...rc/main/java/marquez/db/models/ExtendedRunRow.java | 66.66% <0.00%> (ø) |
| ...src/main/java/marquez/api/models/SearchResult.java | 60.00% <0.00%> (ø) |
| ...java/marquez/db/mappers/PairUuidInstantMapper.java | 75.00% <0.00%> (ø) |

... and 180 more

@wslulciuc

Member

@pawel-big-lebowski: I'm wondering if we can also have stats for the query improvements along with the PR?

@collado-mike

Collaborator

> @pawel-big-lebowski: I'm wondering if we can also have stats for the query improvements along with the PR?

+1 - explain plans for before/after are also super helpful

@pawel-big-lebowski pawel-big-lebowski self-assigned this Feb 14, 2023

@wslulciuc wslulciuc left a comment

Member

Nice optimization to avoid JOINs and a bit of cleanup! Great work, @pawel-big-lebowski 💯 🥇

@wslulciuc wslulciuc enabled auto-merge (squash) February 22, 2023 06:11
@wslulciuc wslulciuc merged commit dbdbcc3 into main Feb 22, 2023
@wslulciuc wslulciuc deleted the ol-facets/improve-dataset-facets-access branch February 22, 2023 06:19
jonathanpmoraes referenced this pull request in nubank/NuMarquez Feb 6, 2025
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Co-authored-by: Willy Lulciuc <willy@datakin.com>

Labels

api API layer changes docs

Development

Successfully merging this pull request may close these issues.

Optimise fetching facets in DatasetDao

3 participants