Ensure job data in lineage query is not null or empty #2253
Merged
Signed-off-by: wslulciuc <willy@datakin.com>
Codecov Report
@@             Coverage Diff              @@
##               main    #2253      +/-   ##
============================================
- Coverage     77.07%   76.99%   -0.08%
- Complexity     1170     1171       +1
============================================
  Files           222      222
  Lines          5317     5326       +9
  Branches        425      426       +1
============================================
+ Hits           4098     4101       +3
- Misses          747      751       +4
- Partials        472      474       +2
collado-mike
requested changes
Dec 5, 2022
collado-mike
requested changes
Dec 12, 2022
…ided Signed-off-by: wslulciuc <willy@datakin.com>
collado-mike
approved these changes
Dec 13, 2022
jonathanpmoraes
referenced
this pull request
in nubank/NuMarquez
Feb 6, 2025
* Ensure job data in lineage query is not null or empty
  Signed-off-by: wslulciuc <willy@datakin.com>
* continued: Ensure job data in lineage query is not null or empty
  Signed-off-by: wslulciuc <willy@datakin.com>
* Add toLineageWithOrphanDataset() to build orphan graph
  Signed-off-by: wslulciuc <willy@datakin.com>
* continued: Add toLineageWithOrphanDataset() to build orphan graph
  Signed-off-by: wslulciuc <willy@datakin.com>
* continued: Add toLineageWithOrphanDataset() to build orphan graph
  Signed-off-by: wslulciuc <willy@datakin.com>
* Return orphan graph on failed lookup for job when dataset nodeID provided
  Signed-off-by: wslulciuc <willy@datakin.com>
  Signed-off-by: wslulciuc <willy@datakin.com>
Problem
An unknown edge case exists in our lineage API when querying for a job uuid that has no lineage when calling LineageDao.getLineage(), yet is associated with a given dataset D. That is, given a dataset D, a reverse job lookup is performed using LineageDao.getJobFromInputOrOutput() (which returns the job uuid that produced dataset D), but querying for the lineage data of the returned job yields an empty set, resulting in a backend exception (= 5xx status code) when invoking LineageDao.getJobFromInputOrOutput().
Solution
This PR guards against empty lineage data for a job (an empty lineage graph is returned instead) and introduces logs to help debug the scenario in which this unknown edge case occurs. This PR also adds an API check to ensure the query param nodeID provided in the lineage API call exists.
Checklist
CHANGELOG.md with details about your change under the "Unreleased" section (if relevant; depending on the change, this may not be necessary)
.sql database schema migration according to Flyway's naming convention (if relevant)
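For readers without the diff at hand, the guard described under Solution can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: only the method names getJobFromInputOrOutput() and getLineage() come from the description above, while their signatures, the LineageSketch and Lineage classes, and the lineageForDataset() helper are assumptions made for the example.

```java
import java.util.Collections;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;
import java.util.logging.Logger;

// Hypothetical, simplified stand-in for the service-side guard; not Marquez's actual code.
final class LineageSketch {
  private static final Logger LOG = Logger.getLogger(LineageSketch.class.getName());

  // Assumed DAO shape: only the two method names come from the PR description.
  interface LineageDao {
    Optional<UUID> getJobFromInputOrOutput(String datasetNodeId);

    Set<UUID> getLineage(Set<UUID> jobUuids, int depth);
  }

  // Minimal stand-in for a lineage graph: the queried node plus the jobs found for it.
  static final class Lineage {
    final String nodeId;
    final Set<UUID> jobs;

    Lineage(String nodeId, Set<UUID> jobs) {
      this.nodeId = nodeId;
      this.jobs = jobs;
    }
  }

  static Lineage lineageForDataset(LineageDao dao, String datasetNodeId, int depth) {
    // Reverse lookup: find the job associated with the dataset.
    Optional<UUID> jobUuid = dao.getJobFromInputOrOutput(datasetNodeId);
    if (!jobUuid.isPresent()) {
      // No job at all: return a graph that contains only the dataset node.
      return new Lineage(datasetNodeId, Collections.<UUID>emptySet());
    }

    Set<UUID> jobData = dao.getLineage(Collections.singleton(jobUuid.get()), depth);
    if (jobData == null || jobData.isEmpty()) {
      // The edge case from the Problem section: a job exists but has no lineage.
      // Log it for debugging and fall back to an empty graph instead of failing.
      LOG.warning("No lineage for job " + jobUuid.get() + " (dataset node " + datasetNodeId + ")");
      return new Lineage(datasetNodeId, Collections.<UUID>emptySet());
    }
    return new Lineage(datasetNodeId, jobData);
  }
}
```

In this sketch both the missing-job and the empty-lineage paths return the same empty graph, so a caller sees a successful response with no edges rather than a 5xx error.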