Skip to content

Update v61 migration to handle duplicate job names before unique constraint#2464

Merged
collado-mike merged 2 commits into
mainfrom
fix/61_db_migration_duplicate_jobs
Mar 31, 2023
Merged

Update v61 migration to handle duplicate job names before unique constraint#2464
collado-mike merged 2 commits into
mainfrom
fix/61_db_migration_duplicate_jobs

Conversation

@collado-mike

Copy link
Copy Markdown
Collaborator

Problem

The v61 migration reapplies a unique constraint on job name and namespace without regard to parent job. Unfortunately, there are some cases when job FQNs are duplicated (mostly due to the issue fixed in #2097 , but also including cases when a job previously had no parent and now has a parent, but the same FQN). This update to the migration renames jobs that have been symlinked to point to newer versions of themselves so that the job FQN doesn't conflict and the unique constraint can be applied.

Any installations that have already applied this migration will not see any new operations on their data, but installations that have duplicates will need this fix for the migration to successfully complete.

I tested this on an installation with ~200,000 job names and was able to complete the migration successfully.

Checklist

  • You've signed-off your work
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • You've included a one-line summary of your change for the CHANGELOG.md (Depending on the change, this may not be necessary).
  • You've versioned your .sql database schema migration according to Flyway's naming convention (if relevant)
  • You've included a header in any source code files (if relevant)

@collado-mike collado-mike requested a review from wslulciuc March 31, 2023 00:04
@boring-cyborg boring-cyborg Bot added the api API layer changes label Mar 31, 2023
…traint

Signed-off-by: Michael Collado <collado.mike@gmail.com>
@collado-mike collado-mike force-pushed the fix/61_db_migration_duplicate_jobs branch from a78e9e8 to 9912536 Compare March 31, 2023 00:06
@codecov

codecov Bot commented Mar 31, 2023

Copy link
Copy Markdown

Codecov Report

Merging #2464 (9912536) into main (3b0cde3) will not change coverage.
The diff coverage is n/a.

❗ Current head 9912536 differs from pull request most recent head 0ef8e13. Consider uploading reports for the commit 0ef8e13 to get more accurate results

@@            Coverage Diff            @@
##               main    #2464   +/-   ##
=========================================
  Coverage     83.53%   83.53%           
  Complexity     1207     1207           
=========================================
  Files           231      231           
  Lines          5503     5503           
  Branches        267      267           
=========================================
  Hits           4597     4597           
  Misses          762      762           
  Partials        144      144           

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

jobs j
WHERE j.uuid = f.uuid
)
UPDATE jobs SET name=(q.simple_name || '_' || q.row)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will set a job with a conflict to {job_name}_{row_num}?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but row_number() over (PARTITION BY j.namespace_name, j.name ORDER BY j.created_at) AS row means that all jobs that have the same name will have an incrementing counter suffixed so we avoid the conflict. E.g., if we had three jobs with the FQN a.b, the names would be

  1. a.b_1
  2. a.b_2
  3. a.b_3
    Thus, we can add the uniqueness constraint because they'll all end up with different names.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, thanks for clarifying and the examples 👍

@wslulciuc wslulciuc left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your writeup in the PR description was helpful in understanding all that's going on in the migration script. Thanks! And, also, 👍

@collado-mike collado-mike enabled auto-merge (squash) March 31, 2023 21:13
@collado-mike collado-mike merged commit da3863c into main Mar 31, 2023
@collado-mike collado-mike deleted the fix/61_db_migration_duplicate_jobs branch March 31, 2023 21:28
Xavier-Cliquennois pushed a commit to Xavier-Cliquennois/marquez that referenced this pull request Jul 26, 2023
…traint (MarquezProject#2464)

Signed-off-by: Michael Collado <collado.mike@gmail.com>
Signed-off-by: Xavier-Cliquennois <xavier.cliquennois@wearegraphite.io>
jonathanpmoraes referenced this pull request in nubank/NuMarquez Feb 6, 2025
…traint (#2464)

Signed-off-by: Michael Collado <collado.mike@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api API layer changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants