[25.1] Fix flaky job search for HDCA inputs on PostgreSQL #22290
Merged
mvdbeek merged 1 commit into galaxyproject:release_25.1 on Mar 29, 2026
The job search HDCA signature comparison was non-deterministic because `func.array_agg(column, order_by=column)` silently drops the `order_by` keyword argument in SQLAlchemy, generating `array_agg(col)` instead of `array_agg(col ORDER BY col)`.
This meant both the reference and candidate HDCA signatures were aggregated in whatever scan order PostgreSQL happened to use. When the query planner chose different scan orders for the reference and candidate CTEs (which depends on table statistics and the resulting query plan), the resulting arrays had different element orderings, causing the equality comparison to fail, even for the exact same HDCA.
The fix uses `aggregate_order_by` from SQLAlchemy's PostgreSQL dialect, which correctly generates `array_agg(col ORDER BY col ASC)`.
Diagnostic output from CI confirming the root cause:
reference full signature=['data0;251', 'data1;252', 'data2;253']
candidate full signatures=[(75, ['data2;253', 'data1;252', 'data0;251'])]
equivalent HDCA ids=[]
Same HDCA (id=75), same elements, different array ordering → no match.
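The mismatch in the diagnostic output can be reproduced in plain Python, since list equality is order-sensitive in the same way as the array comparison described here. This is only an illustrative sketch: the signature values are copied from the CI output, while the helper `equivalent_ids` and its `ordered` flag are hypothetical names and not part of Galaxy.

```python
# Values copied from the CI diagnostic output.
reference = ["data0;251", "data1;252", "data2;253"]
candidates = [(75, ["data2;253", "data1;252", "data0;251"])]


def equivalent_ids(reference, candidates, ordered):
    """Return the ids of candidates whose signature equals the reference.

    Hypothetical helper for illustration only. With ordered=True both
    sides are sorted before comparing, which mimics aggregating with an
    ORDER BY; with ordered=False the arrival order is compared as-is.
    """
    ref = sorted(reference) if ordered else reference
    return [
        hdca_id
        for hdca_id, signature in candidates
        if (sorted(signature) if ordered else signature) == ref
    ]


unordered_matches = equivalent_ids(reference, candidates, ordered=False)
ordered_matches = equivalent_ids(reference, candidates, ordered=True)
print(unordered_matches)  # []
print(ordered_matches)    # [75]
```

Without a fixed ordering the same three elements fail to match; once both sides are ordered the same way, HDCA 75 is found.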
Investigation details: https://gist.github.com/mvdbeek/a3bd1528be0985e4a7d36e929a502bd2
Fixes #21230
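A minimal sketch of the ordered aggregation, assuming SQLAlchemy is installed. The `signatures` table and its columns are hypothetical stand-ins, not the actual Galaxy job search query; only `aggregate_order_by` and `func.array_agg` come from the description above. Compiling against the PostgreSQL dialect lets the generated SQL be inspected without a database connection.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, func, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import aggregate_order_by

metadata = MetaData()

# Hypothetical table: one row per (HDCA, element signature) pair.
signatures = Table(
    "signatures",
    metadata,
    Column("hdca_id", Integer),
    Column("sig", String),
)

# Wrap the aggregated column in aggregate_order_by so the ORDER BY is
# rendered inside array_agg(...) rather than being dropped.
ordered_signature = func.array_agg(
    aggregate_order_by(signatures.c.sig, signatures.c.sig.asc())
).label("full_signature")

stmt = select(signatures.c.hdca_id, ordered_signature).group_by(
    signatures.c.hdca_id
)

# Render the statement as PostgreSQL SQL text for inspection.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

Printing the compiled statement in a test or debugging session is a cheap way to confirm that the `ORDER BY` actually reaches the SQL, which is the check that would have exposed the dropped keyword argument.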