[25.1] Fix flaky job search for HDCA inputs on PostgreSQL #22290

Merged
mvdbeek merged 1 commit into galaxyproject:release_25.1 from mvdbeek:fix_job_search_hdca_25.1
Mar 29, 2026

Conversation

@mvdbeek
Member

@mvdbeek mvdbeek commented Mar 28, 2026

The job search HDCA signature comparison was non-deterministic because `func.array_agg(column, order_by=column)` silently drops the `order_by` keyword argument in SQLAlchemy, generating `array_agg(col)` instead of `array_agg(col ORDER BY col)`.

This meant both the reference and candidate HDCA signatures were aggregated in whatever scan order PostgreSQL happened to use. When the query planner chose different scan orders for the reference and candidate CTEs (a choice that depends on table statistics and the resulting query plan), the arrays came back with different element orderings, and the equality comparison failed even for the exact same HDCA.

The fix uses `aggregate_order_by` from SQLAlchemy's PostgreSQL dialect, which correctly generates `array_agg(col ORDER BY col ASC)`.
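
For illustration, here is a minimal sketch of the difference in generated SQL. It is not the code from this PR: the `elements` table and its columns are hypothetical stand-ins for the real signature query, and the statements are only compiled against the PostgreSQL dialect, never executed.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, func, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import aggregate_order_by

metadata = MetaData()

# Hypothetical table standing in for the per-element HDCA signatures.
elements = Table(
    "elements",
    metadata,
    Column("hdca_id", Integer),
    Column("signature", String),
)


def compile_pg(stmt) -> str:
    """Render a statement as PostgreSQL SQL without connecting to a database."""
    return str(stmt.compile(dialect=postgresql.dialect()))


# What the original call effectively produced, per the description above:
# a plain array_agg(col) whose element order follows the scan order.
unordered = select(
    elements.c.hdca_id,
    func.array_agg(elements.c.signature),
).group_by(elements.c.hdca_id)

# The approach named in the fix: wrap the aggregated column in
# aggregate_order_by so the ORDER BY is rendered inside the aggregate.
ordered = select(
    elements.c.hdca_id,
    func.array_agg(
        aggregate_order_by(elements.c.signature, elements.c.signature.asc())
    ),
).group_by(elements.c.hdca_id)

unordered_sql = compile_pg(unordered)
ordered_sql = compile_pg(ordered)

print(unordered_sql)
print(ordered_sql)
```

Compiling to a string like this is also a cheap way to check in a unit test that an intended `ORDER BY` actually reaches the generated SQL.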

Diagnostic output from CI confirming the root cause:

reference full signature=['data0;251', 'data1;252', 'data2;253']
candidate full signatures=[(75, ['data2;253', 'data1;252', 'data0;251'])]
equivalent HDCA ids=[]

Same HDCA (id=75), same elements, different array ordering → no match.
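
The failure mode can be reproduced outside the database with the two arrays from the CI output. This is only an analogy in plain Python, assuming that array equality is element-by-element and order-sensitive, as it is for Python lists:

```python
# Signatures copied from the diagnostic output above.
reference = ["data0;251", "data1;252", "data2;253"]
candidate = ["data2;253", "data1;252", "data0;251"]

# Order-sensitive comparison: same elements, different order, no match.
unordered_match = reference == candidate

# Imposing a deterministic order on both sides before comparing,
# which is the role ORDER BY plays inside array_agg.
ordered_match = sorted(reference) == sorted(candidate)

print(unordered_match, ordered_match)
```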

Investigation details: https://gist.github.com/mvdbeek/a3bd1528be0985e4a7d36e929a502bd2

Fixes #21230

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@mvdbeek mvdbeek added kind/bug area/database Galaxy's database or data access layer labels Mar 28, 2026
@github-actions github-actions Bot added this to the 26.1 milestone Mar 28, 2026
Member

@bgruening bgruening left a comment

Oh wow, nice catch!

@mvdbeek mvdbeek merged commit 1ea3db2 into galaxyproject:release_25.1 Mar 29, 2026
50 of 54 checks passed
@github-project-automation github-project-automation Bot moved this from Needs Review to Done in Galaxy Dev - weeklies Mar 29, 2026
@ahmedhamidawan ahmedhamidawan modified the milestones: 26.1, 26.0 Mar 31, 2026
Labels

area/database Galaxy's database or data access layer kind/bug

3 participants