[24.2] Fix various job concurrency limit issues#19824

Merged
natefoo merged 6 commits into galaxyproject:release_24.2 from mvdbeek:fix_limit_bypass
Mar 24, 2025

Conversation

@mvdbeek
Member

@mvdbeek mvdbeek commented Mar 17, 2025

I've added an additional check in job_wrapper.enqueue that only updates jobs below the limit. This should be multi-process / multi-thread safe.
The queries are essentially the same queries that are done in JobHandler.__check_user_jobs, JobHandler.__check_destination_jobs, etc., but now it's all in a single update statement.

I suppose performance might be a concern; however, we still run through the (cached) checks before we decide to queue the job, so I think the cost is likely minimal. By integrating the limit check into the query, I think it should become very unlikely that jobs can bypass limits in a multi-handler scenario.
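The idea of guarding the state update with the limit check can be sketched roughly as below. This is a minimal illustration of the pattern, not Galaxy's actual code: the schema, the `enqueue_if_below_limit` name, and the use of SQLite are all assumptions made for the example. The key point is that the count-vs-limit comparison happens inside the UPDATE itself, so two handlers racing to queue jobs cannot both slip past the limit.

```python
# Sketch: push the concurrency-limit check into the UPDATE statement so the
# check and the state transition are a single atomic operation.
# Illustrative schema only -- not Galaxy's real job table.
import sqlite3


def enqueue_if_below_limit(conn, job_id, user_id, limit):
    """Move a job from 'new' to 'queued' only if the user's count of
    queued/running jobs is still below `limit`. Returns True if queued."""
    cur = conn.execute(
        """
        UPDATE job
           SET state = 'queued'
         WHERE id = ? AND state = 'new'
           AND (SELECT COUNT(*) FROM job
                 WHERE user_id = ? AND state IN ('queued', 'running')) < ?
        """,
        (job_id, user_id, limit),
    )
    # rowcount tells us whether the guarded update actually took effect.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, user_id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO job (id, user_id, state) VALUES (?, ?, ?)",
    [(1, 7, "running"), (2, 7, "new"), (3, 7, "new")],
)

print(enqueue_if_below_limit(conn, 2, 7, limit=2))  # True: 1 active job < 2
print(enqueue_if_below_limit(conn, 3, 7, limit=2))  # False: limit reached
```

Because the subquery and the state change execute as one statement, a second handler attempting the same transition sees the already-updated count and its UPDATE matches zero rows.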

c088f9c fixes a bug where a resubmitted job would cause the cached user_job_count_per_destination / user_job_count values to start at 0.
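The class of bug described here can be illustrated with a small sketch. Everything below is hypothetical (the `JobCountCache` class and its methods are invented for the example, not Galaxy's code); it just shows how a lazily seeded per-user count cache undercounts if one code path initializes the cache entry to 0 instead of seeding it from the database.

```python
# Sketch of a lazily seeded job-count cache (illustrative only).
# If a code path (e.g. resubmission) skips the seeding query, the cached
# count starts at 0 and later limit checks undercount the user's jobs.
class JobCountCache:
    def __init__(self, count_jobs_in_db):
        self._counts = {}                  # user_id -> cached job count
        self._count_in_db = count_jobs_in_db

    def get(self, user_id, seed=True):
        if user_id not in self._counts:
            # Correct: seed from the database on first access.
            # Buggy path: start at 0 without consulting the database.
            self._counts[user_id] = self._count_in_db(user_id) if seed else 0
        return self._counts[user_id]

    def increment(self, user_id):
        self._counts[user_id] = self.get(user_id) + 1


count_from_db = lambda user_id: 5          # pretend the user has 5 jobs

correct = JobCountCache(count_from_db)
print(correct.get(1))                      # 5: seeded from the database

buggy = JobCountCache(count_from_db)
print(buggy.get(2, seed=False))            # 0: the undercount the fix removes
```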

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@mvdbeek mvdbeek requested a review from natefoo March 17, 2025 16:54
@mvdbeek mvdbeek changed the title [24.2] Guard state update with limit queries [24.2] Fix various job concurrency limit issues Mar 18, 2025
@mvdbeek mvdbeek marked this pull request as ready for review March 18, 2025 13:25
@github-actions github-actions Bot added this to the 25.0 milestone Mar 18, 2025
@mvdbeek
Member Author

mvdbeek commented Mar 18, 2025

whoa, all tests ran and are green! that's been a while

@mvdbeek mvdbeek requested a review from a team March 18, 2025 14:54
@mvdbeek
Member Author

mvdbeek commented Mar 24, 2025

This is on main now, and job loop times seem unaffected, which is good. Let's merge this?

@natefoo natefoo merged commit ecc4b47 into galaxyproject:release_24.2 Mar 24, 2025
@nsoranzo nsoranzo deleted the fix_limit_bypass branch March 24, 2025 23:23
@galaxyproject galaxyproject deleted a comment from github-actions Bot Mar 25, 2025
mvdbeek added a commit to mvdbeek/galaxy that referenced this pull request Apr 10, 2025
Broken in galaxyproject#19824.
We used to add to the job state history in `Job.set_state`.