Commit 5652a0b

authored
Merge pull request galaxyproject#22189 from mvdbeek/improve-celery-rate-limit-and-concurrnecy-handling
Improve celery rate limit and concurrency handling
2 parents cacf25f + a4f7129 commit 5652a0b

11 files changed

Lines changed: 811 additions & 19 deletions

File tree

doc/source/admin/galaxy_options.rst

Lines changed: 15 additions & 0 deletions
@@ -5451,6 +5451,21 @@
:Type: float


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``celery_user_concurrency_limit``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
    Maximum number of Celery tasks that can execute concurrently for a
    single user. If set to 0 (default), no concurrency limit is
    enforced. When a user exceeds this limit, new tasks are deferred
    and retried until a slot becomes available. A periodic cleanup
    task reclaims slots from crashed workers by inspecting active
    tasks on all workers.
:Default: ``0``
:Type: int


~~~~~~~~~~~~~~
``use_pbkdf2``
~~~~~~~~~~~~~~

doc/source/admin/production.md

Lines changed: 163 additions & 0 deletions
@@ -262,3 +262,166 @@ This configuration ensures that:
2. Exported files remain available for 7 days after generation

You can monitor Celery task status using [Flower](https://flower.readthedocs.io/en/latest/), a real-time web-based monitoring tool for Celery.

#### Per-user task rate limiting

Galaxy supports limiting the rate at which Celery tasks are executed per user. This prevents a single user from monopolizing worker capacity and ensures fair scheduling across all users.

##### Configuration

Set `celery_user_rate_limit` in the Galaxy configuration to a non-zero float representing the maximum number of tasks per user per second. For example:

```yaml
celery_user_rate_limit: 0.1
```

This allows each user at most one task execution every 10 seconds. The default value of `0.0` disables rate limiting entirely.
##### How it works

Rate limiting is implemented in Celery's `before_start` hook via the `GalaxyTaskBeforeStart` class hierarchy. The mechanism works in two phases:

1. **Slot reservation (first attempt):** When a task is about to execute for the first time, Galaxy atomically reserves the next available timeslot for that user in the `celery_user_rate_limit` database table. The reserved time is calculated as `max(last_scheduled_time + interval, now)`. If the reserved timeslot is in the future, the task is deferred using `task.retry(countdown=...)` and the reserved time is stored in a Celery message header.

2. **Execution gating (retries):** On each subsequent retry, the task reads its reserved timeslot from the message header and checks whether the current time has reached it. If not, it retries again with the remaining countdown. Once the timeslot is reached, the task proceeds to execute.

This two-phase design ensures that:

- Each task reserves a slot in the DB exactly once, avoiding cascading delays from re-reservation.
- Tasks from different users are scheduled independently — user A's tasks do not delay user B's tasks.
- Rate-limit retries use `max_retries=None`, so tasks are never lost to Celery's default retry limit.

The flow for a single task looks like this:

```
Task arrives (retries=0, no header)
  → Atomically reserve timeslot in DB
  → Timeslot is now? Execute immediately.
  → Timeslot is in the future?
      → Store timeslot in message header
      → task.retry(countdown=timeslot - now)

Task wakes up (retry, header present)
  → Read reserved timeslot from header
  → now >= timeslot? Execute.
  → now < timeslot? retry(countdown=remaining)
```
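The two-phase scheme can be sketched in Python. This is a minimal in-memory model for illustration, not Galaxy's implementation: the dictionary stands in for the `celery_user_rate_limit` table, and the atomicity of the real single-statement reservation is ignored.

```python
import datetime


class RateLimitSketch:
    """In-memory model of per-user timeslot reservation (illustrative only;
    Galaxy persists the last scheduled time in the celery_user_rate_limit table)."""

    def __init__(self, tasks_per_second: float):
        self.interval = datetime.timedelta(seconds=1.0 / tasks_per_second)
        self.last_slot: dict[str, datetime.datetime] = {}

    def reserve(self, user_id: str, now: datetime.datetime) -> datetime.datetime:
        # Phase 1 (first attempt): reserve max(last_scheduled_time + interval, now).
        prev = self.last_slot.get(user_id)
        slot = max(prev + self.interval, now) if prev is not None else now
        self.last_slot[user_id] = slot
        return slot

    @staticmethod
    def countdown(slot: datetime.datetime, now: datetime.datetime) -> float:
        # Phase 2 (retries): 0.0 means "execute now"; otherwise retry(countdown=...).
        return max((slot - now).total_seconds(), 0.0)
```

With `celery_user_rate_limit: 0.1`, three tasks submitted by one user at the same instant reserve slots at t, t+10s, and t+20s, while another user's first task still runs immediately.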
##### Database backends

Galaxy provides two implementations of the timeslot reservation logic:

- **PostgreSQL** (`GalaxyTaskBeforeStartUserRateLimitPostgres`): Uses `UPDATE ... RETURNING` with `greatest()` for an atomic, single-statement slot reservation. Falls back to `INSERT ... ON CONFLICT DO UPDATE` (upsert) for the first task by a given user. This is the most efficient implementation.

- **Standard SQL** (`GalaxyTaskBeforeStartUserRateLimitStandard`): Uses `SELECT ... FOR UPDATE` followed by a separate `UPDATE` (or `INSERT` for new users). This works with SQLite and other databases but requires two statements and explicit locking.

The correct implementation is selected automatically based on the configured `database_connection`.
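The PostgreSQL path described above might look roughly like the following. This is a hedged sketch, not the actual Galaxy queries: the column names and the use of bind parameters are assumptions, and the real code falls back to the upsert only when the `UPDATE` matches no row.

```sql
-- Hypothetical shape of the atomic reservation (column names assumed;
-- :interval is the per-user spacing in seconds):
UPDATE celery_user_rate_limit
SET last_scheduled_time = greatest(
        last_scheduled_time + make_interval(secs => :interval),
        now()
    )
WHERE user_id = :user_id
RETURNING last_scheduled_time;

-- If no row was updated (first task for this user), upsert one:
INSERT INTO celery_user_rate_limit (user_id, last_scheduled_time)
VALUES (:user_id, now())
ON CONFLICT (user_id) DO UPDATE
    SET last_scheduled_time = greatest(
        celery_user_rate_limit.last_scheduled_time
            + make_interval(secs => :interval),
        now()
    )
RETURNING last_scheduled_time;
```

Because `greatest(... + interval, now())` is computed inside a single statement, two concurrent workers cannot reserve the same slot; the database serializes the row update.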
##### Limitations

- **Rate limiting, not concurrency limiting.** The mechanism controls the *rate* at which tasks are scheduled (tasks per second per user), not how many tasks run concurrently. If a user submits 100 tasks, they will all eventually execute — just spaced apart by the configured interval.
- **Tasks without `task_user_id` are not rate-limited.** Only tasks that receive a `task_user_id` keyword argument participate in rate limiting. System tasks and tasks without a user context bypass the check entirely.
- **Timeslots are not released on failure.** If a task fails after its timeslot was reserved, that slot is consumed. The next task for the same user will be scheduled after it. This means task failures still "use up" rate-limit capacity.
- **Clock precision.** The mechanism relies on `datetime.datetime.now()` on the worker. Clock skew between workers could cause minor scheduling inaccuracies, though this is unlikely to matter in practice.
- **No priority or reordering.** Tasks are scheduled in the order they reserve slots (first-come, first-served within a user). There is no mechanism to prioritize certain task types over others for the same user.
- **Worker restarts.** If a worker is terminated while holding deferred tasks, Celery's broker (e.g., Redis, RabbitMQ) will redeliver them. The tasks will re-enter the `before_start` hook, read their reserved timeslot from the message header, and continue waiting or execute as appropriate — no slots are lost or duplicated.
- **Database overhead.** Each task execution requires one or two queries to the `celery_user_rate_limit` table to reserve a timeslot (one for PostgreSQL's atomic upsert, two for the standard `SELECT FOR UPDATE` + `UPDATE` path). On PostgreSQL at 100 tasks/second this adds ~100 small writes/second; the standard backend doubles that. For most Galaxy deployments (typically fewer than 10 tasks/second) this is negligible. Additionally, tasks that are deferred via `task.retry` re-enter the broker and are redelivered to a worker, adding a small amount of broker traffic proportional to the deferral rate.
#### Per-user task concurrency limiting

In addition to rate limiting, Galaxy supports limiting the number of tasks that can execute **concurrently** for a single user. This prevents one user from consuming all available worker capacity.

##### Configuration

Set `celery_user_concurrency_limit` in the Galaxy configuration to the maximum number of tasks that can run simultaneously per user:

```yaml
celery_user_concurrency_limit: 5
```

The default value of `0` disables concurrency limiting. This setting can be used independently of or in combination with `celery_user_rate_limit`.
##### How it works

Concurrency limiting uses a tracking table (`celery_user_active_task`) that records which tasks are currently executing for each user. The mechanism has three components:

1. **Admission control (`before_start`):** Before a task executes, the system counts the user's currently active tasks in the tracking table. If the count is at or above the limit, the task is deferred via `task.retry(countdown=5)` with unlimited retries. Otherwise, a tracking row is inserted for this task.

2. **Cleanup on completion (`after_return`):** When a task finishes (success or failure), its tracking row is deleted from the table. This runs via Celery's `after_return` hook, which fires regardless of whether the task succeeded or failed. Retries do not trigger cleanup — only final completion does.

3. **Stale row recovery (periodic beat task):** A periodic task (`cleanup_stale_concurrency_slots`) runs every 5 minutes to handle the case where a worker crashes without calling `after_return`. It queries all workers via `celery_app.control.inspect().active()` to get the set of actually-running task IDs, then removes any tracking rows older than 30 minutes whose task ID is not found on any worker.

The flow for a single task:

```
Task arrives
  → Count active tasks for this user in DB
  → Count >= limit? → task.retry(countdown=5)
  → Count < limit?
      → INSERT tracking row (task_id, user_id, started_at)
      → Execute task
      → Task finishes (success or failure)
          → DELETE tracking row

Periodic cleanup (every 5 min)
  → SELECT stale rows (started_at older than 30 min)
  → inspect().active() → get all running task IDs from workers
  → DELETE rows where task_id NOT in active set
```
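The admission-control and cleanup steps can be sketched as a small in-memory model. This is illustrative only: the dictionary stands in for the `celery_user_active_task` table, and the real counts and inserts are database queries.

```python
class ConcurrencySketch:
    """In-memory model of per-user concurrency admission (illustrative only;
    Galaxy tracks active tasks in the celery_user_active_task table)."""

    def __init__(self, limit: int):
        self.limit = limit
        self.active: dict[str, int] = {}  # task_id -> user_id

    def try_admit(self, task_id: str, user_id: int) -> bool:
        # before_start: count this user's active tasks; defer if at the limit
        # (the real hook would call task.retry(countdown=5) instead).
        running = sum(1 for uid in self.active.values() if uid == user_id)
        if running >= self.limit:
            return False
        self.active[task_id] = user_id  # INSERT tracking row
        return True

    def release(self, task_id: str) -> None:
        # after_return: DELETE the tracking row on final completion.
        self.active.pop(task_id, None)
```

With a limit of 2, a user's third simultaneous task is deferred until one of the first two releases its slot, while other users admit tasks independently.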
##### Combining rate limiting and concurrency limiting

When both `celery_user_rate_limit` and `celery_user_concurrency_limit` are set, rate limiting runs first (to schedule the timeslot) and concurrency limiting runs second (to gate execution based on active task count). This means a task must both reach its scheduled timeslot *and* have a concurrency slot available before it can execute.
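The ordering can be pictured as a combined hook that invokes each check in sequence, which is presumably what `GalaxyTaskBeforeStartCombined` does; the call signature here is an assumption for illustration.

```python
class CombinedBeforeStartSketch:
    """Chains before_start hooks in order: rate limiting first, then
    concurrency limiting. A sketch of GalaxyTaskBeforeStartCombined's
    presumed behavior, with an assumed hook signature."""

    def __init__(self, *hooks):
        self.hooks = hooks

    def __call__(self, task, task_id, args, kwargs):
        for hook in self.hooks:
            # Any hook may defer the task (via task.retry, which raises),
            # which stops the chain before later hooks run.
            hook(task, task_id, args, kwargs)
```

Because `task.retry` raises, a task deferred by the rate limiter never reaches the concurrency check on that attempt.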
##### Limitations

- **Tasks without `task_user_id` are not limited.** Only tasks that receive a `task_user_id` keyword argument participate in concurrency limiting.
- **Worker crash recovery is not instant.** If a worker is killed (SIGKILL, OOM), its tracking rows remain until the periodic cleanup task runs (every 5 minutes by default). During this window, those slots are "leaked" and reduce the user's effective concurrency limit.
- **Retry polling interval is fixed.** Deferred tasks retry every 5 seconds. Under heavy load with many deferred tasks, this creates periodic bursts of retry attempts.
- **No queue ordering guarantees.** When multiple tasks are waiting for a concurrency slot, the order in which they acquire slots depends on Celery's delivery order and retry timing — not submission order.
- **Database overhead.** Each task execution requires an INSERT (on start) and DELETE (on completion) in the tracking table, plus a COUNT query for admission. At 100 tasks/second this adds ~300 small queries/second to the database. For most Galaxy deployments (which typically sustain fewer than 10 tasks/second) this is negligible. Deployments processing hundreds of tasks per second should monitor database connection pool utilization and query latency on the `celery_user_active_task` table.
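The stale-slot reclamation that bounds the crash-recovery window can be sketched as a pure function. This is an illustrative model: `tracked` stands in for the `celery_user_active_task` table and `active_task_ids` for the set returned by `inspect().active()`.

```python
import datetime


def sweep_stale_slots(tracked, active_task_ids, now,
                      max_age=datetime.timedelta(minutes=30)):
    """Return the task IDs whose tracking rows should be deleted: rows older
    than max_age whose task no worker reports as running. Illustrative sketch
    of the cleanup_stale_concurrency_slots logic, not Galaxy's code."""
    return [
        task_id
        for task_id, started_at in tracked.items()
        if now - started_at > max_age and task_id not in active_task_ids
    ]
```

A long-running task older than 30 minutes is kept as long as some worker still reports it active; only rows with no corresponding running task are reclaimed.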
##### Administrative operations

Admins can directly manage the concurrency tracking table and the Celery queue to recover from stuck states or clear backlogs.

**Clearing leaked concurrency slots manually:**

If a worker crashes and the periodic cleanup hasn't run yet (or Celery beat is not running), admins can free slots directly:

```sql
-- View all currently tracked active tasks
SELECT * FROM celery_user_active_task ORDER BY started_at;

-- Remove all slots for a specific user (e.g., user_id 42)
DELETE FROM celery_user_active_task WHERE user_id = 42;

-- Remove all stale slots older than 1 hour
DELETE FROM celery_user_active_task
WHERE started_at < NOW() - INTERVAL '1 hour';

-- Nuclear option: clear ALL tracking rows (resets all concurrency counters)
DELETE FROM celery_user_active_task;
```

After clearing rows, deferred tasks waiting for slots will acquire them on their next retry (within 5 seconds).
**Purging tasks from the Celery queue:**

To remove pending (not yet started) tasks from the broker queue:

```bash
# Purge all pending tasks from the default Galaxy queue
celery -A galaxy.celery purge -Q galaxy.internal

# Purge all pending tasks from all queues
celery -A galaxy.celery purge

# List tasks prefetched by workers but not yet started (useful before revoking)
celery -A galaxy.celery inspect reserved

# Revoke a specific task by ID (prevents it from executing even if already delivered)
celery -A galaxy.celery control revoke <task-id>
```

Note: `purge` only removes tasks that have not yet been delivered to a worker. Tasks already being executed or waiting in a worker's prefetch buffer require `revoke`. Revoking a task that is mid-execution requires the `--terminate` flag, which sends SIGTERM to the worker process — use with caution.

lib/galaxy/app.py

Lines changed: 44 additions & 8 deletions
```diff
@@ -29,7 +29,12 @@
 from galaxy.agents.registry import AgentRegistry
 from galaxy.carbon_emissions import get_carbon_intensity_entry
 from galaxy.celery.base_task import (
+    GalaxyTaskAfterReturn,
+    GalaxyTaskAfterReturnConcurrencyLimit,
     GalaxyTaskBeforeStart,
+    GalaxyTaskBeforeStartCombined,
+    GalaxyTaskBeforeStartConcurrencyLimitPostgres,
+    GalaxyTaskBeforeStartConcurrencyLimitStandard,
     GalaxyTaskBeforeStartUserRateLimitPostgres,
     GalaxyTaskBeforeStartUserRateLimitStandard,
 )
@@ -709,24 +714,55 @@ def __init__(self, configure_logging=True, use_converters=True, use_display_appl
 
     def _register_celery_galaxy_task_components(self):
         """
-        Register subtype class instance to support implementation of a user rate limit for execution of celery tasks.
-        The default supertype class does not enforce a user rate limit. This is the case if the celery_user_rate_limit
-        config param is the default value.
+        Register subtype class instances for user rate limiting and concurrency
+        limiting of celery task executions. Both can be enabled independently
+        or together (combined via GalaxyTaskBeforeStartCombined).
         """
-        task_before_start: GalaxyTaskBeforeStart
+        hooks: list[GalaxyTaskBeforeStart] = []
+
+        # Rate limiting hook
         if self.config.celery_user_rate_limit:
             if is_postgres(self.config.database_connection):  # type: ignore[arg-type]
-                task_before_start = GalaxyTaskBeforeStartUserRateLimitPostgres(
-                    self.config.celery_user_rate_limit, self.model.session
+                hooks.append(
+                    GalaxyTaskBeforeStartUserRateLimitPostgres(self.config.celery_user_rate_limit, self.model.session)
                 )
             else:
-                task_before_start = GalaxyTaskBeforeStartUserRateLimitStandard(
-                    self.config.celery_user_rate_limit, self.model.session
+                hooks.append(
+                    GalaxyTaskBeforeStartUserRateLimitStandard(self.config.celery_user_rate_limit, self.model.session)
                 )
+
+        # Concurrency limiting hook
+        if self.config.celery_user_concurrency_limit:
+            if is_postgres(self.config.database_connection):  # type: ignore[arg-type]
+                hooks.append(
+                    GalaxyTaskBeforeStartConcurrencyLimitPostgres(
+                        self.config.celery_user_concurrency_limit, self.model.session
+                    )
+                )
+            else:
+                hooks.append(
+                    GalaxyTaskBeforeStartConcurrencyLimitStandard(
+                        self.config.celery_user_concurrency_limit, self.model.session
+                    )
+                )
+
+        # Register the appropriate before_start hook
+        if len(hooks) > 1:
+            task_before_start: GalaxyTaskBeforeStart = GalaxyTaskBeforeStartCombined(*hooks)
+        elif len(hooks) == 1:
+            task_before_start = hooks[0]
         else:
             task_before_start = GalaxyTaskBeforeStart()
         self._register_singleton(GalaxyTaskBeforeStart, task_before_start)
 
+        # Register after_return hook for concurrency tracking cleanup
+        task_after_return: GalaxyTaskAfterReturn
+        if self.config.celery_user_concurrency_limit:
+            task_after_return = GalaxyTaskAfterReturnConcurrencyLimit(self.model.session)
+        else:
+            task_after_return = GalaxyTaskAfterReturn()
+        self._register_singleton(GalaxyTaskAfterReturn, task_after_return)
+
     def _configure_tool_shed_registry(self) -> None:
         # Set up the tool sheds registry
         if os.path.isfile(self.config.tool_sheds_config_file):
```
lib/galaxy/celery/__init__.py

Lines changed: 21 additions & 3 deletions
```diff
@@ -23,7 +23,10 @@
 )
 from kombu import serialization
 
-from galaxy.celery.base_task import GalaxyTaskBeforeStart
+from galaxy.celery.base_task import (
+    GalaxyTaskAfterReturn,
+    GalaxyTaskBeforeStart,
+)
 from galaxy.config import Configuration
 from galaxy.main_config import find_config
 from galaxy.util import ExecutionTimer
@@ -71,8 +74,8 @@ def trim_module_name(self, module):
 
 class GalaxyTask(Task):
     """
-    Custom celery task used to limit number of tasks executions per user
-    per second.
+    Custom celery task used to enforce per-user rate limits and
+    concurrency limits on task executions.
     """
 
     def before_start(self, task_id, args, kwargs):
@@ -83,6 +86,17 @@ def before_start(self, task_id, args, kwargs):
         assert app
         app[GalaxyTaskBeforeStart](self, task_id, args, kwargs)
 
+    def after_return(self, status, retval, task_id, args, kwargs, einfo):
+        """
+        Called after task returns (success, failure, revoked, or retry).
+        Used to clean up concurrency tracking rows.
+        """
+        if status == "RETRY":
+            return  # Don't clean up on retry — the task will run again
+        app = get_galaxy_app()
+        if app:
+            app[GalaxyTaskAfterReturn](self, task_id, args, kwargs)
+
 
 def set_thread_app(app):
     APP_LOCAL.app = app
@@ -251,6 +265,10 @@ def schedule_task(task, interval):
     if config.vault_token_renewal_interval:
         schedule_task("renew_vault_token", config.vault_token_renewal_interval)
 
+    if config.celery_user_concurrency_limit:
+        # Run cleanup every 5 minutes (300 seconds)
+        schedule_task("cleanup_stale_concurrency_slots", 300)
+
     if beat_schedule:
         celery_app.conf.beat_schedule = beat_schedule
 
```