Commit acf6d95

Merge pull request #6960 from nextflow-io/adr/hints-process-directive
2 parents 779974a + 07feada commit acf6d95

1 file changed

Lines changed: 180 additions & 0 deletions

# `hints` process directive for executor-specific scheduling hints

- Authors: Rob Syme
- Status: accepted
- Deciders: Paolo Di Tommaso, Ben Sherman, Rob Syme
- Date: 2026-03-23
- Tags: directive, executor, scheduling

## Summary

Introduce a `hints` process directive for executor-specific scheduling hints that don't map to existing directives.

## Problem Statement

Many executors can be configured in various ways on a per-task basis. For example:

- AWS Batch jobs can use *consumable resources* to limit concurrent job execution based on non-standard resources such as software license seats.

- Google Batch jobs can specify a *provisioning model* to control the use of spot vs on-demand VMs on a per-task basis.

- Seqera Scheduler supports a variety of resource and scheduling settings, including spot/on-demand provisioning.

These settings can be exposed by Nextflow as executor-specific config options, such as `google.batch.spot`, but config options are applied globally. To apply a setting to specific processes or tasks, it must be exposed as a process directive.

Process directives in Nextflow aim to provide a common vocabulary for executing tasks in many different environments. Directives such as `cpus`, `memory`, and `time` have broadly the same meaning across most executors, making it easier for users to write portable pipelines.

At the same time, many executors have custom settings not shared by other executors, and it is not practical to create a new process directive for every new setting. There are over 40 [process directives](https://docs.seqera.io/nextflow/reference/process#directives) at the time of writing, and every new directive adds cognitive load when a user is trying to find the right directive for a given situation.

A few generic process directives already exist, but none of them is a good fit:

- The `clusterOptions` directive can be used to specify command-line arguments, primarily for HPC schedulers.
- The `ext` directive supports arbitrary key-values, but is designed primarily to customize the task script (e.g. tool arguments), not executor behavior.
- The `resourceLabels` directive also supports arbitrary key-values, but is intended for tagging and tracking resources, not controlling them.

A new directive is needed to support executor-specific settings at a per-task level in a structured manner, without bloating the process directives for every new custom setting.

## Goals

- Provide a way to apply executor-specific settings to individual processes or tasks

- Avoid the proliferation of narrow, executor-specific directives (e.g. `consumableResources`, `schedulingPolicy`, etc.)

- Provide a single extension point that executors can consume selectively

- Allow settings to be specified as key-values, providing validation where possible

## Non-goals

- Replacing existing directives (`cpus`, `memory`, `accelerator`, `queue`); those remain the right place for standard resources

## Decision

Introduce a `hints` process directive with namespaced keys. Executors consume the hints they understand and silently ignore the rest.

## Core Capabilities

### Syntax

The `hints` directive accepts a map of key-value pairs:

```groovy
// process definition
process runDragen {
    cpus 4
    memory '16 GB'
    hints consumableResources: 'my-dragen-license=1,other-license=2'

    script:
    """
    dragen --ref-dir /ref ...
    """
}
```

```groovy
// process config
process {
    withName: 'runDragen' {
        hints = [
            consumableResources: 'my-dragen-license=1,other-license=2'
        ]
    }
}
```

Both keys and values are arbitrary strings. Executors are responsible for defining which hints they recognize, as well as the expected structure for a given hint value. This approach keeps the `hints` directive simple (`Map<String,String>`) while allowing executors to structure hint values however they want (as long as it's a string).

In the above example, the `consumableResources` hint is given as a comma-separated string of `<name>=<count>` entries. The AWS Batch executor would parse this string into a map and supply it to each job request using `ConsumableResourceProperties`.
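
The parsing step described above could look roughly like the following Groovy sketch. The helper name `parseConsumableResources` is hypothetical and not part of Nextflow. The sketch assumes that every entry carries an explicit integer count and rejects anything else, since this document does not specify how malformed or count-less entries should be handled. Mapping the result onto the AWS Batch job request is left out.

```groovy
// Hypothetical helper (not a Nextflow API): parse '<name>=<count>' entries
// separated by commas into a map of resource name -> count.
Map<String,Integer> parseConsumableResources(String value) {
    Map<String,Integer> result = [:]
    if( !value?.trim() )
        return result                       // null or blank hint: nothing to request
    for( String entry : value.split(',') ) {
        String[] parts = entry.trim().split('=', 2)
        // Assumption: an entry must be '<name>=<integer>'; anything else is an error
        if( parts.length != 2 || !parts[0].trim() || !parts[1].trim().isInteger() )
            throw new IllegalArgumentException('Invalid consumableResources entry: ' + entry)
        result[parts[0].trim()] = parts[1].trim().toInteger()
    }
    return result
}

def parsed = parseConsumableResources('my-dragen-license=1,other-license=2')
assert parsed == ['my-dragen-license': 1, 'other-license': 2]
```

Failing fast on a malformed entry is one possible choice; an executor could equally log a warning and skip the entry.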

### Namespacing

Keys can use dot-separated scopes to namespace settings as needed:

```groovy
hints consumableResources: 'my-dragen-license=1'
hints 'scheduling.priority': 10
hints 'scheduling.provisioningModel': 'spot'
```

Keys can be routed to specific executors by prefixing with the executor name and a slash (`/`):

```groovy
hints 'awsbatch/consumableResources': 'my-dragen-license'
hints 'seqera/scheduling.provisioningModel': 'spot'
hints 'k8s/nodeSelector': 'gpu=true'
```

The executor prefix lets pipeline developers target a specific executor and be sure that the hint won't accidentally apply to other executors (e.g. if another executor adds support for the same hint in the future).
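
As an illustration of the routing rule, the sketch below selects the hints that one executor would see. The function `hintsFor` is hypothetical. It also assumes that a prefixed hint takes precedence over an unprefixed hint with the same name, which this document does not specify.

```groovy
// Hypothetical helper (not a Nextflow API): return the hints visible to one executor,
// with the '<executor>/' prefix stripped from the keys.
Map<String,String> hintsFor(String executor, Map<String,String> hints) {
    Map<String,String> result = [:]
    hints.each { key, value ->
        int slash = key.indexOf('/')
        if( slash == -1 )
            result.putIfAbsent(key, value)             // unprefixed: visible to every executor
        else if( key.substring(0, slash) == executor )
            result[key.substring(slash + 1)] = value   // prefixed for this executor (assumed to win)
        // prefixed for a different executor: skipped
    }
    return result
}

def allHints = [
    'consumableResources': 'my-dragen-license=1',
    'awsbatch/scheduling.priority': '10',
    'k8s/nodeSelector': 'gpu=true'
]
assert hintsFor('awsbatch', allHints) == [consumableResources: 'my-dragen-license=1', 'scheduling.priority': '10']
assert hintsFor('k8s', allHints) == [consumableResources: 'my-dragen-license=1', nodeSelector: 'gpu=true']
```

Whether an executor then acts on an unprefixed hint such as `consumableResources` still depends on whether it recognizes that hint.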

### Validation

Nextflow should validate hints to the best of its ability, to catch errors such as typos:

- **Prefixed hints** can be validated against the set of hints declared by the corresponding executor. Unrecognized hints should be reported as errors.

- **Unprefixed hints** can be validated against the union of hints declared by all executors. Since unprefixed hints might be supported by executors that aren't currently loaded, unrecognized hints should be reported as warnings.
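
The two validation rules can be sketched as follows. The function `validateHints`, the shape of the `declared` registry, and the message wording are all assumptions made for illustration. The sketch also treats a prefix for an executor that is missing from the registry as an error, a case this document leaves open.

```groovy
// Hypothetical validator (not a Nextflow API).
// `declared` maps an executor name to the set of hint names it declares.
Map<String,List<String>> validateHints(Map<String,String> hints, Map<String,Set<String>> declared) {
    List<String> errors = []
    List<String> warnings = []
    Set<String> allNames = declared.values().flatten() as Set   // union across executors
    for( String key : hints.keySet() ) {
        int slash = key.indexOf('/')
        if( slash == -1 ) {
            // Unprefixed hint: unknown names only produce a warning
            if( !allNames.contains(key) )
                warnings << ('Unrecognized hint: ' + key)
        }
        else {
            // Prefixed hint: unknown names are errors
            String executor = key.substring(0, slash)
            String name = key.substring(slash + 1)
            Set<String> known = declared.get(executor) ?: ([] as Set)
            if( !known.contains(name) )
                errors << ('Unrecognized hint for executor ' + executor + ': ' + name)
        }
    }
    return [errors: errors, warnings: warnings]
}

// Illustrative registry; the hint names come from the initial catalog in this document
def declared = [
    'awsbatch': (['consumableResources', 'scheduling.priority'] as Set),
    'google-batch': (['scheduling.provisioningModel'] as Set)
]

def report = validateHints([
    'awsbatch/consumableResorces': 'my-dragen-license=1',   // typo in a prefixed hint
    'scheduling.provisoningModel': 'spot',                   // typo in an unprefixed hint
    'scheduling.priority': '10'                              // valid
], declared)
assert report.errors == ['Unrecognized hint for executor awsbatch: consumableResorces']
assert report.warnings == ['Unrecognized hint: scheduling.provisoningModel']
```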

### Multiple hint resolution

In the configuration, the `hints` directive uses *replacement semantics* when specified multiple times, meaning that each `hints` setting completely replaces any previous setting:

```groovy
process {
    // generic hint
    hints = [provisioningModel: 'spot']

    // specific hint replaces generic hint
    withLabel: 'dragen' {
        hints = [consumableResources: 'my-dragen-license=1']
    }
}
```

Within a process definition, the `hints` directive uses *accumulation semantics*, meaning that subsequent `hints` directives are accumulated:

```groovy
process runDragen {
    // multiple separate hints
    hints provisioningModel: 'spot'
    hints consumableResources: 'my-dragen-license=1,other-license=2'

    // equivalent to...
    hints (
        provisioningModel: 'spot',
        consumableResources: 'my-dragen-license=1,other-license=2'
    )

    // ...
}
```

This behavior is consistent with other directives such as `pod` and `resourceLabels`. In practice, replacement in the configuration means that a given `hints` setting should specify all relevant hints for its context.

For example, the `withLabel` selector in the first example should also specify the `provisioningModel` hint if the intention is to preserve that hint for the selected processes:

```groovy
process {
    hints = [provisioningModel: 'spot']

    withLabel: 'dragen' {
        hints = [provisioningModel: 'spot', consumableResources: 'my-dragen-license=1']
    }
}
```

While this approach may lead to duplication, it gives users and developers more control over which hints are applied in a given context.
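
The difference between the two semantics can be modelled with plain Groovy maps. This is only an illustration of the behavior described in this section, not Nextflow's implementation, and the variable names are made up.

```groovy
// Config: each assignment replaces the whole map.
Map<String,String> configHints = [provisioningModel: 'spot']    // process-wide setting
configHints = [consumableResources: 'my-dragen-license=1']      // withLabel: 'dragen'
assert configHints == [consumableResources: 'my-dragen-license=1']
assert !configHints.containsKey('provisioningModel')            // the generic hint is gone

// Process definition: each `hints` call adds its entries to the map.
Map<String,String> processHints = [:]
processHints += [provisioningModel: 'spot']
processHints += [consumableResources: 'my-dragen-license=1,other-license=2']
assert processHints == [
    provisioningModel: 'spot',
    consumableResources: 'my-dragen-license=1,other-license=2'
]
```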

### Initial hint catalog

The following hints should be supported initially:

| Hint name | Executors | Use case |
|--|--|--|
| `consumableResources` | AWS Batch | License-aware scheduling ([#5917](https://github.com/nextflow-io/nextflow/issues/5917)) |
| `scheduling.priority` | AWS Batch | Job scheduling priority ([#6998](https://github.com/nextflow-io/nextflow/issues/6998)) |
| `scheduling.provisioningModel` | Google Batch | Spot VM scheduling ([#3530](https://github.com/nextflow-io/nextflow/issues/3530)) |

## Links

- [Community issue](https://github.com/nextflow-io/nextflow/issues/5917)
