Commit 2f6ba64

Add new records
Signed-off-by: Christopher Hakkaart <christopher.hakkaart@gmail.com>
1 parent f56889f commit 2f6ba64

9 files changed

Lines changed: 508 additions & 18 deletions

docs/docs/process-typed.mdx

Lines changed: 128 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -54,7 +54,7 @@ process fastqc {
5454
5555
script:
5656
"""
57-
echo 'meta: ${meta}`
57+
echo 'meta: ${meta}'
5858
echo 'fastq: ${fastq}'
5959
echo 'extra_args: ${extra_args}'
6060
"""
@@ -67,6 +67,8 @@ All [standard types][stdlib-types] except for the dataflow types (`Channel` and
6767

6868
Nextflow automatically stages `Path` inputs and `Path` collections (such as `Set<Path>`) into the task directory.
6969

70+
### Nullable inputs
71+
7072
By default, tasks fail if any input receives a `null` value. To allow `null` values, add `?` to the type annotation:
7173

7274
```nextflow
@@ -75,7 +77,7 @@ process cat_opt {
7577
input: Path?
7678
7779
stage:
78-
stageAs 'input.txt', input
80+
stageAs input, 'input.txt'
7981
8082
output:
8183
stdout()
@@ -87,10 +89,83 @@ process cat_opt {
8789
}
8890
```
8991

90-
### Stage directives
92+
### Record inputs
93+
94+
Inputs with type `Record` can declare the name and type of each record field:
95+
96+
```nextflow
97+
process fastqc {
98+
input:
99+
sample: Record {
100+
id: String
101+
fastq: Path
102+
}
103+
104+
script:
105+
"""
106+
echo 'id: ${sample.id}'
107+
echo 'fastq: ${sample.fastq}'
108+
"""
109+
}
110+
```
111+
112+
In this example, the record is staged into the task as `sample`, and `sample.fastq` is staged as an input file since the `fastq` field is declared with type `Path`.
113+
114+
When the process is invoked, the incoming record must contain the specified fields; otherwise, the run fails. Any additional fields not declared by the process input are ignored.
115+
116+
:::tip
117+
Record inputs are a useful way to select a subset of fields from a larger record. This way, the process only stages what it needs, allowing you to keep related data together in your workflow logic.
118+
:::
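
As a rough sketch of how extra fields are dropped, the snippet below calls the `fastqc` process from the example above with a record that carries one additional field. The sample values and the `batch` field are hypothetical, and the call assumes the process is defined or included in the same script.

```nextflow
workflow {
    // hypothetical sample record; `batch` is not declared by the process input
    samples = channel.of(
        record(id: 's1', fastq: file('s1.fastq'), batch: 'A')
    )

    // `fastqc` declares only `id` and `fastq`, so `batch` is ignored
    fastqc(samples)
}
```

The upstream record keeps all of its fields, so the workflow logic that follows can still use `batch` on the original record.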
119+
120+
You can achieve the same behavior using an external record type:
121+
122+
```nextflow
123+
process fastqc {
124+
input:
125+
sample: Sample
126+
127+
script:
128+
"""
129+
echo 'id: ${sample.id}'
130+
echo 'fastq: ${sample.fastq}'
131+
"""
132+
}
133+
134+
record Sample {
135+
id: String
136+
fastq: Path
137+
}
138+
```
139+
140+
This approach is useful when the record type can be re-used elsewhere in the pipeline.
141+
142+
### Tuple inputs
143+
144+
Inputs with type `Tuple` can declare the name of each tuple component:
145+
146+
```nextflow
147+
process fastqc {
148+
input:
149+
(id, fastq): Tuple<String,Path>
150+
151+
script:
152+
"""
153+
echo 'id: ${id}'
154+
echo 'fastq: ${fastq}'
155+
"""
156+
}
157+
```
158+
159+
This pattern is called *tuple destructuring*. Each tuple component is staged into the task the same way as an individual input.
160+
161+
The type arguments in the `Tuple<...>` annotation specify the type of each tuple component, in the same order as the component names. In the above example, `id` has type `String` and `fastq` has type `Path`.
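
A minimal sketch of calling this process, assuming it is defined or included in the same script. The sample names and file paths are hypothetical. Each tuple lists its components in the same order as the names in `(id, fastq)`.

```nextflow
workflow {
    // hypothetical inputs; component order follows (id, fastq)
    samples = channel.of(
        tuple('s1', file('s1.fastq')),
        tuple('s2', file('s2.fastq'))
    )

    fastqc(samples)
}
```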
162+
163+
## Stage directives
91164

92165
The `stage:` section defines custom staging behavior using *stage directives*. It should be specified after the `input:` section. These directives serve the same purpose as input qualifiers such as `env` and `stdin` in the legacy syntax.
93166

167+
### Environment variables
168+
94169
The `env` directive declares an environment variable in terms of task inputs:
95170

96171
```nextflow
@@ -108,6 +183,8 @@ process echo_env {
108183
}
109184
```
110185

186+
### Standard input (stdin)
187+
111188
The `stdin` directive defines the standard input of the task script:
112189

113190
```nextflow
@@ -125,6 +202,12 @@ process cat {
125202
}
126203
```
127204

205+
### Custom file staging
206+
207+
<ChangedInVersion version="26.04.0" />
208+
The method signature for `stageAs` was changed from `(filePattern, value)` to `(value, filePattern)`.
209+
</ChangedInVersion>
210+
128211
The `stageAs` directive stages an input file (or files) under a custom file pattern:
129212

130213
```nextflow
@@ -133,7 +216,7 @@ process blast {
133216
fasta: Path
134217
135218
stage:
136-
stageAs 'query.fa', fasta
219+
stageAs fasta, 'query.fa'
137220
138221
script:
139222
"""
@@ -151,7 +234,7 @@ process grep {
151234
fasta: Path
152235
153236
stage:
154-
stageAs "${id}.fa", fasta
237+
stageAs fasta, "${id}.fa"
155238
156239
script:
157240
"""
@@ -224,6 +307,46 @@ process foo {
224307
}
225308
```
226309

310+
### Structured outputs
311+
312+
Whereas legacy process outputs could only be structured using specific qualifiers like `val` and `tuple`, typed process outputs are regular values.
313+
314+
The `record()` standard library function can be used to create a record:
315+
316+
```nextflow
317+
process fastqc {
318+
input:
319+
sample: Record {
320+
id: String
321+
fastq: Path
322+
}
323+
324+
output:
325+
record(
326+
id: sample.id,
327+
fastqc: file('fastqc_logs')
328+
)
329+
330+
script:
331+
// ...
332+
}
333+
```
334+
335+
The `tuple()` standard library function can be used to create a tuple:
336+
337+
```nextflow
338+
process fastqc {
339+
input:
340+
(id, fastq): Tuple<String,Path>
341+
342+
output:
343+
tuple(id, file('fastqc_logs'))
344+
345+
script:
346+
// ...
347+
}
348+
```
349+
227350
## Topics
228351

229352
The `topic:` section emits values to [topic channels][channel-topic]. A topic emission consists of an output value and a topic name:

docs/docs/reference/config.mdx

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1747,6 +1747,10 @@ The following settings are available:
17471747

17481748
The Seqera scheduler service endpoint URL (required).
17491749

1750+
###### `seqera.executor.provider`
1751+
1752+
The compute backend provider type (e.g. `'aws'`, `'local'`). When specified, it is used together with `seqera.executor.region` to select the matching compute environment.
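
For illustration, a configuration that sets both options might look like the following. The values are the examples quoted in this section, not recommendations, and other `seqera.executor` settings (including any required ones) are omitted.

```nextflow
// nextflow.config (sketch; values taken from the descriptions in this section)
seqera {
    executor {
        provider = 'aws'
        region = 'eu-central-1'
    }
}
```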
1753+
17501754
###### `seqera.executor.region`
17511755

17521756
The AWS region for task execution (default: `'eu-central-1'`).

docs/docs/reference/process.mdx

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -87,10 +87,11 @@ The following directives can be used in the `stage:` section of a typed process:
8787

8888
Declares an environment variable with the specified name and value in the task environment.
8989

90-
###### `stageAs( filePattern: String, value: Path )`
90+
###### `stageAs( value: Path, filePattern: String )`
9191

92-
Stages a file into the task directory under the given alias.
93-
##### `stageAs( filePattern: String, value: Iterable<Path> )`
92+
Stages a file into the task directory under the given alias.
93+
94+
###### `stageAs( value: Iterable<Path>, filePattern: String )`
9495

9596
Stages a collection of files into the task directory under the given alias.
9697

docs/docs/reference/stdlib-namespaces.mdx

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@ Create a branch criteria to use with the [branch][operator-branch] operator.
5252

5353
##### `env( name: String ) -> String`
5454

55-
<AddedInVersion version="24.11.0-edge" />
55+
<AddedInVersion version="25.04.0" />
5656

5757
Get the value of the environment variable with the specified name in the Nextflow launch environment.
5858

@@ -133,9 +133,13 @@ Send an email. See [Notifications][mail-page] for more information.
133133

134134
Sleep for the given number of milliseconds.
135135

136+
##### `record( [options] ) -> Record`
137+
138+
Create a record from the given named arguments.
139+
136140
##### `tuple( collection: List ) -> ArrayTuple`
137141

138-
Create a tuple object from the given arguments.
142+
Create a tuple from the given arguments.
139143

140144
## `channel`
141145

docs/docs/reference/stdlib-types.mdx

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -918,6 +918,35 @@ Splits a JSON file into a list of records. See the [operator-splitjson][operator
918918

919919
Splits a text file into a list of lines. See the [operator-splittext][operator-splittext] operator for available options.
920920

921+
## Record
922+
923+
A record is an immutable map of fields to values (i.e., `Map<String,?>`). Each value can have its own type.
924+
925+
A record can be created using the `record` function:
926+
927+
```nextflow
928+
sample = record(id: '1', fastq_1: file('1_1.fastq'), fastq_2: file('1_2.fastq'))
929+
```
930+
931+
Record fields can be accessed as properties:
932+
933+
```nextflow
934+
sample.id
935+
// -> '1'
936+
```
937+
938+
The following operations are supported for records:
939+
940+
##### `+ : (Record, Record) -> Record`
941+
942+
Given two records, returns a new record containing the fields and values of both records. When a field is present in both records, the value of the right-hand record takes precedence.
943+
944+
The following methods are available for a record:
945+
946+
###### `subMap( keys: Iterable<String> ) -> Record`
947+
948+
Returns a new record containing only the given fields.
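
The two operations above can be sketched as follows. The `lane` field and its value are hypothetical, and the comments restate the documented behavior rather than captured output.

```nextflow
sample = record(id: '1', fastq_1: file('1_1.fastq'), fastq_2: file('1_2.fastq'))

// `+` returns a new record; `sample` itself is not modified.
// `id` is present on both sides, so the right-hand value takes precedence.
updated = sample + record(id: '2', lane: 'L001')

// keep only the listed fields in a new record
summary = updated.subMap(['id', 'lane'])
```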
949+
921950
## Set\<E\>
922951

923952
*Implements the [stdlib-types-iterable][stdlib-types-iterable] trait.*

docs/docs/reference/syntax.mdx

Lines changed: 14 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -37,6 +37,7 @@ A Nextflow script may contain the following top-level declarations:
3737
- Process definitions
3838
- Function definitions
3939
- Enum types
40+
- Record types
4041
- Output block
4142

4243
Script declarations are in turn composed of statements and expressions.
@@ -110,6 +111,8 @@ The following definitions can be included:
110111
- Functions
111112
- Processes
112113
- Named workflows
114+
- *New in 26.04:* Enum types
115+
- *New in 26.04:* Record types
113116

114117
### Params block
115118

@@ -359,9 +362,17 @@ enum Day {
359362

360363
Enum values in the above example can be accessed as `Day.MONDAY`, `Day.TUESDAY`, and so on.
361364

362-
:::note
363-
Enum types cannot be included across modules at this time.
364-
:::
365+
### Record type
366+
367+
A record type declaration consists of a name and a body. The body consists of one or more fields, where each field has a name and a type:
368+
369+
```nextflow
370+
record FastqPair {
371+
id: String
372+
fastq_1: Path
373+
fastq_2: Path
374+
}
375+
```
365376

366377
### Output block
367378

docs/docs/script.mdx

Lines changed: 32 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -110,26 +110,54 @@ Copying a map with the `+` operator is a safer way to modify maps in Nextflow, s
110110

111111
See [Map\<K,V\>][stdlib-types-map] for the set of available map operations.
112112

113+
## Records
114+
115+
Records are used to store a set of related fields, where each field can have its own type. They are created using the `record` function:
116+
117+
```nextflow
118+
person = record(name: 'Alice', age: 42, is_alive: true)
119+
```
120+
121+
Record fields are accessed by name:
122+
123+
```nextflow
124+
name = person.name
125+
age = person.age
126+
is_alive = person.is_alive
127+
```
128+
129+
Records are immutable -- once a record is created, it cannot be modified. Use record operations to create new records instead.
130+
131+
For example:
132+
133+
```nextflow
134+
person + record(age: 43) - ['is_alive']
135+
136+
// record(name: 'Alice', age: 43)
137+
```
138+
139+
See [Record][stdlib-types-record] for the set of available record operations.
140+
113141
## Tuples
114142

115143
Tuples are used to store a fixed sequence of heterogeneous values. They are created using the `tuple` function:
116144

117145
```nextflow
118-
person = tuple('Alice', 42, false)
146+
person = tuple('Alice', 42, true)
119147
```
120148

121149
Tuple elements are accessed by index:
122150

123151
```nextflow
124152
name = person[0]
125153
age = person[1]
126-
is_male = person[2]
154+
is_alive = person[2]
127155
```
128156

129157
Tuples can be destructured in assignments:
130158

131159
```nextflow
132-
(name, age, is_male) = person
160+
(name, age, is_alive) = person
133161
```
134162

135163
As well as closure parameters:
@@ -526,6 +554,7 @@ See [Workflows][workflow-page], [Processes][process-page], and [Modules][module-
526554
[process-page]: ./process
527555
[stdlib-page]: ./reference/stdlib
528556
[stdlib-types-list]: ./reference/stdlib-types#liste
557+
[stdlib-types-record]: ./reference/stdlib-types#record
529558
[stdlib-types-map]: ./reference/stdlib-types#mapkv
530559
[stdlib-types-tuple]: ./reference/stdlib-types#tuple
531560
[syntax-page]: ./reference/syntax
