Commit ece68a9

committed
Merge branch 'master' into seqera-compute-env-id
2 parents fc1d429 + 021c77c commit ece68a9

63 files changed

Lines changed: 2449 additions & 218 deletions

adr/20260306-record-types.md

Lines changed: 435 additions & 0 deletions
Large diffs are not rendered by default.

docs/process-typed.md

Lines changed: 128 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -52,7 +52,7 @@ process fastqc {
5252
5353
script:
5454
"""
55-
echo 'meta: ${meta}`
55+
echo 'meta: ${meta}'
5656
echo 'fastq: ${fastq}'
5757
echo 'extra_args: ${extra_args}'
5858
"""
@@ -65,6 +65,8 @@ All {ref}`standard types <stdlib-types>` except for the dataflow types (`Channel
6565

6666
Nextflow automatically stages `Path` inputs and `Path` collections (such as `Set<Path>`) into the task directory.
6767

68+
### Nullable inputs
69+
6870
By default, tasks fail if any input receives a `null` value. To allow `null` values, add `?` to the type annotation:
6971

7072
```nextflow
@@ -73,7 +75,7 @@ process cat_opt {
7375
input: Path?
7476
7577
stage:
76-
stageAs 'input.txt', input
78+
stageAs input, 'input.txt'
7779
7880
output:
7981
stdout()
@@ -85,10 +87,83 @@ process cat_opt {
8587
}
8688
```
8789

88-
### Stage directives
90+
### Record inputs
91+
92+
Inputs with type `Record` can declare the name and type of each record field:
93+
94+
```nextflow
95+
process fastqc {
96+
input:
97+
sample: Record {
98+
id: String
99+
fastq: Path
100+
}
101+
102+
script:
103+
"""
104+
echo 'id: ${sample.id}'
105+
echo 'fastq: ${sample.fastq}'
106+
"""
107+
}
108+
```
109+
110+
In this example, the record is staged into the task as `sample`, and `sample.fastq` is staged as an input file since the `fastq` field is declared with type `Path`.
111+
112+
When the process is invoked, the incoming record should contain the specified fields, or else the run will fail. If the record has additional fields not declared by the process input, they are ignored.
113+
114+
:::{tip}
115+
Record inputs are a useful way to select a subset of fields from a larger record. This way, the process only stages what it needs, allowing you to keep related data together in your workflow logic.
116+
:::
117+
118+
You can achieve the same behavior using an external record type:
119+
120+
```nextflow
121+
process fastqc {
122+
input:
123+
sample: Sample
124+
125+
script:
126+
"""
127+
echo 'id: ${sample.id}'
128+
echo 'fastq: ${sample.fastq}'
129+
"""
130+
}
131+
132+
record Sample {
133+
id: String
134+
fastq: Path
135+
}
136+
```
137+
138+
This approach is useful when the record type can be re-used elsewhere in the pipeline.
139+
140+
### Tuple inputs
141+
142+
Inputs with type `Tuple` can declare the name of each tuple component:
143+
144+
```nextflow
145+
process fastqc {
146+
input:
147+
(id, fastq): Tuple<String,Path>
148+
149+
script:
150+
"""
151+
echo 'id: ${id}'
152+
echo 'fastq: ${fastq}'
153+
"""
154+
}
155+
```
156+
157+
This pattern is called *tuple destructuring*. Each tuple component is staged into the task the same way as an individual input.
158+
159+
The generic types inside the `Tuple<...>` annotation specify the type of each tuple component and should match the component names. In the above example, `id` has type `String` and `fastq` has type `Path`.
160+
161+
## Stage directives
89162

90163
The `stage:` section defines custom staging behavior using *stage directives*. It should be specified after the `input:` section. These directives serve the same purpose as input qualifiers such as `env` and `stdin` in the legacy syntax.
91164

165+
### Environment variables
166+
92167
The `env` directive declares an environment variable in terms of task inputs:
93168

94169
```nextflow
@@ -106,6 +181,8 @@ process echo_env {
106181
}
107182
```
108183

184+
### Standard input (stdin)
185+
109186
The `stdin` directive defines the standard input of the task script:
110187

111188
```nextflow
@@ -123,6 +200,12 @@ process cat {
123200
}
124201
```
125202

203+
### Custom file staging
204+
205+
:::{versionchanged} 26.04.0
206+
The method signature for `stageAs` was changed from `(filePattern, value)` to `(value, filePattern)`.
207+
:::
208+
126209
The `stageAs` directive stages an input file (or files) under a custom file pattern:
127210

128211
```nextflow
@@ -131,7 +214,7 @@ process blast {
131214
fasta: Path
132215
133216
stage:
134-
stageAs 'query.fa', fasta
217+
stageAs fasta, 'query.fa'
135218
136219
script:
137220
"""
@@ -149,7 +232,7 @@ process grep {
149232
fasta: Path
150233
151234
stage:
152-
stageAs "${id}.fa", fasta
235+
stageAs fasta, "${id}.fa"
153236
154237
script:
155238
"""
@@ -222,6 +305,46 @@ process foo {
222305
}
223306
```
224307
308+
### Structured outputs
309+
310+
Whereas legacy process outputs could only be structured using specific qualifiers like `val` and `tuple`, typed process outputs are regular values.
311+
312+
The `record()` standard library function can be used to create a record:
313+
314+
```nextflow
315+
process fastqc {
316+
input:
317+
sample: Record {
318+
id: String
319+
fastq: Path
320+
}
321+
322+
output:
323+
record(
324+
id: sample.id,
325+
fastqc: file('fastqc_logs')
326+
)
327+
328+
script:
329+
// ...
330+
}
331+
```
332+
333+
The `tuple()` standard library function can be used to create a tuple:
334+
335+
```nextflow
336+
process fastqc {
337+
input:
338+
(id, fastq): Tuple<String,Path>
339+
340+
output:
341+
tuple(id, file('fastqc_logs'))
342+
343+
script:
344+
// ...
345+
}
346+
```
347+
225348
## Topics
226349
227350
The `topic:` section emits values to {ref}`topic channels <channel-topic>`. A topic emission consists of an output value and a topic name:

docs/reference/process.md

Lines changed: 2 additions & 2 deletions
@@ -73,10 +73,10 @@ The following directives can be used in the `stage:` section of a typed process:
7373
`env( name: String, value: String )`
7474
: Declares an environment variable with the specified name and value in the task environment.
7575

76-
`stageAs( filePattern: String, value: Path )`
76+
`stageAs( value: Path, filePattern: String )`
7777
: Stages a file into the task directory under the given alias.
7878

79-
`stageAs( filePattern: String, value: Iterable<Path> )`
79+
`stageAs( value: Iterable<Path>, filePattern: String )`
8080
: Stages a collection of files into the task directory under the given alias.
8181

8282
`stdin( value: String )`

docs/reference/stdlib-namespaces.md

Lines changed: 5 additions & 2 deletions
@@ -43,7 +43,7 @@ The global namespace contains globally available constants and functions.
4343
: Create a branch criteria to use with the {ref}`operator-branch` operator.
4444

4545
`env( name: String ) -> String`
46-
: :::{versionadded} 24.11.0-edge
46+
: :::{versionadded} 25.04.0
4747
:::
4848
: Get the value of the environment variable with the specified name in the Nextflow launch environment.
4949

@@ -108,8 +108,11 @@ The global namespace contains globally available constants and functions.
108108
`sleep( milliseconds: long )`
109109
: Sleep for the given number of milliseconds.
110110

111+
`record( [options] ) -> Record`
112+
: Create a record from the given named arguments.
113+
111114
`tuple( args... ) -> Tuple`
112-
: Create a tuple object from the given arguments.
115+
: Create a tuple from the given arguments.
113116

114117
(stdlib-namespaces-channel)=
115118

docs/reference/stdlib-types.md

Lines changed: 29 additions & 0 deletions
@@ -768,6 +768,35 @@ The following methods are available for splitting and counting the records in fi
768768
`splitText() -> List<String>`
769769
: Splits a text file into a list of lines. See the {ref}`operator-splittext` operator for available options.
770770

771+
(stdlib-types-record)=
772+
773+
## Record
774+
775+
A record is an immutable map of fields to values (i.e., `Map<String,?>`). Each value can have its own type.
776+
777+
A record can be created using the `record` function:
778+
779+
```nextflow
780+
sample = record(id: '1', fastq_1: file('1_1.fastq'), fastq_2: file('1_2.fastq'))
781+
```
782+
783+
Record fields can be accessed as properties:
784+
785+
```nextflow
786+
sample.id
787+
// -> '1'
788+
```
789+
790+
The following operations are supported for records:
791+
792+
`+ : (Record, Record) -> Record`
793+
: Given two records, returns a new record containing the fields and values of both records. When a field is present in both records, the value of the right-hand record takes precedence.
794+
795+
The following methods are available for a record:
796+
797+
`subMap( keys: Iterable<String> ) -> Record`
798+
: Returns a new record containing only the given fields.
799+
771800
(stdlib-types-set)=
772801

773802
## Set\<E\>

docs/reference/syntax.md

Lines changed: 14 additions & 3 deletions
@@ -34,6 +34,7 @@ A Nextflow script may contain the following top-level declarations:
3434
- Process definitions
3535
- Function definitions
3636
- Enum types
37+
- Record types
3738
- Output block
3839

3940
Script declarations are in turn composed of statements and expressions.
@@ -107,6 +108,8 @@ The following definitions can be included:
107108
- Functions
108109
- Processes
109110
- Named workflows
111+
- *New in 26.04:* Enum types
112+
- *New in 26.04:* Record types
110113

111114
### Params block
112115

@@ -360,9 +363,17 @@ enum Day {
360363

361364
Enum values in the above example can be accessed as `Day.MONDAY`, `Day.TUESDAY`, and so on.
362365

363-
:::{note}
364-
Enum types cannot be included across modules at this time.
365-
:::
366+
### Record type
367+
368+
A record type declaration consists of a name and a body. The body consists of one or more fields, where each field has a name and a type:
369+
370+
```nextflow
371+
record FastqPair {
372+
id: String
373+
fastq_1: Path
374+
fastq_2: Path
375+
}
376+
```
366377

367378
### Output block
368379

docs/script.md

Lines changed: 33 additions & 3 deletions
@@ -111,28 +111,58 @@ Copying a map with the `+` operator is a safer way to modify maps in Nextflow, s
111111

112112
See {ref}`stdlib-types-map` for the set of available map operations.
113113

114+
(script-records)=
115+
116+
## Records
117+
118+
Records are used to store a set of related fields, where each field can have its own type. They are created using the `record` function:
119+
120+
```nextflow
121+
person = record(name: 'Alice', age: 42, is_alive: true)
122+
```
123+
124+
Record fields are accessed by name:
125+
126+
```nextflow
127+
name = person.name
128+
age = person.age
129+
is_alive = person.is_alive
130+
```
131+
132+
Records are immutable -- once a record is created, it cannot be modified. Use record operations to create new records instead.
133+
134+
For example:
135+
136+
```nextflow
137+
person + record(age: 43) - ['is_alive']
138+
139+
// record(name: 'Alice', age: 43)
140+
```
141+
142+
See {ref}`stdlib-types-record` for the set of available record operations.
143+
114144
(script-tuples)=
115145

116146
## Tuples
117147

118148
Tuples are used to store a fixed sequence of heterogeneous values. They are created using the `tuple` function:
119149

120150
```nextflow
121-
person = tuple('Alice', 42, false)
151+
person = tuple('Alice', 42, true)
122152
```
123153

124154
Tuple elements are accessed by index:
125155

126156
```nextflow
127157
name = person[0]
128158
age = person[1]
129-
is_male = person[2]
159+
is_alive = person[2]
130160
```
131161

132162
Tuples can be destructured in assignments:
133163

134164
```nextflow
135-
(name, age, is_male) = person
165+
(name, age, is_alive) = person
136166
```
137167

138168
As well as closure parameters:
