## adr/20260306-record-types.md (81 additions, 32 deletions)
````diff
@@ -6,6 +6,12 @@
 - Date: 2026-03-06
 - Tags: lang, static-types
 
+## Updates
+
+### Version 1.1 (2026-03-23)
+
+- Replaced inline record type syntax (`Record { ... }`) with destructuring syntax (`record(...)`) for better continuity with legacy syntax and record output syntax.
+
 ## Summary
 
 Provide a way to model composite data types in the Nextflow language.
````
````diff
@@ -155,54 +161,57 @@ When a record is supplied as input to a process, the process needs to know how t
 
 Typed processes can stage inputs using the `stage:` section, but ideally the files in a record should be automatically detected and staged.
 
-A typed process can declare a record using an *inline record type*:
+A typed process can declare a record input using a record type:
 
 ```groovy
 process FASTQC {
     input:
-    sample: Record {
-        id: String
-        fastq_1: Path
-        fastq_2: Path
-    }
+    sample: FastqPair
 
     // ...
 }
+
+record FastqPair {
+    id: String
+    fastq_1: Path
+    fastq_2: Path
+}
 ```
 
 All record fields that are a `Path` or `Path` collection (e.g. `Set<Path>`) are automatically staged. The record itself is declared in the process body as `sample`, like any other input, and record fields are accessed as `sample.id`, `sample.fastq_1`, and so on.
 
-A typed process can also use an explicit record type to achieve the same behavior:
+Alternatively, a typed process can declare a *destructured* record input:
 
 ```groovy
 process FASTQC {
     input:
-    sample: FastqPair
+    record(
+        id: String,
+        fastq_1: Path,
+        fastq_2: Path
+    )
 
     // ...
 }
-
-record FastqPair {
-    id: String
-    fastq_1: Path
-    fastq_2: Path
-}
 ```
 
-The only difference between these two approaches is that the `FastqPair` type can be used elsewhere in pipeline code because it is declared externally.
+This approach allows record inputs to be declared without the need for external record types. Each record field is accessed directly as `id`, `fastq_1`, and so on.
 
 ### Process outputs
 
 Typed processes can declare outputs with arbitrary expressions, so no new syntax is required to support record outputs. Simply use the `record()` function to create a record:
 
 ```groovy
 process FASTQC {
-    // ...
+    // ...
 
-    output:
-    record(id: id, fastqc: file('fastqc_logs'))
+    output:
+    record(
+        id: id,
+        fastqc: file('fastqc_logs')
+    )
 
-    // ...
+    // ...
 }
 ```
 
````
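To make the proposed syntax concrete, here is a hypothetical workflow invocation of the `FASTQC` process from the hunk above. This is a sketch only, assuming the proposed `record()` constructor; the sample ID and file names are invented:

```groovy
workflow {
    // Hypothetical sample data; the ID and file names are invented
    ch_samples = Channel.of(
        record(
            id: 'sample1',
            fastq_1: file('sample1_R1.fastq.gz'),
            fastq_2: file('sample1_R2.fastq.gz')
        )
    )

    // Per the ADR, each Path field of the record is staged automatically
    FASTQC(ch_samples)
}
```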
````diff
@@ -258,6 +267,46 @@ println sample.id // -> '1'
 println sample2.id // -> '2'
 ```
 
+### Inline record input type
+
+A process can declare a destructured record input as shown above:
+
+```groovy
+process FASTQC {
+    input:
+    record(
+        id: String,
+        fastq_1: Path,
+        fastq_2: Path
+    )
+
+    // ...
+}
+```
+
+One alternative is to declare an *inline record type*:
+
+```groovy
+process FASTQC {
+    input:
+    sample: Record {
+        id: String
+        fastq_1: Path
+        fastq_2: Path
+    }
+
+    // ...
+}
+```
+
+This approach was considered because it uses the same syntax as a `record` definition, making it easy to switch between inline and external record types. The block syntax is also slightly better suited for a type definition since it doesn't require commas.
+
+However, this approach creates an asymmetry between record inputs and outputs (`Record { ... }` vs `record(...)`). It also removes the ability to destructure a record input.
+
+Declaring a record input with `record()` can be understood as a reverse constructor, mirroring the `record()` function used to construct a record output in the `output:` section.
+
+While both approaches have pros and cons, the `record()` approach was ultimately chosen for its continuity with the existing tuple syntax and its similarity with the record output syntax.
+
 ### Implicit process record output
 
 A process record output can be defined using the `record()` function as shown above:
````
````diff
@@ -348,16 +397,16 @@ process PROKKA {
     // ...
 
     input:
-    sample: Record {
-        meta: Map
+    record(
+        meta: Map,
         fasta: Path
-    }
+    )
     proteins: Path
     prodigal_tf: Path
 
     output:
     record(
-        meta: sample.meta,
+        meta: meta,
         gff: file("${prefix}/*.gff"),
         gbk: file("${prefix}/*.gbk"),
         fna: file("${prefix}/*.fna"),
@@ -376,7 +425,7 @@ process PROKKA {
     file("versions.yml") >> 'versions'
 
     script:
-    prefix = sample.meta.id
+    prefix = meta.id
     // ...
 }
 ```
@@ -396,23 +445,23 @@ These processes would be defined as follows:
````
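A hypothetical invocation of the revised `PROKKA` process from the hunk above, as a sketch under the proposed syntax; the channel contents and file names are invented:

```groovy
workflow {
    // Invented sample record; `meta` is a Map and `fasta` a Path,
    // matching the destructured record input declared by PROKKA
    ch_input = Channel.of(
        record(meta: [id: 'sampleA'], fasta: file('sampleA.fasta'))
    )

    // `proteins` and `prodigal_tf` are the remaining Path inputs
    PROKKA(ch_input, file('proteins.faa'), file('prodigal.tf'))
}
```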
## docs/process-typed.md (31 additions, 32 deletions)
````diff
@@ -47,7 +47,8 @@ The `input:` section declares process inputs. In typed processes, each input dec
 ```nextflow
 process fastqc {
     input:
-    (meta, fastq): Tuple<Map,Path>
+    meta: Map
+    fastq: Path
     extra_args: String
 
     script:
````
@@ -89,62 +90,62 @@ process cat_opt {
89
90
90
91
### Record inputs
91
92
92
-
Inputs with type`Record`can declare the name and type of each record field:
93
+
Record inputs can be declared using a record type:
93
94
94
95
```nextflow
95
96
process fastqc {
96
97
input:
97
-
sample: Record {
98
-
id: String
99
-
fastq: Path
100
-
}
98
+
sample: Sample
101
99
102
100
script:
103
101
"""
104
102
echo 'id: ${sample.id}'
105
103
echo 'fastq: ${sample.fastq}'
106
104
"""
107
105
}
108
-
```
109
106
110
-
In this example, the record is staged into the task as `sample`, and `sample.fastq` is staged as an input file since the `fastq` field is declared with type`Path`.
107
+
record Sample {
108
+
id: String
109
+
fastq: Path
110
+
}
111
+
```
111
112
112
-
When the process is invoked, the incoming record should contain the specified fields, or elsethe run will fail. If the record has additional fields not declared by the process input, they are ignored.
113
+
In this example, the record input is staged as `sample`, and `sample.fastq` is staged as an input file since it is declared with type`Path`inthe `Sample` record type. Each field inthe record type is staged into the task the same way as an individual input.
113
114
114
-
:::{tip}
115
-
Record inputs are a useful way to selecta subset of fields from a larger record. This way, the process only stages what it needs, allowing you to keep related data together in your workflow logic.
116
-
:::
115
+
When the process is invoked, the incoming record should contain the specified fields, or else the run will fail. If the incoming record has additional fields not declared by the process input, they are ignored.
117
116
118
-
You can achieve the same behavior using an external record type:
117
+
Record inputs can also be declared as a *destructured* input:
119
118
120
119
```nextflow
121
120
process fastqc {
122
121
input:
123
-
sample: Sample
122
+
record(
123
+
id: String,
124
+
fastq: Path
125
+
)
124
126
125
127
script:
126
128
"""
127
-
echo 'id: ${sample.id}'
128
-
echo 'fastq: ${sample.fastq}'
129
+
echo 'id: ${id}'
130
+
echo 'fastq: ${fastq}'
129
131
"""
130
132
}
131
-
132
-
record Sample {
133
-
id: String
134
-
fastq: Path
135
-
}
136
133
```
137
134
138
-
This approach is useful when the record type can be re-used elsewhere in the pipeline.
135
+
This pattern mirrors the standard `record()`functionused to construct records. In this example, `fastq` is staged as an input file since the `fastq` field is declared with type`Path`.
136
+
137
+
:::{tip}
138
+
Record inputs are a useful way to selecta subset of fields from a larger record. This way, the process stages only what it needs, keeping related data together in your workflow logic.
139
+
:::
139
140
140
141
### Tuple inputs
141
142
142
-
Inputs with type`Tuple`can declare the name of each tuple component:
143
+
Tuple inputs can be declared as a *destructured* input:
143
144
144
145
```nextflow
145
146
process fastqc {
146
147
input:
147
-
(id, fastq): Tuple<String,Path>
148
+
tuple(id: String, fastq: Path)
148
149
149
150
script:
150
151
"""
````diff
@@ -154,9 +155,7 @@ process fastqc {
 }
 ```
 
-This pattern is called *tuple destructuring*. Each tuple component is staged into the task the same way as an individual input.
-
-The generic types inside the `Tuple<...>` annotation specify the type of each tuple component and should match the component names. In the above example, `id` has type `String` and `fastq` has type `Path`.
+This pattern mirrors the standard `tuple()` function used to construct tuples. Each tuple component is staged into the task the same way as an individual input.
 
 ## Stage directives
 
````
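A hypothetical invocation of the destructured tuple input shown in the hunk above, as a sketch under the proposed syntax; the values are invented:

```nextflow
workflow {
    // Components are matched positionally:
    // 'sampleA' -> id (String), the file -> fastq (Path)
    ch = Channel.of(tuple('sampleA', file('sampleA.fastq.gz')))
    fastqc(ch)
}
```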
````diff
@@ -314,14 +313,14 @@ The `record()` standard library function can be used to create a record:
 
 ```nextflow
 process fastqc {
     input:
-    sample: Record {
-        id: String
+    record(
+        id: String,
         fastq: Path
-    }
+    )
 
     output:
     record(
-        id: sample.id,
+        id: id,
         fastqc: file('fastqc_logs')
     )
 
@@ -335,7 +334,7 @@ The `tuple()` standard library function can be used to create a tuple:
````