Commit d0f2f5e

pditommaso and claude committed
Add ADR: unified record syntax for process inputs and outputs
Propose using the record() function-call notation uniformly for both process inputs and outputs, replacing the asymmetric Record { ... } block syntax currently used in inputs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
1 parent a217a45 commit d0f2f5e

1 file changed

Lines changed: 242 additions & 0 deletions

@@ -0,0 +1,242 @@
# Unified record syntax for process inputs and outputs

- Authors: Paolo Di Tommaso
- Status: proposed
- Deciders: Paolo Di Tommaso, Ben Sherman
- Date: 2026-03-12
- Tags: lang, records, syntax

Technical Story: Follow-up to [Record types ADR](20260306-record-types.md)

## Summary

The current record types implementation uses two different syntactic forms for records: block syntax in process inputs and function-call syntax in process outputs. This ADR proposes using the `record()` function-call notation uniformly for both inputs and outputs, combined with standard assignment and type annotations.

## Problem Statement

The accepted record types ADR ([20260306-record-types](20260306-record-types.md)) introduces two distinct syntactic forms for records within process definitions:

**Input** uses a `Record { ... }` block syntax unique to inputs:
```nextflow
process FASTQC {
    input:
    sample: Record {
        id: String
        fastq_1: Path
        fastq_2: Path
    }
    ...
}
```

**Output** uses a `record()` function call:
```nextflow
process FASTQC {
    ...
    output:
    record(id: sample.id, html: file('*.html'), zip: file('*.zip'))
}
```

This asymmetry means the same concept (a record) is expressed with two different syntactic forms depending on context. The block syntax `Record { ... }` exists only in process input declarations and has no counterpart elsewhere in the language. Meanwhile, the `record()` function call used in outputs is already a general-purpose construct usable in any expression context.

## Goals

- **Syntactic consistency**: use a single notation for records across inputs and outputs.
- **Alignment with existing syntax**: reuse assignment (`=`) and type annotation (`: Type`) patterns already present in process I/O, rather than introducing new block syntax.
- **Standard type semantics**: record assignments should follow the same type compatibility rules as any other typed assignment in the language.

## Non-goals

- Changing the top-level `record` type definition syntax. The `record Name { field: Type }` declaration form is a type-level construct and is not affected by this proposal.
- Changing the `record()` function runtime behavior or the `RecordMap` implementation.
- Removing support for external type references (e.g. `sample: Sample`).

## Considered Options

### Option 1: Current syntax (status quo)

Input uses a dedicated block syntax; output uses the `record()` function call:

```nextflow
process FASTQC {
    input:
    sample: Record {
        id: String
        fastq_1: Path
        fastq_2: Path
    }

    output:
    record(id: sample.id, html: file('*.html'), zip: file('*.zip'))
}
```

- Good, because input block syntax mirrors the top-level `record` definition.
- Bad, because it uses two different notations for the same concept in the same process definition.
- Bad, because `Record { ... }` block syntax only exists in input declarations; it is not a general-purpose construct.
- Bad, because output `record()` as a bare statement (no assignment) doesn't allow naming the output.

### Option 2: Block syntax for both inputs and outputs

Use `record { ... }` blocks in both input and output:

```nextflow
process FASTQC {
    input:
    record sample {
        id: String
        fastq_1: Path
        fastq_2: Path
    }

    output:
    record {
        id: String = sample.id
        html: Path = file('*.html')
        zip: Path = file('*.zip')
    }
}
```

- Good, because it is symmetric: the same block form is used on both sides.
- Bad, because the output block mixes type declarations with value assignments (`Path = file(...)`).
- Bad, because block syntax in process I/O diverges from the function-call style already established for `record()`.

### Option 3: Unified `record()` function notation with assignment

Use the `record()` function-call syntax for both inputs and outputs, with standard assignment:

```nextflow
process FASTQC {
    input:
    sample = record(id: String, fastq_1: Path, fastq_2: Path)

    output:
    result = record(id: sample.id, html: file('*.html'), zip: file('*.zip'))
}
```

With optional explicit type annotation:

```nextflow
process FASTQC {
    input:
    sample: Sample = record(id: String, fastq_1: Path, fastq_2: Path)

    output:
    result: QcResult = record(id: sample.id, html: file('*.html'), zip: file('*.zip'))
}
```

- Good, because it uses the same notation on both sides: `name = record(...)`.
- Good, because it reuses existing assignment and type annotation patterns.
- Good, because `record()` is already a general-purpose function, so no new syntax is needed.
- Good, because type annotations follow standard rules: `sample: Sample = record(...)` works like any typed assignment.
- Bad, because input `record()` arguments are types rather than values, which is a different usage of the function.

## Solution or decision outcome

**Option 3**: Use the `record()` function-call notation uniformly for both process inputs and outputs, combined with standard assignment (`=`) and optional type annotation (`: Type`).

## Rationale & discussion

The key insight is that the `record()` call is just a constructor, and everything else is standard Nextflow assignment and type annotation. This eliminates the need for a dedicated `Record { ... }` block syntax in process inputs.

### Syntax pattern

The unified pattern is `name: Type = record(...)` for both inputs and outputs, where the `: Type` annotation is optional:

- **Input**: `sample = record(id: String, fastq_1: Path, fastq_2: Path)` declares the fields and their types being received.
- **Output**: `result = record(id: sample.id, html: file('*.html'))` declares the fields and their values being produced.

The only difference is what goes inside the `record()` call: types on input (declaring structure), expressions on output (producing values). This parallels how assignment works elsewhere: the left side declares, the right side provides.
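
The dual use of one constructor can be modeled outside Nextflow. The following is a minimal Python sketch, not the Nextflow implementation: `record`, `is_shape`, and `conforms` are hypothetical helpers, and the conformance rule (every declared field is present with a matching type) is an assumption made for illustration.

```python
from pathlib import Path


def record(**fields):
    """Hypothetical stand-in for record(): returns a plain dict of named fields."""
    return dict(fields)


def is_shape(rec):
    """A record acts as a shape (input style) when every field holds a type."""
    return all(isinstance(value, type) for value in rec.values())


def conforms(value, shape):
    """Assumed rule: each field declared in the shape exists in the value
    and holds an instance of the declared type."""
    return all(
        name in value and isinstance(value[name], field_type)
        for name, field_type in shape.items()
    )


# Input-style call: the arguments are types.
sample_shape = record(id=str, fastq_1=Path, fastq_2=Path)

# Output-style call: the arguments are values.
sample_value = record(id="a", fastq_1=Path("a_1.fastq"), fastq_2=Path("a_2.fastq"))
```

In this model the same call produces either a shape or a value, and only the kind of argument tells them apart, which is the trade-off noted under Option 3.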

### Type annotations

Type annotations are optional and follow standard semantics:

```nextflow
// Inferred type from record fields
sample = record(id: String, fastq_1: Path, fastq_2: Path)

// Explicit type: compiler checks compatibility with Sample
sample: Sample = record(id: String, fastq_1: Path, fastq_2: Path)
```

This is the same as writing `x: Integer = 42` vs `x = 42`; nothing about the assignment semantics is record-specific.
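
This ADR does not spell out what the compatibility check involves. As a rough illustration only, the Python sketch below assumes one possible rule: every field of the named type must be declared inline with an identical type, and extra inline fields are tolerated. `RECORD_TYPES` and `check_annotation` are hypothetical names, not part of Nextflow.

```python
from pathlib import Path

# Hypothetical registry standing in for top-level `record Name { ... }` definitions.
RECORD_TYPES = {
    "Sample": {"id": str, "fastq_1": Path, "fastq_2": Path},
}


def check_annotation(type_name, inline_fields):
    """Return the mismatches between an inline record shape and a named type.

    Assumed rule (not taken from the ADR): each field of the named type must be
    declared inline with an identical type; extra inline fields are ignored.
    """
    problems = []
    for name, expected in RECORD_TYPES[type_name].items():
        if name not in inline_fields:
            problems.append(f"missing field '{name}'")
        elif inline_fields[name] is not expected:
            actual = inline_fields[name].__name__
            problems.append(f"field '{name}' is {actual}, expected {expected.__name__}")
    return problems


# Matches the Sample definition, so no problems are reported.
ok = check_annotation("Sample", {"id": str, "fastq_1": Path, "fastq_2": Path})

# Wrong type for `id` and no `fastq_2` field.
bad = check_annotation("Sample", {"id": int, "fastq_1": Path})
```

A stricter rule (for example, rejecting extra fields) would change only the loop body; the point is that the check is an ordinary typed-assignment check applied to a record shape.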

### Alignment with existing process syntax

The proposed syntax reuses patterns that already exist in Nextflow process definitions:

| Existing pattern | Example | Record equivalent |
|------------------|---------|-------------------|
| Assignment in output | `id = sample.id` | `result = record(...)` |
| Typed assignment in output | `id: String = sample.id` | `result: QcResult = record(...)` |
| Type annotation in input | `id: String` | `sample: Sample = record(...)` |

### External type reference

When using a pre-defined record type, the syntax naturally simplifies:

```nextflow
// With inline fields
sample: Sample = record(id: String, fastq_1: Path, fastq_2: Path)

// With external type only (no inline fields needed)
sample: Sample
```

The `sample: Sample` shorthand remains valid; the `record()` call is only needed when defining fields inline.
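
One way to picture the shorthand is as a lookup: when no inline fields are given, the fields come from the named type. The Python sketch below models that reading under stated assumptions; `RECORD_TYPES` and `resolve_input_fields` are hypothetical, and letting inline fields take precedence is an illustrative choice rather than documented behavior.

```python
from pathlib import Path

# Hypothetical registry standing in for top-level `record Name { ... }` definitions.
RECORD_TYPES = {
    "Sample": {"id": str, "fastq_1": Path, "fastq_2": Path},
}


def resolve_input_fields(type_name=None, inline_fields=None):
    """Return the fields of an input declaration.

    Assumption: inline fields are used when present; otherwise the fields
    are taken from the named record type.
    """
    if inline_fields is not None:
        return dict(inline_fields)
    if type_name is not None:
        return dict(RECORD_TYPES[type_name])
    raise ValueError("an input needs a type annotation, inline fields, or both")


# Models `sample: Sample`
from_type = resolve_input_fields(type_name="Sample")

# Models `sample = record(id: String)`
from_inline = resolve_input_fields(inline_fields={"id": str})
```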

### Full example

```nextflow
nextflow.preview.types = true

record Sample {
    id: String
    fastq_1: Path
    fastq_2: Path
}

process TOUCH {
    input:
    id: String

    output:
    result = record(id: id, fastq_1: file('*_1.fastq'), fastq_2: file('*_2.fastq'))

    script:
    """
    touch ${id}_1.fastq
    touch ${id}_2.fastq
    """
}

process FASTQC {
    input:
    sample: Sample = record(id: String, fastq_1: Path, fastq_2: Path)

    output:
    result = record(id: sample.id, html: file('*.html'), zip: file('*.zip'))

    script:
    """
    touch ${sample.id}.html
    touch ${sample.id}.zip
    """
}

workflow {
    ch_samples = TOUCH(channel.of('a', 'b', 'c'))
    ch_fastqc = FASTQC(ch_samples)
    ch_fastqc.view()
}
```
238+
239+
## Links
240+
241+
- Supersedes input syntax in [Record types ADR](20260306-record-types.md)
242+
- Related: [Record types syntax summary](../plans/record-types-syntax-new.md)
