Background
A pipeline-wide review against origin/dev (post-3.26.0, SHA e64c3f753) surfaced a handful of small, no-behaviour-change items that are cheaper to bundle into one PR than to ship separately. None of them affects runtime behaviour or test snapshots. The single correctness fix (& → &&) is included because it is a trivial three-line change in code paths that already evaluate to the right value: on Boolean operands, Groovy's & returns the same result as &&, only without short-circuiting.
This is a single-PR, single-CI-run task. Each sub-item below is independently verifiable.
Tasks
1. README QC list — Preseq is not a default; RustQC is missing
README.md lines 46-50 list Preseq inside the "Extensive quality control" section as if it runs by default:
14. Extensive quality control:
    1. [`RSeQC`](http://rseqc.sourceforge.net/)
    2. [`Qualimap`](http://qualimap.bioinfo.cipf.es/)
    3. [`dupRadar`](https://bioconductor.org/packages/release/bioc/html/dupRadar.html)
    4. [`Preseq`](http://smithlabresearch.org/software/preseq/)
    5. [`DESeq2`](https://bioconductor.org/packages/release/bioc/html/DESeq2.html)
But nextflow.config:107 defaults skip_preseq = true, and docs/usage.md:398 correctly says "does not normally run Preseq". RustQC has full sections in docs/usage.md:372-409 and docs/output.md:546+ but is not mentioned in the README at all.
Action: in README.md, mark Preseq as optional (e.g. [Preseq](...) (*disabled by default*, enable with --skip_preseq false or via --use_rustqc)). Add one bullet pointing at RustQC as a unified replacement for RSeQC/Qualimap/dupRadar/Preseq/SAMtools-stats, e.g. Or use [RustQC](https://github.com/seqeralabs/rustqc) (single-pass replacement for RSeQC/Qualimap/dupRadar/Preseq, enabled with --use_rustqc; see usage docs).
2. Schema: drop TODO comments leaked into user-facing description text
nextflow_schema.json:553 and :559 — the kallisto_quant_fraglen and kallisto_quant_fraglen_sd description fields end with "TODO: use existing RSeQC results to do this dynamically.". This text is rendered verbatim in the schema-generated help.
Action:
- Strip the TODO: ... sentence from both descriptions.
- Add a unit clarification: these values are in base pairs.
- If the TODO is still genuinely on someone's roadmap, open a tracking issue and link it from a code comment in workflows/rnaseq/main.nf near the kallisto invocation; do not leave it in user-facing schema text.
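One way to confirm the cleanup is to search the schema for description lines that still carry a TODO marker. The helper below is a minimal sketch: find_schema_todos is a hypothetical name, and it assumes the schema is pretty-printed with each description on a single line.

```shell
# Hypothetical helper: print (with line numbers) every "description" line in a
# schema file that still contains a TODO marker.
# Assumes each description sits on one line of the JSON file.
find_schema_todos() {
    grep -nE '"description"[[:space:]]*:.*TODO' "$1"
}
```

Once both sentences are stripped, find_schema_todos nextflow_schema.json should print nothing and return a non-zero status, as grep does when nothing matches.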
3. Schema: declare "default": false for skip_* params that are missing it
In the QC skip block of nextflow_schema.json, only skip_preseq (line 800) declares "default": true. The others default to false in nextflow.config but the schema doesn't say so:
- skip_dupradar (line 803): nextflow.config:108 defaults to false
- skip_qualimap (line 808): nextflow.config:109 defaults to false
- skip_rseqc (line 813): nextflow.config:118 defaults to false
- skip_biotype_qc (line 818): verify default in nextflow.config
- skip_deseq2_qc (line 823): verify default in nextflow.config
Action: add "default": false to each of the five entries to match nextflow.config, after confirming the skip_biotype_qc and skip_deseq2_qc defaults there. While there, append a note to each description that the param has no effect when --use_rustqc is enabled (RustQC subsumes these tools).
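The missing defaults can be listed mechanically before and after the edit. The sketch below uses awk rather than a JSON parser, so it rests on layout assumptions: the schema is pretty-printed, each skip_* property opens with "name": { on its own line, closes with } or }, on its own line, and contains no nested objects. list_skip_params_without_default is a hypothetical name, and it reports every skip_* property it sees, not only the QC ones.

```shell
# Hypothetical helper: print the name of each "skip_*" property in a
# pretty-printed JSON schema that has no "default" key.
# Layout assumptions are described in the text above; this is not a JSON parser.
list_skip_params_without_default() {
    awk '
        # Opening line of a skip_* property: remember its name.
        /"skip_[a-z0-9_]+"[ \t]*:[ \t]*[{]/ {
            match($0, /"skip_[a-z0-9_]+"/)
            name = substr($0, RSTART + 1, RLENGTH - 2)
            has_default = 0
            next
        }
        # A "default" key inside the current property.
        name != "" && /"default"[ \t]*:/ { has_default = 1 }
        # Closing brace of the current property: report it if no default was seen.
        name != "" && /^[ \t]*[}],?[ \t]*$/ {
            if (!has_default) print name
            name = ""
        }
    ' "$1"
}
```

Running list_skip_params_without_default nextflow_schema.json after the change should no longer list the five QC parameters named above.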
4. workflows/rnaseq/main.nf:429,459,782 — & → &&
Three guard conditions use bitwise AND (&) instead of logical AND (&&), for example:
if (!params.skip_qc & !params.skip_deseq2_qc & !params.skip_quantification_merge) {
Behaviour is identical because the operands are Boolean negations, and Groovy's & on Booleans is a logical AND that merely skips short-circuiting. It is, however, inconsistent with the rest of the file (see workflows/rnaseq/main.nf:638 for an && example) and is a recurring "did you mean...?" question for reviewers.
Action: change all three lines to use &&. Pure correctness/style fix, no snapshot impact.
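To check that no lone & remains after the edit, a simple pattern search is usually enough. The sketch below uses a hypothetical helper name, find_bitwise_and, and a purely textual regex: it will also flag an & inside a string or comment, so each hit still needs a quick manual look.

```shell
# Hypothetical helper: print (with line numbers) lines containing a single "&"
# that is not part of "&&". Textual match only; it does not parse Groovy.
find_bitwise_and() {
    grep -nE '(^|[^&])&([^&]|$)' "$1"
}
```

Before the change, find_bitwise_and workflows/rnaseq/main.nf would be expected to report the three guard lines; afterwards it should report none of them.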
5. modules/local/deseq2_qc/main.nf — replace shell-substitution loop in stub
Lines 75-78 of the stub block use backtick command-substitution against the $counts input file:
for i in `head $counts -n 1 | cut -f3-`;
do
    touch size_factors/\${i}.size_factors.RData
done
Stubs should produce deterministic output without depending on input content. nf-core convention is fixed touch calls.
Action: replace the loop with one or two fixed touches, e.g. touch size_factors/sample1.size_factors.RData. If a richer set is needed for downstream stub testing, add a comment explaining why.
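A minimal sketch of the replacement lines follows. It covers only the lines that stand in for the loop; "sample1" is the placeholder name suggested above, and the mkdir -p is an assumption added so the snippet runs on its own (the real stub may already create the directory).

```shell
# Sketch of the stub lines that would replace the loop.
# "sample1" is a placeholder name, not a real sample ID.
# mkdir -p is included only so this snippet is runnable standalone.
mkdir -p size_factors
touch size_factors/sample1.size_factors.RData
```

Because the fixed path contains no shell variable, no \$ escaping is needed when these lines sit inside the Nextflow stub string.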
6. conf/modules/featurecounts.config:14 — drop duplicate withName block
The selector withName: 'CUSTOM_MULTIQCCUSTOMBIOTYPE' exists in two files:
- conf/modules/multiqc_custom_biotype.config:2 (the natural home)
- conf/modules/featurecounts.config:14 (duplicate)
Per the inclusion order in nextflow.config, the multiqc_custom_biotype.config block wins, so the featurecounts.config copy is silently overridden while still looking authoritative to readers.
Action: delete lines 14-20 (the entire withName: 'CUSTOM_MULTIQCCUSTOMBIOTYPE' block) from conf/modules/featurecounts.config. Keep conf/modules/multiqc_custom_biotype.config as the single source.
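Repeated selectors like this one can be found with a short pipeline. The helper below is a sketch under assumptions: list_duplicate_selectors is a hypothetical name, and it only recognises selectors written as withName: '...' with single quotes. It reports any selector that occurs more than once in the files given, whether the repeats are in different files or in the same one.

```shell
# Hypothetical helper: print each single-quoted withName selector that occurs
# more than once across the given config files.
list_duplicate_selectors() {
    grep -hoE "withName:[[:space:]]*'[^']+'" "$@" | sort | uniq -d
}
```

For example, list_duplicate_selectors conf/modules/*.config should stop listing CUSTOM_MULTIQCCUSTOMBIOTYPE once the featurecounts.config block is deleted.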
7. conf/modules/align_star.config:6,140 — merge two adjacent withName blocks for the same selector
Two consecutive blocks use the character-identical selector '.*ALIGN_STAR:STAR_ALIGN|.*ALIGN_STAR:SENTIEON_STAR_ALIGN|.*ALIGN_STAR:PARABRICKS_RNA_FQ2BAM'. The first block sets ext.args, the second sets publishDir.
Action: merge the two blocks into one with both ext.args and publishDir. No behaviour change; pure readability.
8. Remove bin/fastq_dir_to_samplesheet.py
The script is not called by any .nf file in the pipeline. Git history confirms it is unmaintained:
- First added 2021-06-17 (fb916c7f0)
- Last functional change in 2023; the 2023-11 commit was a cross-repo "Add authors and licenses to scripts in bin/ where missing" sweep, not a content change
- Zero commits in the last 2 years (since 2024-05)
- The CHANGELOG already directs users to nf-core/fetchngs for samplesheet generation, which supersedes this script's purpose
Action:
- Delete bin/fastq_dir_to_samplesheet.py.
- Add a CHANGELOG entry under the next release: one sentence noting the removal and pointing users at nf-core/fetchngs (e.g. "Remove unmaintained bin/fastq_dir_to_samplesheet.py; use nf-core/fetchngs for samplesheet generation").
- If docs/usage.md references the script anywhere (grep first), strip those references and replace them with a one-liner pointing to nf-core/fetchngs.
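The "grep first" step can be as small as the sketch below. find_script_references is a hypothetical helper name; it does a fixed-string, recursive search over whatever files or directories it is given.

```shell
# Hypothetical helper: list every line (file:line:text) that mentions the
# script name under the given files or directories.
find_script_references() {
    grep -rnF 'fastq_dir_to_samplesheet' "$@"
}
```

Running find_script_references docs/usage.md before the edit shows which lines need the one-liner replacement; a non-zero exit status with no output means there is nothing to strip.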
Verification
- Run nf-core pipelines lint from the worktree; schema changes should pass.
- Run nf-test test --profile=+test,docker --tag default to exercise the changed main.nf paths once. Expect zero snapshot diffs (none of these items changes runtime behaviour).
- Eyeball the generated MultiQC report from results/multiqc/ to confirm the README/schema text changes haven't broken any anchor links.
Acceptance criteria
- nf-core pipelines lint passes
- nf-test test --tag default passes with zero snapshot diffs
Notes for the implementer
- All eight items are independent. If any one of them turns out to be more involved than expected (e.g. the doc rewrite for fastq_dir_to_samplesheet.py reveals broken behaviour), drop it from this PR and open a follow-up issue rather than expanding the scope.
- Keep the diff scannable. If after sub-items 1-7 the diff is already large, drop sub-item 8 to a follow-up.