Commit 56f607c

Merge pull request #573 from nf-core/dev
PR for release 2.2.0 "Ulm"
2 parents 26ae7e9 + 0b62ce5 commit 56f607c

136 files changed

Lines changed: 63175 additions & 5229 deletions

.dockstore.yml

Lines changed: 5 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,5 @@
1+
# Dockstore config version, not pipeline version
2+
version: 1.2
3+
workflows:
4+
- subclass: nfl
5+
primaryDescriptorPath: /nextflow.config

.github/.dockstore.yml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
# Dockstore config version, not pipeline version
2+
version: 1.2
3+
workflows:
4+
- subclass: nfl
5+
primaryDescriptorPath: /nextflow.config

.github/CONTRIBUTING.md

Lines changed: 154 additions & 4 deletions
@@ -18,8 +18,9 @@ If you'd like to write some code for nf-core/eager, the standard workflow is as
1818
1. Check that there isn't already an issue about your idea in the [nf-core/eager issues](https://github.com/nf-core/eager/issues) to avoid duplicating work
1919
* If there isn't one already, please create one so that others know you're working on this
2020
2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-core/eager repository](https://github.com/nf-core/eager) to your GitHub account
21-
3. Make the necessary changes / additions within your forked repository
22-
4. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged
21+
3. Make the necessary changes / additions within your forked repository (following [code contribution guidelines](https://github.com/nf-core/eager/blob/dev/.github/CONTRIBUTING.md))
22+
4. Use `nf-core schema build .` and add any new parameters to the pipeline JSON schema (requires nf-core tools >= 1.10).
23+
5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged
2324

2425
If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/).
2526

@@ -46,12 +47,161 @@ These tests are run both with the latest available version of `Nextflow` and als
4647

4748
## Patch
4849

49-
: warning: Only in the unlikely and regretful event of a release happening with a bug.
50+
:warning: Only in the unlikely and regretful event of a release happening with a bug.
5051

5152
* On your own fork, make a new branch `patch` based on `upstream/master`.
5253
* Fix the bug, and bump version (X.Y.Z+1).
5354
* A PR should be made on `master` from `patch` to directly fix this particular bug.
5455

5556
## Getting help
5657

57-
For further information/help, please consult the [nf-core/eager documentation](https://nf-co.re/nf-core/eager/docs) and don't hesitate to get in touch on the nf-core Slack [#eager](https://nfcore.slack.com/channels/eager) channel ([join our Slack here](https://nf-co.re/join/slack)).
58+
For further information/help, please consult the [nf-core/eager documentation](https://nf-co.re/eager/latest/usage) and don't hesitate to get in touch on the nf-core Slack [#eager](https://nfcore.slack.com/channels/eager) channel ([join our Slack here](https://nf-co.re/join/slack)).
59+
60+
# Code Contribution Guidelines
61+
62+
To make the EAGER2 code and processing logic more understandable for new contributors, and to ensure quality, we are attempting to somewhat standardise the way the code is written.
63+
64+
If you wish to contribute a new module, please use the following coding standards.
65+
66+
The typical workflow for adding a new module is as follows:
67+
68+
1. Define the corresponding input channel into your new process from the expected previous process channel (or re-routing block, see below).
69+
2. Write the process block (see below).
70+
3. Define the output channel if needed (see below).
71+
4. Add any new flags/options to `nextflow.config` with a default (see below).
72+
5. Add any new flags/options to `nextflow_schema.json` with help text (with `nf-core schema build .`)
73+
6. Add any new flags/options to the help message (for integer/text parameters, print to help the corresponding `nextflow.config` parameter).
74+
7. Add sanity checks for all relevant parameters.
75+
8. Add any new software to the `scrape_software_versions.py` script in `bin/` and the version command to the `scrape_software_versions` process in `main.nf`.
76+
9. Do local tests that the new code works properly and as expected.
77+
10. Add a new test command in `.github/workflows/ci.yaml`.
78+
11. If applicable, add a [MultiQC](https://multiqc.info/) module.
79+
12. Update MultiQC config `assets/multiqc_config.yaml` so relevant suffixes, name clean up, General Statistics Table column order, and module figures are in the right order.
80+
13. Add new flags/options to 'usage' documentation under `docs/usage.md`.
81+
14. Add any descriptions of MultiQC report sections and output files to `docs/output.md`.
82+
83+
## Default Values
84+
85+
Default values should go in `nextflow.config` under the `params` scope, and `nextflow_schema.json` (latter with `nf-core schema build .`)
86+
87+
## Default resource processes
88+
89+
Recommended 'minimum' resource requirements (CPUs/memory) for a process should be defined in `conf/base.config`. These can then be used within the process via the `${task.cpus}` or `${task.memory}` variables in the `script:` block.
90+
91+
## Process Concept
92+
93+
We are providing a highly configurable pipeline, with many options to turn on and off different processes in different combinations. This can make a very complex graph structure that can cause a large amount of duplicated channels coming out of every process to account for each possible combination.
94+
95+
The EAGER pipeline can currently be broken down into the following 'stages', where a stage is a collection of non-terminal, mutually exclusive processes whose output is used by another file-processing module (but not by reporting!).
96+
97+
* Input
98+
* Convert BAM
99+
* PolyG Clipping
100+
* AdapterRemoval
101+
* Mapping (either `bwa`, `bwamem`, or `circularmapper`)
102+
* BAM Filtering
103+
* Deduplication (either `dedup` or `markduplicates`)
104+
* BAM Trimming
105+
* PMDtools
106+
* Genotyping
107+
108+
Every step can potentially be skipped, therefore the output of a previous stage must be able to be passed to the next stage, if the given stage is not run.
109+
110+
To somewhat simplify this logic, we have implemented the following structure.
111+
112+
The concept is as follows:
113+
114+
* Every 'stage' of the pipeline (i.e. collection of mutually exclusive processes) must always have an if-else statement following it.
115+
* This if else 'bypass' statement collects and standardises all possible input files into single channel(s) for the next stage.
116+
* Importantly - within the bypass statement, a channel from the previous stage's bypass mixes into these output channels. This additional channel is named `ch_previousstage_for_skipcurrentstage`. This contains the output from the previous stage, i.e. not the modified version from the current stage.
117+
* The bypass statement works as follows:
118+
* If the current stage is turned on: will mix the previous stage and current stage output and filter for file suffixes unique to the current stage output
119+
* If the current stage is turned off or skipped: will mix the previous stage and current stage output. However, as there are no files in the output channel from the current stage, no filtering is required and the files in the `ch_XXX_for_skipXXX` channel will be used.
120+
121+
This ensures that the channel inputs to the next stage are 'homogeneous', i.e. they all come from the same source (the bypass statement).
122+
123+
An example schematic can be given as follows
124+
125+
```nextflow
126+
// PREVIOUS STAGE OUTPUT
127+
if (params.run_bam_filtering) {
128+
ch_input_for_skipconvertbam.mix(ch_output_ch_convertbam)
129+
.filter{ it =~/.*converted.fq/}
130+
.into { ch_convertbam_for_fastp; ch_convertbam_for_skipfastp }
131+
} else {
132+
ch_input_for_skipconvertbam
133+
.into { ch_convertbam_for_fastp; ch_convertbam_for_skipfastp }
134+
}
135+
136+
// SKIPPABLE CURRENT STAGE PROCESS
137+
process fastp {
138+
publishDir "${params.outdir}/fastp", mode: 'copy'
139+
140+
when:
141+
params.run_fastp
142+
143+
input:
144+
file fq from ch_convertbam_for_fastp
145+
146+
output:
147+
file "*pG.fq" into ch_output_from_fastp
148+
149+
script:
150+
"""
151+
echo "I have been fastp'd" > ${fq}
152+
mv ${fq} ${fq}.pG.fq
153+
"""
154+
}
155+
156+
// NEXT STAGE INPUT PREPARATION
157+
if (params.run_fastp) {
158+
ch_convertbam_for_skipfastp.mix(ch_output_from_fastp)
159+
.filter { it =~/.*pG.fq/ }
160+
.into { ch_fastp_for_adapterremoval; ch_fastp_for_skipadapterremoval }
161+
} else {
162+
ch_convertbam_for_skipfastp
163+
.into { ch_fastp_for_adapterremoval; ch_fastp_for_skipadapterremoval }
164+
}
165+
166+
```
167+
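The bypass logic can also be reasoned about outside Nextflow. The following is a minimal Python sketch, not pipeline code: channels are modelled as plain lists of file names, and the function name `bypass` and the example file names are hypothetical.

```python
import re


def bypass(previous_stage, current_stage, stage_enabled, suffix_pattern):
    """Model of a bypass statement: return the files passed to the next stage.

    previous_stage: files from the previous stage's bypass (the 'skip' channel)
    current_stage:  files emitted by the current stage (empty if it did not run)
    suffix_pattern: regex matching names unique to the current stage's output
    """
    if stage_enabled:
        # Mix both channels, then keep only the current stage's output.
        mixed = previous_stage + current_stage
        return [f for f in mixed if re.search(suffix_pattern, f)]
    # Stage skipped: the previous stage's files pass through untouched.
    return list(previous_stage)


# Hypothetical file names, for illustration only
converted = ["sampleA.converted.fq", "sampleB.converted.fq"]
fastp_out = ["sampleA.converted.fq.pG.fq", "sampleB.converted.fq.pG.fq"]

with_fastp = bypass(converted, fastp_out, True, r"pG\.fq$")
without_fastp = bypass(converted, [], False, r"pG\.fq$")
print(with_fastp)
print(without_fastp)
```

Either way, the next stage receives a single list from one source, which is the point of the bypass statement.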
168+
## Naming Schemes
169+
170+
Please use the following naming schemes, to make it easy to understand what is going where.
171+
172+
* process output: `ch_output_from_<process>` (this should always go into the bypass statement described above).
173+
* skipped process output: `ch_<previousstage>_for_<skipprocess>` (this goes out of the bypass statement described above)
174+
* process inputs: `ch_<previousstage>_for_<process>` (this goes into a process)
175+
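As a quick check of the naming scheme, the three channel names for one skippable process can be generated from the stage names. This is only an illustrative Python sketch; the helper `channel_names` is hypothetical and not part of the pipeline.

```python
def channel_names(previous_stage, process):
    """Return the three channel names for one skippable process."""
    return {
        # goes into the process
        "input": f"ch_{previous_stage}_for_{process}",
        # carries the previous stage's files past the process
        "skip": f"ch_{previous_stage}_for_skip{process}",
        # comes out of the process and goes into the bypass statement
        "output": f"ch_output_from_{process}",
    }


names = channel_names("convertbam", "fastp")
print(names)
```

For `convertbam` followed by `fastp`, this gives the same channel names as the schematic in the Process Concept section.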
176+
## Nextflow Version Bumping
177+
178+
If you have agreement from reviewers, you may bump the 'default' minimum version of nextflow (e.g. for testing), with `nf-core bump-version`.
179+
180+
## Software Version Reporting
181+
182+
If you add a new tool to the pipeline, please ensure you add the information of the tool to the `get_software_version` process.
183+
184+
Add to the script block of the process, something like the following:
185+
186+
```bash
187+
<YOUR_TOOL> --version &> v_<YOUR_TOOL>.txt 2>&1 || true
188+
```
189+
190+
or
191+
192+
```bash
193+
<YOUR_TOOL> --help | head -n 1 &> v_<YOUR_TOOL>.txt 2>&1 || true
194+
```
195+
196+
You then need to edit the script `bin/scrape_software_versions.py` to
197+
198+
1. add a (Python) regex for your tool's `--version` output (as stored in the `v_<YOUR_TOOL>.txt` file), to ensure the version is reported as a `v` followed by the version number, e.g. `v2.1.1`
199+
2. add an HTML block entry to the `OrderedDict` for formatting in MultiQC.
200+
201+
> If a tool unfortunately does not offer any way of printing its version, you may add this 'manually', e.g. with `echo "v1.1" > v_<YOUR_TOOL>.txt`
202+
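The regex step might look something like the sketch below. It does not reproduce the actual contents of `bin/scrape_software_versions.py`; the tool name `mytool`, its version string, and the helper `extract_version` are assumptions made purely for illustration.

```python
import re


def extract_version(raw_output, pattern):
    """Return the version as 'v<number>', or None if the pattern does not match."""
    match = re.search(pattern, raw_output)
    if match is None:
        return None
    return "v{}".format(match.group(1))


# Hypothetical contents of v_mytool.txt
raw = "mytool version 2.1.1 (build 42)\n"
version = extract_version(raw, r"mytool version (\S+)")
print(version)
```

Capturing only the version number and prepending `v` in one place keeps the reported format consistent across tools whose `--version` output differs.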
203+
## Images and Figures
204+
205+
For all internal nf-core/eager documentation images we are using the 'Kalam' font by the Indian Type Foundry, licensed under the Open Font License. It can be downloaded [here](https://fonts.google.com/specimen/Kalam).
206+
207+
For the overview image we follow the nf-core [style guidelines](https://nf-co.re/developers/design_guidelines).

.github/ISSUE_TEMPLATE/bug_report.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,24 @@
11
# nf-core/eager bug report
22

3+
<!--
34
Hi there!
45
56
Thanks for telling us about a problem with the pipeline.
7+
68
Please delete this text and anything that's not relevant from the template below:
9+
-->
10+
11+
## Check Documentation
12+
13+
Have you checked in the following places for your error?:
14+
15+
- [ ] [Frequently Asked Questions](https://github.com/nf-core/eager/blob/master/docs/usage.md#troubleshooting-and-faqs)
16+
(for nf-core/eager specific information)
17+
- [ ] [Troubleshooting](https://nf-co.re/usage/troubleshooting)
18+
(for nf-core specific information)
19+
20+
Please also check the corresponding version's documentation on GitHub, if not
21+
testing the latest release.
722

823
## Describe the bug
924

@@ -20,6 +35,12 @@ Steps to reproduce the behaviour:
2035

2136
A clear and concise description of what you expected to happen.
2237

38+
## Log files
39+
40+
1. The command used to run the pipeline
41+
2. The `.nextflow.log` file (which is a hidden file in whichever place you _ran_
42+
the pipeline from - not necessarily in the output directory!)
43+
2344
## System
2445

2546
- Hardware: <!-- [e.g. HPC, Desktop, Cloud...] -->
Lines changed: 7 additions & 5 deletions
@@ -1,24 +1,26 @@
11
# nf-core/eager feature request
2+
<!--
23
34
Hi there!
45
56
Thanks for suggesting a new feature for the pipeline!
67
Please delete this text and anything that's not relevant from the template below:
8+
-->
79

810
## Is your feature request related to a problem? Please describe
911

10-
A clear and concise description of what the problem is.
12+
<!-- A clear and concise description of what the problem is. -->
1113

12-
Ex. I'm always frustrated when [...]
14+
<!-- e.g. [I'm always frustrated when ...] -->
1315

1416
## Describe the solution you'd like
1517

16-
A clear and concise description of what you want to happen.
18+
<!-- A clear and concise description of what you want to happen. -->
1719

1820
## Describe alternatives you've considered
1921

20-
A clear and concise description of any alternative solutions or features you've considered.
22+
<!-- A clear and concise description of any alternative solutions or features you've considered. -->
2123

2224
## Additional context
2325

24-
Add any other context about the feature request here.
26+
<!-- Add any other context about the feature request here. -->

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 2 additions & 0 deletions
@@ -1,9 +1,11 @@
11
# nf-core/eager pull request
2+
<!--
23
34
Many thanks for contributing to nf-core/eager!
45
56
Please fill in the appropriate checklist below (delete whatever is not relevant).
67
These are the most common things requested on pull requests (PRs).
8+
-->
79

810
## PR checklist
911

.github/markdownlint.yml

Lines changed: 3 additions & 0 deletions
@@ -7,3 +7,6 @@ no-inline-html:
77
allowed_elements:
88
- img
99
- p
10+
- kbd
11+
- details
12+
- summary

.github/workflows/awsfulltest.yml

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
2+
name: nf-core AWS full size tests
3+
# This workflow is triggered on published releases.
4+
5+
on:
6+
release:
7+
types: [published]
8+
9+
jobs:
10+
run-awstest:
11+
name: Run AWS full tests
12+
if: github.repository == 'nf-core/eager'
13+
runs-on: ubuntu-latest
14+
steps:
15+
- name: Setup Miniconda
16+
uses: goanpeca/setup-miniconda@v1.0.2
17+
with:
18+
auto-update-conda: true
19+
python-version: 3.7
20+
- name: Install awscli
21+
run: conda install -c conda-forge awscli
22+
- name: Start AWS batch job
23+
# Add full size test data (but still relatively small datasets for few samples)
24+
# on the `test_full.config` test runs with only one set of parameters
25+
# Then specify `-profile test_full` instead of `-profile test` on the AWS batch command
26+
env:
27+
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
28+
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
29+
TOWER_ACCESS_TOKEN: ${{ secrets.AWS_TOWER_TOKEN }}
30+
AWS_JOB_DEFINITION: ${{ secrets.AWS_JOB_DEFINITION }}
31+
AWS_JOB_QUEUE: ${{ secrets.AWS_JOB_QUEUE }}
32+
AWS_S3_BUCKET: ${{ secrets.AWS_S3_BUCKET }}
33+
run: |
34+
aws batch submit-job \
35+
--region eu-west-1 \
36+
--job-name nf-core-eager \
37+
--job-queue $AWS_JOB_QUEUE \
38+
--job-definition $AWS_JOB_DEFINITION \
39+
--container-overrides '{"command": ["nf-core/eager", "-r '"${GITHUB_SHA}"' -profile awsfulltest --outdir s3://'"${AWS_S3_BUCKET}"'/eager/results-'"${GITHUB_SHA}"' -w s3://'"${AWS_S3_BUCKET}"'/eager/work-'"${GITHUB_SHA}"' -with-tower"], "environment": [{"name": "TOWER_ACCESS_TOKEN", "value": "'"$TOWER_ACCESS_TOKEN"'"}]}'
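The nested shell quoting in the `--container-overrides` argument is easy to get wrong. As a rough illustration of the JSON structure it is meant to produce, here is a Python sketch that builds the same payload; the function name `container_overrides` and the placeholder values are hypothetical, and the workflow itself uses the shell command above.

```python
import json


def container_overrides(sha, bucket, tower_token, profile="awsfulltest"):
    """Build the JSON string passed to `aws batch submit-job --container-overrides`."""
    args = (
        f"-r {sha} -profile {profile} "
        f"--outdir s3://{bucket}/eager/results-{sha} "
        f"-w s3://{bucket}/eager/work-{sha} -with-tower"
    )
    overrides = {
        # First element is the pipeline name, second is the argument string
        "command": ["nf-core/eager", args],
        "environment": [{"name": "TOWER_ACCESS_TOKEN", "value": tower_token}],
    }
    return json.dumps(overrides)


# Placeholder values, for illustration only
payload = container_overrides("56f607c", "my-bucket", "dummy-token")
print(payload)
```

Letting a JSON library do the serialisation avoids hand-balancing quotes, at the cost of an extra scripting step in the workflow.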

.github/workflows/awstest.yml

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
name: nf-core AWS test
2+
# This workflow is triggered on push to the master branch.
3+
# It runs the -profile 'test' on AWS batch
4+
5+
on:
6+
push:
7+
branches:
8+
- master
9+
10+
jobs:
11+
run-awstest:
12+
name: Run AWS tests
13+
if: github.repository == 'nf-core/eager'
14+
runs-on: ubuntu-latest
15+
steps:
16+
- name: Setup Miniconda
17+
uses: goanpeca/setup-miniconda@v1.0.2
18+
with:
19+
auto-update-conda: true
20+
python-version: 3.7
21+
- name: Install awscli
22+
run: conda install -c conda-forge awscli
23+
- name: Start AWS batch job
24+
# For example: adding multiple test runs with different parameters
25+
# Remember that you can parallelise this by using strategy.matrix
26+
env:
27+
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
28+
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
29+
TOWER_ACCESS_TOKEN: ${{ secrets.AWS_TOWER_TOKEN }}
30+
AWS_JOB_DEFINITION: ${{ secrets.AWS_JOB_DEFINITION }}
31+
AWS_JOB_QUEUE: ${{ secrets.AWS_JOB_QUEUE }}
32+
AWS_S3_BUCKET: ${{ secrets.AWS_S3_BUCKET }}
33+
run: |
34+
aws batch submit-job \
35+
--region eu-west-1 \
36+
--job-name nf-core-eager \
37+
--job-queue $AWS_JOB_QUEUE \
38+
--job-definition $AWS_JOB_DEFINITION \
39+
--container-overrides '{"command": ["nf-core/eager", "-r '"${GITHUB_SHA}"' -profile test_tsv_complex --outdir s3://'"${AWS_S3_BUCKET}"'/eager/results-'"${GITHUB_SHA}"' -w s3://'"${AWS_S3_BUCKET}"'/eager/work-'"${GITHUB_SHA}"' -with-tower"], "environment": [{"name": "TOWER_ACCESS_TOKEN", "value": "'"$TOWER_ACCESS_TOKEN"'"}]}'

.github/workflows/branch.yml

Lines changed: 26 additions & 5 deletions
@@ -3,14 +3,35 @@ name: nf-core branch protection
33
# It fails when someone tries to make a PR against the nf-core `master` branch instead of `dev`
44
on:
55
pull_request:
6-
branches:
7-
- master
6+
branches: [master]
87

98
jobs:
109
test:
11-
runs-on: ubuntu-18.04
10+
runs-on: ubuntu-latest
1211
steps:
13-
# PRs are only ok if coming from an nf-core `dev` branch or a fork `patch` branch
12+
# PRs to the nf-core repo master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
1413
- name: Check PRs
14+
if: github.repository == 'nf-core/eager'
1515
run: |
16-
{ [[ $(git remote get-url origin) == *nf-core/eager ]] && [[ ${GITHUB_HEAD_REF} = "dev" ]]; } || [[ ${GITHUB_HEAD_REF} == "patch" ]]
16+
{ [[ ${{github.event.pull_request.head.repo.full_name}} == nf-core/eager ]] && [[ $GITHUB_HEAD_REF = "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]]
17+
18+
19+
# If the above check failed, post a comment on the PR explaining the failure
20+
# NOTE - this doesn't currently work if the PR is coming from a fork, due to limitations in GitHub actions secrets
21+
- name: Post PR comment
22+
if: failure()
23+
uses: mshick/add-pr-comment@v1
24+
with:
25+
message: |
26+
Hi @${{ github.event.pull_request.user.login }},
27+
28+
It looks like this pull-request has been made against the ${{github.event.pull_request.head.repo.full_name}} `master` branch.
29+
The `master` branch on nf-core repositories should always contain code from the latest release.
30+
Because of this, PRs to `master` are only allowed if they come from the ${{github.event.pull_request.head.repo.full_name}} `dev` branch.
31+
32+
You do not need to close this PR, you can change the target branch to `dev` by clicking the _"Edit"_ button at the top of this page.
33+
34+
Thanks again for your contribution!
35+
repo-token: ${{ secrets.GITHUB_TOKEN }}
36+
allow-repeats: false
37+
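The condition in the `Check PRs` step reads as "(PR comes from the nf-core/eager repository AND from its `dev` branch) OR the head branch is named `patch`". A small Python sketch of the same truth table, with a hypothetical function name and example repository names, may make the grouping clearer:

```python
def pr_to_master_allowed(head_repo, head_ref, upstream="nf-core/eager"):
    """Mirror the shell test: (upstream repo AND `dev` branch) OR any `patch` branch."""
    return (head_repo == upstream and head_ref == "dev") or head_ref == "patch"


# Hypothetical pull requests against `master`
print(pr_to_master_allowed("nf-core/eager", "dev"))     # upstream dev branch
print(pr_to_master_allowed("someuser/eager", "patch"))  # patch branch on a fork
print(pr_to_master_allowed("someuser/eager", "dev"))    # dev branch on a fork
```

Note that the `patch` branch is accepted from any repository, whereas `dev` is only accepted from the upstream one.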
