diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 4d46a3ac7..864af6938 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -10,14 +10,15 @@ Remember that PRs should be made against the dev branch, unless you're preparing
Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-core/eager/tree/master/.github/CONTRIBUTING.md)
-->
+
## PR checklist
- [ ] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- - [ ] If you've added a new tool - add to the software_versions process and a regex to `scrape_software_versions.py`
- - [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/eager/tree/master/.github/CONTRIBUTING.md)
- - [ ] If necessary, also make a PR on the nf-core/eager _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository.
+ - [ ] If you've added a new tool - add to the software_versions process and a regex to `scrape_software_versions.py`
+  - [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/eager/tree/master/.github/CONTRIBUTING.md)
+ - [ ] If necessary, also make a PR on the nf-core/eager _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository.
- [ ] Make sure your code lints (`nf-core lint .`).
- [ ] Ensure the test suite passes (`nextflow run . -profile test,docker`).
- [ ] Usage Documentation in `docs/usage.md` is updated.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index c276df3d4..153a4befd 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -8,6 +8,9 @@ on:
release:
types: [published]
+# Uncomment if we need an edge release of Nextflow again
+# env: { NXF_EDGE: 1 }
+
jobs:
test:
name: Run workflow tests
@@ -20,7 +23,7 @@ jobs:
strategy:
matrix:
# Nextflow versions: check pipeline minimum and current latest
- nxf_ver: ['20.07.1', '21.03.0-edge']
+ nxf_ver: ['20.07.1', '']
steps:
- name: Check out pipeline code
uses: actions/checkout@v2
@@ -34,13 +37,13 @@ jobs:
- name: Build new docker image
if: env.MATCHED_FILES
- run: docker build --no-cache . -t nfcore/eager:2.3.4
+ run: docker build --no-cache . -t nfcore/eager:2.3.5
- name: Pull docker image
if: ${{ !env.MATCHED_FILES }}
run: |
docker pull nfcore/eager:dev
- docker tag nfcore/eager:dev nfcore/eager:2.3.4
+ docker tag nfcore/eager:dev nfcore/eager:2.3.5
- name: Install Nextflow
env:
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 79977f787..254ce499f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,16 +3,40 @@
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
+## v2.3.5 - 2021-06-03
+
+### `Added`
+
+- [#722](https://github.com/nf-core/eager/issues/722) - Adds bwa `-o` flag for more flexibility in bwa parameters
+- [#736](https://github.com/nf-core/eager/issues/736) - Add printing of the MultiQC run report location on successful completion
+- New logo that is more visible when a user is using darkmode on GitHub or nf-core website!
+
+### `Fixed`
+
+- [#723](https://github.com/nf-core/eager/issues/723) - Fixes empty fields in TSV resulting in uninformative error
+- Updated template to nf-core/tools 1.14
+- [#688](https://github.com/nf-core/eager/issues/688) - Clarified that the pipeline is not just for humans and microbes, but also for plants and animals, and for modern DNA as well
+- [#751](https://github.com/nf-core/eager/pull/751) - Added missing label to mtnucratio
+- General code cleanup and standardisation of parameters with no default setting
+- [#750](https://github.com/nf-core/eager/issues/750) - Fixed piped commands requesting the same number of CPUs at each command step
+- [#757](https://github.com/nf-core/eager/issues/757) - Removed confusing 'Data Type' variable from MultiQC workflow summary (not consistent with TSV input)
+- [#759](https://github.com/nf-core/eager/pull/759) - Fixed malformed software scraping regex that resulted in N/A in MultiQC report
+- [#761](https://github.com/nf-core/eager/pull/761) - Fixed instability of samtools filtering-related CI tests
+
+### `Dependencies`
+
+### `Deprecated`
+
## v2.3.4 - 2021-05-05
### `Added`
-- [#729](https://github.com/nf-core/eager/issues/729) Added Bowtie2 flag `--maxins` for PE mapping modern DNA mapping contexts
+- [#729](https://github.com/nf-core/eager/issues/729) - Added Bowtie2 flag `--maxins` for PE mapping in modern DNA contexts
### `Fixed`
- Corrected explanation of the "--min_adap_overlap" parameter for AdapterRemoval in the docs
-- [#725](https://github.com/nf-core/eager/pull/725) `bwa_index` doc update
+- [#725](https://github.com/nf-core/eager/pull/725) - `bwa_index` doc update
- Re-adds gzip piping to AdapterRemovalFixPrefix to speed up process after reports of being very slow
- Updated DamageProfiler citation from bioRxiv to publication
diff --git a/Dockerfile b/Dockerfile
index 12e7f7ec2..fc295bd15 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,4 +1,4 @@
-FROM nfcore/base:1.13.3
+FROM nfcore/base:1.14
LABEL authors="The nf-core/eager community" \
description="Docker image containing all software requirements for the nf-core/eager pipeline"
@@ -7,11 +7,7 @@ COPY environment.yml /
RUN conda env create --quiet -f /environment.yml && conda clean -a
# Add conda installation dir to PATH (instead of doing 'conda activate')
-ENV PATH /opt/conda/envs/nf-core-eager-2.3.4/bin:$PATH
+ENV PATH /opt/conda/envs/nf-core-eager-2.3.5/bin:$PATH
# Dump the details of the installed packages to a file for posterity
-RUN conda env export --name nf-core-eager-2.3.4 > nf-core-eager-2.3.4.yml
-
-# Instruct R processes to use these empty files instead of clashing with a local version
-RUN touch .Rprofile
-RUN touch .Renviron
+RUN conda env export --name nf-core-eager-2.3.5 > nf-core-eager-2.3.5.yml
\ No newline at end of file
diff --git a/README.md b/README.md
index eae0a3761..193a88d96 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# ![nf-core/eager](docs/images/nf-core_eager_logo.png)
+# ![nf-core/eager](docs/images/nf-core_eager_logo_outline_drop.png)
**A fully reproducible and state-of-the-art ancient DNA analysis pipeline**.
@@ -17,7 +17,7 @@
## Introduction
-**nf-core/eager** is a bioinformatics best-practise analysis pipeline for NGS sequencing based ancient DNA (aDNA) data analysis.
+**nf-core/eager** is a scalable and reproducible bioinformatics best-practice processing pipeline for genomic NGS data, with a focus on ancient DNA (aDNA). It is ideal for the (palaeo)genomic analysis of humans, animals, plants, microbes and even microbiomes.
The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. The pipeline pre-processes raw data from FASTQ or BAM inputs, aligns reads, and performs extensive general NGS and aDNA-specific quality control on the results. It comes with docker, singularity or conda containers, making installation trivial and results highly reproducible.
@@ -27,7 +27,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool
## Quick Start
-1. Install [`nextflow`](https://nf-co.re/usage/installation) (version >= 20.04.0)
+1. Install [`nextflow`](https://nf-co.re/usage/installation) (`>=20.07.1`)
2. Install any of [`Docker`](https://docs.docker.com/engine/installation/), [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/), [`Podman`](https://podman.io/), [`Shifter`](https://nersc.gitlab.io/development/shifter/how-to-use/) or [`Charliecloud`](https://hpc.github.io/charliecloud/) for full pipeline reproducibility _(please only use [`Conda`](https://conda.io/miniconda.html) as a last resort; see [docs](https://nf-co.re/usage/configuration#basic-configuration-profiles))_
diff --git a/assets/multiqc_config.yaml b/assets/multiqc_config.yaml
index 0d8c7c28a..060c92028 100644
--- a/assets/multiqc_config.yaml
+++ b/assets/multiqc_config.yaml
@@ -1,4 +1,4 @@
-custom_logo: 'nf-core_eager_logo.png'
+custom_logo: 'nf-core_eager_logo_outline_drop.png'
custom_logo_url: https://github.com/nf-core/eager/
custom_logo_title: 'nf-core/eager'
diff --git a/bin/scrape_software_versions.py b/bin/scrape_software_versions.py
index 5c9c0da9c..74d4ab0be 100755
--- a/bin/scrape_software_versions.py
+++ b/bin/scrape_software_versions.py
@@ -16,7 +16,7 @@
'Bowtie2': ['v_bowtie2.txt', r"bowtie2-([0-9]+\.[0-9]+\.[0-9]+) -fdebug"],
'Qualimap': ['v_qualimap.txt', r"QualiMap v.(\S+)"],
'GATK HaplotypeCaller': ['v_gatk.txt', r" v(\S+)"],
- #'GATK UnifiedGenotyper': ['v_gatk3_5.txt', r"version (\S+)"],
+ 'GATK UnifiedGenotyper': ['v_gatk3.txt', r"(\S+)"],
'bamUtil' : ['v_bamutil.txt', r"Version: (\S+);"],
'fastP': ['v_fastp.txt', r"([\d\.]+)"],
'DamageProfiler' : ['v_damageprofiler.txt', r"DamageProfiler v(\S+)"],
@@ -37,7 +37,7 @@
'kraken':['v_kraken.txt', r"Kraken version (\S+)"],
'eigenstrat_snp_coverage':['v_eigenstrat_snp_coverage.txt',r"(\S+)"],
'mapDamage2':['v_mapdamage.txt',r"(\S+)"],
- 'bbduk':['v_bbduk.txt',r"(\S+)"]
+ 'bbduk':['v_bbduk.txt',r"(.*)"]
}
results = OrderedDict()
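The `bbduk` regex change above (from `(\S+)` to `(.*)`) addresses the N/A entries reported in #759: a pattern requiring non-whitespace fails on an empty or space-containing version capture. A minimal Python sketch of the scraper's match-or-fallback logic (the function name and sample strings are illustrative, not from the pipeline):

```python
import re

def scrape_version(text, pattern):
    # Mirror the scraper's behaviour: report the captured group if the
    # regex matches, otherwise fall back to "N/A" (the symptom in #759).
    match = re.search(pattern, text)
    return match.group(1) if match else "N/A"

# r"(\S+)" needs at least one non-whitespace character, so an empty
# capture file yields N/A; r"(.*)" matches even an empty string.
print(scrape_version("", r"(\S+)"))      # -> N/A
print(scrape_version("", r"(.*)"))       # -> empty string
print(scrape_version("38.87", r"(.*)"))  # -> 38.87
```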
diff --git a/docs/images/nf-core_eager_logo.svg b/docs/images/nf-core_eager_logo.svg
index 4171682c7..63cc0d638 100644
--- a/docs/images/nf-core_eager_logo.svg
+++ b/docs/images/nf-core_eager_logo.svg
@@ -1,6 +1,4 @@
-
-
\ No newline at end of file
+ transform="rotate(44.4504,182.60395,246.33729)"
+ id="g4559-8"
+ clip-path="url(#clipPath4643)"
+ style="fill:#918a6f"
+ inkscape:export-xdpi="268.66"
+ inkscape:export-ydpi="268.66">
diff --git a/docs/images/nf-core_eager_logo.png b/docs/images/nf-core_eager_logo_flat_black.png
similarity index 100%
rename from docs/images/nf-core_eager_logo.png
rename to docs/images/nf-core_eager_logo_flat_black.png
diff --git a/docs/images/nf-core_eager_logo_flat_black.svg b/docs/images/nf-core_eager_logo_flat_black.svg
new file mode 100644
index 000000000..63cc0d638
--- /dev/null
+++ b/docs/images/nf-core_eager_logo_flat_black.svg
@@ -0,0 +1,498 @@
+
+
diff --git a/docs/images/nf-core_eager_logo_flat_light.png b/docs/images/nf-core_eager_logo_flat_light.png
new file mode 100644
index 000000000..3cc07149c
Binary files /dev/null and b/docs/images/nf-core_eager_logo_flat_light.png differ
diff --git a/docs/images/nf-core_eager_logo_flat_light.svg b/docs/images/nf-core_eager_logo_flat_light.svg
new file mode 100644
index 000000000..9fbd41a36
--- /dev/null
+++ b/docs/images/nf-core_eager_logo_flat_light.svg
@@ -0,0 +1,498 @@
+
+
diff --git a/docs/images/nf-core_eager_logo_outline_drop.png b/docs/images/nf-core_eager_logo_outline_drop.png
new file mode 100644
index 000000000..cb7362f23
Binary files /dev/null and b/docs/images/nf-core_eager_logo_outline_drop.png differ
diff --git a/docs/images/nf-core_eager_logo_outline_drop.svg b/docs/images/nf-core_eager_logo_outline_drop.svg
new file mode 100644
index 000000000..18a829d5f
--- /dev/null
+++ b/docs/images/nf-core_eager_logo_outline_drop.svg
@@ -0,0 +1,1073 @@
+
+
diff --git a/docs/usage.md b/docs/usage.md
index 82ce25cc0..c683a473c 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -930,11 +930,11 @@ can use yourself, or upload alongside your publication for others to use.
To use the profile you just need to specify the file containing the profile you
wish to use, and then the profile itself.
-For example, Aida (Andrades Valtueña) on her cluster `sdag` at the MPI-SHH
-(`shh`) in Jena could run the following:
+For example, Aida (Andrades Valtueña) at the MPI-SHH (`shh`) in Jena could run
+the following:
```bash
-nextflow run nf-core/eager -c ///AndradesValtuena2018.config -profile shh,sdag,AndradesValtuena2018 --input '////' <...>
+nextflow run nf-core/eager -c ///AndradesValtuena2018.config -profile shh,AndradesValtuena2018 --input '////' <...>
```
Then a colleague at a different institution, such as the SciLifeLab, could run
@@ -1026,7 +1026,7 @@ running.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
<...>
```
@@ -1034,8 +1034,8 @@ nextflow run nf-core/eager \
For the `-profile` parameter, I have indicated that I wish to use Singularity as
my software container environment, and I will use the MPI-SHH institutional
config as listed on
-[nf-core/configs](https://github.com/nf-core/configs/blob/master/conf/shh.config),
- using the profile for the 'sdag' cluster. These profiles specify settings
+[nf-core/configs](https://github.com/nf-core/configs/blob/master/conf/shh.config).
+These profiles specify settings
optimised for the specific cluster/institution, such as maximum memory available
or which scheduler queues to submit to. More explanations about configs and
profiles can be seen in the [nf-core
@@ -1090,7 +1090,7 @@ FASTA file and the corresponding indices.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1115,7 +1115,7 @@ directory (which contains 'intermediate' working files and directories).
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1144,7 +1144,7 @@ string to be clipped.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1169,7 +1169,7 @@ with `--dedupper`.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1194,7 +1194,7 @@ and the reference.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1221,7 +1221,7 @@ unmapped reads.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1251,7 +1251,7 @@ fragment. We will therefore use `--bamutils_clip_half_udg_left` and
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1287,7 +1287,7 @@ you can download the file from [here](https://github.com/nf-core/test-datasets/b
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1321,7 +1321,7 @@ is simply named 'X'.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1362,7 +1362,7 @@ providing the name of the mitochondrial DNA contig in our reference genome with
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1404,7 +1404,7 @@ file of these sites that is specified with `--pileupcaller_snpfile`.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/hs37d5.fa' \
@@ -1646,7 +1646,7 @@ running.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_screening20200720' \
<...>
```
@@ -1654,8 +1654,8 @@ nextflow run nf-core/eager \
For the `-profile` parameter, I have indicated that I wish to use Singularity as
my software container environment, and I will use the MPI-SHH institutional
config as listed on
-[nf-core/configs](https://github.com/nf-core/configs/blob/master/conf/shh.config),
-and using the profile for the 'sdag' cluster. These profiles specify settings
+[nf-core/configs](https://github.com/nf-core/configs/blob/master/conf/shh.config).
+These profiles specify settings
optimised for the specific cluster/institution, such as maximum memory available
or which scheduler queues to submit to. More explanations about configs and
profiles can be seen in the [nf-core
@@ -1710,7 +1710,7 @@ FASTA file and the corresponding indices.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_screening20200720' \
--input 'screening20200720.tsv' \
--fasta '../Reference/genome/GRCh38.fa' \
@@ -1735,7 +1735,7 @@ directory (which contains 'intermediate' working files and directories).
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_screening20200720' \
--input 'screening20200720.tsv' \
--fasta '../Reference/genome/GRCh38.fa' \
@@ -1764,7 +1764,7 @@ string to be clipped.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_screening20200720' \
--input 'screening20200720.tsv' \
--fasta '../Reference/genome/GRCh38.fa' \
@@ -1785,7 +1785,7 @@ tell nf-core/eager what to do with the off target reads from the mapping.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_screening20200720' \
--input 'screening20200720.tsv' \
--fasta '../Reference/genome/GRCh38.fa' \
@@ -1815,7 +1815,7 @@ documentation describing each parameters can be seen in the usage
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_screening20200720' \
--input 'screening20200720.tsv' \
--fasta '../Reference/genome/GRCh38.fa' \
@@ -1842,7 +1842,7 @@ have indicators of true aDNA, we will run 'maltExtract' of the
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_screening20200720' \
--input 'screening20200720.tsv' \
--fasta '../Reference/genome/GRCh38.fa' \
@@ -2113,7 +2113,7 @@ running.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
<...>
```
@@ -2121,8 +2121,8 @@ nextflow run nf-core/eager \
For the `-profile` parameter, I have indicated that I wish to use Singularity as
my software container environment, and I will use the MPI-SHH institutional
config as listed on
-[nf-core/configs](https://github.com/nf-core/configs/blob/master/conf/shh.config),
-and using the profile for the 'sdag' cluster. These profiles specify settings
+[nf-core/configs](https://github.com/nf-core/configs/blob/master/conf/shh.config).
+These profiles specify settings
optimised for the specific cluster/institution, such as maximum memory available
or which scheduler queues to submit to. More explanations about configs and
profiles can be seen in the [nf-core
@@ -2174,7 +2174,7 @@ FASTA file and the corresponding indices.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
@@ -2199,7 +2199,7 @@ directory (which contains 'intermediate' working files and directories).
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
@@ -2228,7 +2228,7 @@ the default minimum length of a poly-G string to be clipped.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
@@ -2252,7 +2252,7 @@ will do this with `--bwaalnn` and `--bwaalnl` respectively.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
@@ -2276,7 +2276,7 @@ hard-drive footprint.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
@@ -2306,7 +2306,7 @@ clarity.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
@@ -2337,7 +2337,7 @@ often a custom BED file with just genes of interest is recommended. Furthermore
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
@@ -2375,7 +2375,7 @@ we do BAM trimming instead here as another demonstration of functionality.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
@@ -2416,7 +2416,7 @@ need to specify that we want to use the trimmed bams from the previous step.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
@@ -2459,7 +2459,7 @@ same settings and reference genome. We can do this as follows.
```bash
nextflow run nf-core/eager \
-r 2.2.0 \
--profile singularity,shh,sdag \
+-profile singularity,shh \
-name 'projectX_preprocessing20200727' \
--input 'preprocessing20200727.tsv' \
--fasta '../Reference/genome/Yersinia_pestis_C092_GCF_000009065.1_ASM906v1.fa' \
diff --git a/environment.yml b/environment.yml
index 45b750d2f..f752203a6 100644
--- a/environment.yml
+++ b/environment.yml
@@ -1,6 +1,6 @@
# You can use this file to create a conda environment for this pipeline:
# conda env create -f environment.yml
-name: nf-core-eager-2.3.4
+name: nf-core-eager-2.3.5
channels:
- conda-forge
- bioconda
@@ -49,4 +49,3 @@ dependencies:
- bioconda::eigenstratdatabasetools=1.0.2
- bioconda::mapdamage2=2.2.0
- bioconda::bbmap=38.87
-
diff --git a/lib/NfcoreSchema.groovy b/lib/NfcoreSchema.groovy
index 54935ec81..52ee73043 100644
--- a/lib/NfcoreSchema.groovy
+++ b/lib/NfcoreSchema.groovy
@@ -112,8 +112,14 @@ class NfcoreSchema {
}
// unexpected params
def params_ignore = params.schema_ignore_params.split(',') + 'schema_ignore_params'
- if (!expectedParams.contains(specifiedParam) && !params_ignore.contains(specifiedParam)) {
- unexpectedParams.push(specifiedParam)
+ def expectedParamsLowerCase = expectedParams.collect{ it.replace("-", "").toLowerCase() }
+ def specifiedParamLowerCase = specifiedParam.replace("-", "").toLowerCase()
+ if (!expectedParams.contains(specifiedParam) && !params_ignore.contains(specifiedParam) && !expectedParamsLowerCase.contains(specifiedParamLowerCase)) {
+ // Temporarily remove camelCase/camel-case params #1035
+ def unexpectedParamsLowerCase = unexpectedParams.collect{ it.replace("-", "").toLowerCase()}
+ if (!unexpectedParamsLowerCase.contains(specifiedParamLowerCase)){
+ unexpectedParams.push(specifiedParam)
+ }
}
}
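The Groovy change above temporarily suppresses duplicate warnings for camelCase and hyphenated spellings of the same parameter by normalising names before comparison. A rough Python equivalent of that normalisation (parameter names below are made up for illustration):

```python
def normalise(param):
    # Strip hyphens and lowercase, so 'bwa-aln-n', 'bwaAlnN' and
    # 'bwaalnn' all collapse to the same key (mirrors the Groovy
    # replace("-", "").toLowerCase() in the hunk above).
    return param.replace("-", "").lower()

expected = {"bwaalnn", "input", "fasta"}
expected_norm = {normalise(p) for p in expected}

def is_unexpected(param):
    # Only flag a parameter when no spelling variant is expected.
    return normalise(param) not in expected_norm

print(is_unexpected("bwa-aln-n"))   # False: matches after normalisation
print(is_unexpected("bwaAlnN"))     # False
print(is_unexpected("bogusParam"))  # True
```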
@@ -191,11 +197,11 @@ class NfcoreSchema {
// Remove an element from a JSONArray
private static JSONArray removeElement(jsonArray, element){
- def list = []
+ def list = []
int len = jsonArray.length()
-        for (int i=0;i<len;i++){
diff --git a/main.nf b/main.nf
- AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus} > output/${base}.pe.combined.fq.gz
+ AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz
"""
//PE mode, collapse and trim, outputting all reads, preserving 5p
} else if (seqtype == 'PE' && !params.skip_collapse && !params.skip_trim && !params.mergedonly && params.preserve5p) {
@@ -805,7 +804,7 @@ process adapter_removal {
mv *.settings output/
## Add R_ and L_ for unmerged reads for DeDup compatibility
- AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus} > output/${base}.pe.combined.fq.gz
+ AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz
"""
// PE mode, collapse and trim but only output collapsed reads
} else if ( seqtype == 'PE' && !params.skip_collapse && !params.skip_trim && params.mergedonly && !params.preserve5p ) {
@@ -816,7 +815,7 @@ process adapter_removal {
cat *.collapsed.gz *.collapsed.truncated.gz > output/${base}.pe.combined.tmp.fq.gz
## Add R_ and L_ for unmerged reads for DeDup compatibility
- AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus} > output/${base}.pe.combined.fq.gz
+ AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz
mv *.settings output/
"""
@@ -829,7 +828,7 @@ process adapter_removal {
cat *.collapsed.gz > output/${base}.pe.combined.tmp.fq.gz
## Add R_ and L_ for unmerged reads for DeDup compatibility
- AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus} > output/${base}.pe.combined.fq.gz
+ AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz
mv *.settings output/
"""
@@ -843,7 +842,7 @@ process adapter_removal {
cat *.collapsed.gz *.pair1.truncated.gz *.pair2.truncated.gz > output/${base}.pe.combined.tmp.fq.gz
## Add R_ and L_ for unmerged reads for DeDup compatibility
- AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus} > output/${base}.pe.combined.fq.gz
+ AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz
mv *.settings output/
"""
@@ -857,7 +856,7 @@ process adapter_removal {
cat *.collapsed.gz > output/${base}.pe.combined.tmp.fq.gz
## Add R_ and L_ for unmerged reads for DeDup compatibility
- AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus} > output/${base}.pe.combined.fq.gz
+ AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz
mv *.settings output/
"""
@@ -1160,16 +1159,16 @@ process bwa {
//PE data without merging, PE data without any AR applied
if ( seqtype == 'PE' && ( params.skip_collapse || params.skip_adapterremoval ) ){
"""
- bwa aln -t ${task.cpus} $fasta ${r1} -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -f ${libraryid}.r1.sai
- bwa aln -t ${task.cpus} $fasta ${r2} -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -f ${libraryid}.r2.sai
- bwa sampe -r "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${libraryid}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" $fasta ${libraryid}.r1.sai ${libraryid}.r2.sai ${r1} ${r2} | samtools sort -@ ${task.cpus} -O bam - > ${libraryid}_"${seqtype}".mapped.bam
+ bwa aln -t ${task.cpus} $fasta ${r1} -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -o ${params.bwaalno} -f ${libraryid}.r1.sai
+ bwa aln -t ${task.cpus} $fasta ${r2} -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -o ${params.bwaalno} -f ${libraryid}.r2.sai
+ bwa sampe -r "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${libraryid}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" $fasta ${libraryid}.r1.sai ${libraryid}.r2.sai ${r1} ${r2} | samtools sort -@ ${task.cpus - 1} -O bam - > ${libraryid}_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
} else {
- //PE collapsed, or SE data
+ //PE collapsed, or SE data
"""
- bwa aln -t ${task.cpus} ${fasta} ${r1} -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -f ${libraryid}.sai
- bwa samse -r "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${libraryid}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" $fasta ${libraryid}.sai $r1 | samtools sort -@ ${task.cpus} -O bam - > "${libraryid}"_"${seqtype}".mapped.bam
+ bwa aln -t ${task.cpus} ${fasta} ${r1} -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -o ${params.bwaalno} -f ${libraryid}.sai
+ bwa samse -r "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${libraryid}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" $fasta ${libraryid}.sai $r1 | samtools sort -@ ${task.cpus - 1} -O bam - > "${libraryid}"_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
}
@@ -1194,17 +1193,18 @@ process bwamem {
params.mapper == 'bwamem'
script:
+ def split_cpus = Math.floor(task.cpus/2)
def fasta = "${index}/${fasta_base}"
def size = params.large_ref ? '-c' : ''
if (!params.single_end && params.skip_collapse){
"""
- bwa mem -t ${task.cpus} $fasta $r1 $r2 -R "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${libraryid}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" | samtools sort -@ ${task.cpus} -O bam - > "${libraryid}"_"${seqtype}".mapped.bam
+ bwa mem -t ${split_cpus} $fasta $r1 $r2 -R "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${libraryid}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" | samtools sort -@ ${split_cpus} -O bam - > "${libraryid}"_"${seqtype}".mapped.bam
samtools index ${size} -@ ${task.cpus} "${libraryid}".mapped.bam
"""
} else {
"""
- bwa mem -t ${task.cpus} $fasta $r1 -R "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${libraryid}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" | samtools sort -@ ${task.cpus} -O bam - > "${libraryid}"_"${seqtype}".mapped.bam
+ bwa mem -t ${split_cpus} $fasta $r1 -R "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${libraryid}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" | samtools sort -@ ${split_cpus} -O bam - > "${libraryid}"_"${seqtype}".mapped.bam
samtools index -@ ${task.cpus} "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
}
@@ -1302,6 +1302,7 @@ process bowtie2 {
params.mapper == 'bowtie2'
script:
+ def split_cpus = Math.floor(task.cpus/2)
def size = params.large_ref ? '-c' : ''
def fasta = "${index}/${fasta_base}"
def trim5 = params.bt2_trim5 != 0 ? "--trim5 ${params.bt2_trim5}" : ""
@@ -1345,13 +1346,13 @@ process bowtie2 {
//PE data without merging, PE data without any AR applied
if ( seqtype == 'PE' && ( params.skip_collapse || params.skip_adapterremoval ) ){
"""
- bowtie2 -x ${fasta} -1 ${r1} -2 ${r2} -p ${task.cpus} ${sensitivity} ${bt2n} ${bt2l} ${trim5} ${trim3} --maxins ${params.bt2_maxins} --rg-id ILLUMINA-${libraryid} --rg SM:${libraryid} --rg PL:illumina --rg PU:ILLUMINA-${libraryid}-${seqtype} 2> "${libraryid}"_bt2.log | samtools sort -@ ${task.cpus} -O bam > "${libraryid}"_"${seqtype}".mapped.bam
+ bowtie2 -x ${fasta} -1 ${r1} -2 ${r2} -p ${split_cpus} ${sensitivity} ${bt2n} ${bt2l} ${trim5} ${trim3} --maxins ${params.bt2_maxins} --rg-id ILLUMINA-${libraryid} --rg SM:${libraryid} --rg PL:illumina --rg PU:ILLUMINA-${libraryid}-${seqtype} 2> "${libraryid}"_bt2.log | samtools sort -@ ${split_cpus} -O bam > "${libraryid}"_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
} else {
//PE collapsed, or SE data
"""
- bowtie2 -x ${fasta} -U ${r1} -p ${task.cpus} ${sensitivity} ${bt2n} ${bt2l} ${trim5} ${trim3} --rg-id ILLUMINA-${libraryid} --rg SM:${libraryid} --rg PL:illumina --rg PU:ILLUMINA-${libraryid}-${seqtype} 2> "${libraryid}"_bt2.log | samtools sort -@ ${task.cpus} -O bam > "${libraryid}"_"${seqtype}".mapped.bam
+ bowtie2 -x ${fasta} -U ${r1} -p ${split_cpus} ${sensitivity} ${bt2n} ${bt2l} ${trim5} ${trim3} --rg-id ILLUMINA-${libraryid} --rg SM:${libraryid} --rg PL:illumina --rg PU:ILLUMINA-${libraryid}-${seqtype} 2> "${libraryid}"_bt2.log | samtools sort -@ ${split_cpus} -O bam > "${libraryid}"_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
}
@@ -1538,87 +1539,87 @@ process samtools_filter {
tuple samplename, libraryid, lane, seqtype, organism, strandedness, udg, file("*.unmapped.fastq.gz") optional true into ch_bam_filtering_for_metagenomic,ch_metagenomic_for_skipentropyfilter
tuple samplename, libraryid, lane, seqtype, organism, strandedness, udg, file("*.unmapped.bam") optional true
- // Using shell block rather than script because we are playing with awk
- shell:
- size = !{params.large_ref} ? '-c' : ''
+ script:
+
+ def size = params.large_ref ? '-c' : ''
// Unmapped/MAPQ Filtering WITHOUT min-length filtering
if ( "${params.bam_unmapped_type}" == "keep" && params.bam_filter_minreadlength == 0 ) {
- '''
- samtools view -h -b !{bam} -@ !{task.cpus} -q !{params.bam_mapping_quality_threshold} -o !{libraryid}.filtered.bam
- samtools index !{libraryid}.filtered.bam !{size}
- '''
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
+ samtools index ${libraryid}.filtered.bam ${size}
+ """
} else if ( "${params.bam_unmapped_type}" == "discard" && params.bam_filter_minreadlength == 0 ){
- '''
- samtools view -h -b !{bam} -@ !{task.cpus} -F4 -q !{params.bam_mapping_quality_threshold} -o !{libraryid}.filtered.bam
- samtools index !{libraryid}.filtered.bam !{size}
- '''
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
+ samtools index ${libraryid}.filtered.bam ${size}
+ """
} else if ( "${params.bam_unmapped_type}" == "bam" && params.bam_filter_minreadlength == 0 ){
- '''
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -f4 -o !{libraryid}.unmapped.bam
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -F4 -q !{params.bam_mapping_quality_threshold} -o !{libraryid}.filtered.bam
- samtools index !{libraryid}.filtered.bam !{size}
- '''
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
+ samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
+ samtools index ${libraryid}.filtered.bam ${size}
+ """
} else if ( "${params.bam_unmapped_type}" == "fastq" && params.bam_filter_minreadlength == 0 ){
- '''
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -f4 -o !{libraryid}.unmapped.bam
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -F4 -q !{params.bam_mapping_quality_threshold} -o !{libraryid}.filtered.bam
- samtools index !{libraryid}.filtered.bam !{size}
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
+ samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
+ samtools index ${libraryid}.filtered.bam ${size}
## FASTQ
- samtools fastq -tn !{libraryid}.unmapped.bam | pigz -p !{task.cpus} > !{libraryid}.unmapped.fastq.gz
- rm !{libraryid}.unmapped.bam
- '''
+ samtools fastq -tn ${libraryid}.unmapped.bam | pigz -p ${task.cpus - 1} > ${libraryid}.unmapped.fastq.gz
+ rm ${libraryid}.unmapped.bam
+ """
} else if ( "${params.bam_unmapped_type}" == "both" && params.bam_filter_minreadlength == 0 ){
- '''
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -f4 -o !{libraryid}.unmapped.bam
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -F4 -q !{params.bam_mapping_quality_threshold} -o !{libraryid}.filtered.bam
- samtools index !{libraryid}.filtered.bam !{size}
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
+ samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
+ samtools index ${libraryid}.filtered.bam ${size}
## FASTQ
- samtools fastq -tn !{libraryid}.unmapped.bam | pigz -p !{task.cpus} > !{libraryid}.unmapped.fastq.gz
- '''
+ samtools fastq -tn ${libraryid}.unmapped.bam | pigz -p ${task.cpus - 1} > ${libraryid}.unmapped.fastq.gz
+ """
// Unmapped/MAPQ Filtering WITH min-length filtering
} else if ( "${params.bam_unmapped_type}" == "keep" && params.bam_filter_minreadlength != 0 ) {
- '''
- samtools view -h -b !{bam} -@ !{task.cpus} -q !{params.bam_mapping_quality_threshold} -o tmp_mapped.bam
- filter_bam_fragment_length.py -a -l !{params.bam_filter_minreadlength} -o !{libraryid} tmp_mapped.bam
- samtools index !{libraryid}.filtered.bam !{size}
- '''
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
+ filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
+ samtools index ${libraryid}.filtered.bam ${size}
+ """
} else if ( "${params.bam_unmapped_type}" == "discard" && params.bam_filter_minreadlength != 0 ){
- '''
- samtools view -h -b !{bam} -@ !{task.cpus} -F4 -q !{params.bam_mapping_quality_threshold} -o tmp_mapped.bam
- filter_bam_fragment_length.py -a -l !{params.bam_filter_minreadlength} -o !{libraryid} tmp_mapped.bam
- samtools index !{libraryid}.filtered.bam !{size}
- '''
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
+ filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
+ samtools index ${libraryid}.filtered.bam ${size}
+ """
} else if ( "${params.bam_unmapped_type}" == "bam" && params.bam_filter_minreadlength != 0 ){
- '''
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -f4 -o !{libraryid}.unmapped.bam
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -F4 -q !{params.bam_mapping_quality_threshold} -o tmp_mapped.bam
- filter_bam_fragment_length.py -a -l !{params.bam_filter_minreadlength} -o !{libraryid} tmp_mapped.bam
- samtools index !{libraryid}.filtered.bam !{size}
- '''
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
+ samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
+ filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
+ samtools index ${libraryid}.filtered.bam ${size}
+ """
} else if ( "${params.bam_unmapped_type}" == "fastq" && params.bam_filter_minreadlength != 0 ){
- '''
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -f4 -o !{libraryid}.unmapped.bam
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -F4 -q !{params.bam_mapping_quality_threshold} -o tmp_mapped.bam
- filter_bam_fragment_length.py -a -l !{params.bam_filter_minreadlength} -o !{libraryid} tmp_mapped.bam
- samtools index !{libraryid}.filtered.bam !{size}
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
+ samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
+ filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
+ samtools index ${libraryid}.filtered.bam ${size}
## FASTQ
- samtools fastq -tn !{libraryid}.unmapped.bam | pigz -p !{task.cpus} > !{libraryid}.unmapped.fastq.gz
- rm !{libraryid}.unmapped.bam
- '''
+ samtools fastq -tn ${libraryid}.unmapped.bam | pigz -p ${task.cpus - 1} > ${libraryid}.unmapped.fastq.gz
+ rm ${libraryid}.unmapped.bam
+ """
} else if ( "${params.bam_unmapped_type}" == "both" && params.bam_filter_minreadlength != 0 ){
- '''
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -f4 -o !{libraryid}.unmapped.bam
- samtools view -h !{bam} | samtools view - -@ !{task.cpus} -F4 -q !{params.bam_mapping_quality_threshold} -o tmp_mapped.bam
- filter_bam_fragment_length.py -a -l !{params.bam_filter_minreadlength} -o !{libraryid} tmp_mapped.bam
- samtools index !{libraryid}.filtered.bam !{size}
+ """
+ samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
+ samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
+ filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
+ samtools index ${libraryid}.filtered.bam ${size}
## FASTQ
- samtools fastq -tn !{libraryid}.unmapped.bam | pigz -p !{task.cpus} > !{libraryid}.unmapped.fastq.gz
- '''
+ samtools fastq -tn ${libraryid}.unmapped.bam | pigz -p ${task.cpus - 1} > ${libraryid}.unmapped.fastq.gz
+ """
}
}
@@ -1936,8 +1937,8 @@ process bedtools {
script:
"""
- bedtools coverage -nonamecheck -a ${anno_file} -b $bam | pigz -p ${task.cpus} > "${bam.baseName}".breadth.gz
- bedtools coverage -nonamecheck -a ${anno_file} -b $bam -mean | pigz -p ${task.cpus} > "${bam.baseName}".depth.gz
+ bedtools coverage -nonamecheck -a ${anno_file} -b $bam | pigz -p ${task.cpus - 1} > "${bam.baseName}".breadth.gz
+ bedtools coverage -nonamecheck -a ${anno_file} -b $bam -mean | pigz -p ${task.cpus - 1} > "${bam.baseName}".depth.gz
"""
}
@@ -2006,7 +2007,7 @@ process mapdamage_rescaling {
// Optionally perform further aDNA evaluation or filtering for just reads with damage etc.
process pmdtools {
- label 'mc_small'
+ label 'mc_medium'
tag "${libraryid}"
publishDir "${params.outdir}/pmdtools", mode: params.publish_dir_mode
@@ -2023,8 +2024,8 @@ process pmdtools {
script:
//Check which treatment for the libraries was used
def treatment = udg ? (udg == 'half' ? '--UDGhalf' : '--CpG') : '--UDGminus'
- if(params.snpcapture_bed != ''){
- snpcap = (params.pmdtools_reference_mask != '') ? "--refseq ${params.pmdtools_reference_mask}" : ''
+ if(params.snpcapture_bed){
+ snpcap = (params.pmdtools_reference_mask) ? "--refseq ${params.pmdtools_reference_mask}" : ''
log.info"######No reference mask specified for PMDtools, therefore ignoring that for downstream analysis!"
} else {
snpcap = ''
@@ -2033,14 +2034,13 @@ process pmdtools {
def platypus = params.pmdtools_platypus ? '--platypus' : ''
"""
#Run Filtering step
- samtools calmd -b ${bam} ${fasta} | samtools view -h - | pmdtools --threshold ${params.pmdtools_threshold} ${treatment} ${snpcap} --header | samtools view -@ ${task.cpus} -Sb - > "${libraryid}".pmd.bam
+ samtools calmd ${bam} ${fasta} | pmdtools --threshold ${params.pmdtools_threshold} ${treatment} ${snpcap} --header | samtools view -Sb - > "${libraryid}".pmd.bam
#Run Calc Range step
## To allow early shut off of pipe: https://github.com/nextflow-io/nextflow/issues/1564
trap 'if [[ \$? == 141 ]]; then echo "Shutting samtools early due to -n parameter" && samtools index ${libraryid}.pmd.bam ${size}; exit 0; fi' EXIT
- samtools calmd -b ${bam} ${fasta} | samtools view -h - | pmdtools --deamination ${platypus} --range ${params.pmdtools_range} ${treatment} ${snpcap} -n ${params.pmdtools_max_reads} > "${libraryid}".cpg.range."${params.pmdtools_range}".txt
+ samtools calmd ${bam} ${fasta} | pmdtools --deamination ${platypus} --range ${params.pmdtools_range} ${treatment} ${snpcap} -n ${params.pmdtools_max_reads} > "${libraryid}".cpg.range."${params.pmdtools_range}".txt
- echo "Running indexing"
samtools index ${libraryid}.pmd.bam ${size}
"""
}
@@ -2163,7 +2163,7 @@ process qualimap {
tuple samplename, libraryid, lane, seqtype, organism, strandedness, udg, path("*") into ch_qualimap_results
script:
- def snpcap = params.snpcapture_bed != '' ? "-gff ${params.snpcapture_bed}" : ''
+ def snpcap = params.snpcapture_bed ? "-gff ${params.snpcapture_bed}" : ''
"""
qualimap bamqc -bam $bam -nt ${task.cpus} -outdir . -outformat "HTML" ${snpcap} --java-mem-size=${task.memory.toGiga()}G
"""
@@ -2234,9 +2234,9 @@ process genotyping_ug {
tuple samplename, libraryid, lane, seqtype, organism, strandedness, udg, file("*.realign.{bam,bai}") optional true
script:
- def defaultbasequalities = params.gatk_ug_defaultbasequalities == '' ? '' : " --defaultBaseQualities ${params.gatk_ug_defaultbasequalities}"
+ def defaultbasequalities = !params.gatk_ug_defaultbasequalities ? '' : " --defaultBaseQualities ${params.gatk_ug_defaultbasequalities}"
def keep_realign = params.gatk_ug_keep_realign_bam ? "samtools index ${samplename}.realign.bam" : "rm ${samplename}.realign.{bam,bai}"
- if (params.gatk_dbsnp == '')
+ if (!params.gatk_dbsnp)
"""
samtools index -b ${bam}
gatk3 -T RealignerTargetCreator -R ${fasta} -I ${bam} -nt ${task.cpus} -o ${samplename}.intervals ${defaultbasequalities}
@@ -2247,7 +2247,7 @@ process genotyping_ug {
pigz -p ${task.cpus} ${samplename}.unifiedgenotyper.vcf
"""
- else if (params.gatk_dbsnp != '')
+ else if (params.gatk_dbsnp)
"""
samtools index ${bam}
gatk3 -T RealignerTargetCreator -R ${fasta} -I ${bam} -nt ${task.cpus} -o ${samplename}.intervals ${defaultbasequalities}
@@ -2280,13 +2280,13 @@ process genotyping_hc {
tuple samplename, libraryid, lane, seqtype, organism, strandedness, udg, path("*vcf.gz")
script:
- if (params.gatk_dbsnp == '')
+ if (!params.gatk_dbsnp)
"""
gatk HaplotypeCaller -R ${fasta} -I ${bam} -O ${samplename}.haplotypecaller.vcf -stand-call-conf ${params.gatk_call_conf} --sample-ploidy ${params.gatk_ploidy} --output-mode ${params.gatk_hc_out_mode} --emit-ref-confidence ${params.gatk_hc_emitrefconf}
pigz -p ${task.cpus} ${samplename}.haplotypecaller.vcf
"""
- else if (params.gatk_dbsnp != '')
+ else if (params.gatk_dbsnp)
"""
gatk HaplotypeCaller -R ${fasta} -I ${bam} -O ${samplename}.haplotypecaller.vcf --dbsnp ${params.gatk_dbsnp} -stand-call-conf ${params.gatk_call_conf} --sample_ploidy ${params.gatk_ploidy} --output_mode ${params.gatk_hc_out_mode} --emit-ref-confidence ${params.gatk_hc_emitrefconf}
pigz -p ${task.cpus} ${samplename}.haplotypecaller.vcf
@@ -2470,8 +2470,8 @@ process vcf2genome {
tuple samplename, libraryid, lane, seqtype, organism, strandedness, udg, path("*.fasta.gz")
script:
- def out = "${params.vcf2genome_outfile}" == '' ? "${samplename}.fasta" : "${params.vcf2genome_outfile}"
- def fasta_head = "${params.vcf2genome_header}" == '' ? "${samplename}" : "${params.vcf2genome_header}"
+ def out = !params.vcf2genome_outfile ? "${samplename}.fasta" : "${params.vcf2genome_outfile}"
+ def fasta_head = !params.vcf2genome_header ? "${samplename}" : "${params.vcf2genome_header}"
"""
pigz -f -d -p ${task.cpus} *.vcf.gz
vcf2genome -Xmx${task.memory.toGiga()}g -draft ${out}.fasta -draftname "${fasta_head}" -in ${vcf.baseName} -minc ${params.vcf2genome_minc} -minfreq ${params.vcf2genome_minfreq} -minq ${params.vcf2genome_minq} -ref ${fasta} -refMod ${out}_refmod.fasta -uncertain ${out}_uncertainy.fasta
@@ -2529,6 +2529,7 @@ process multivcfanalyzer {
// Mitochondrial to nuclear ratio helps to evaluate quality of tissue sampled
process mtnucratio {
+ label 'sc_small'
tag "${samplename}"
publishDir "${params.outdir}/mtnucratio", mode: params.publish_dir_mode
@@ -2572,7 +2573,7 @@ process sexdeterrmine_prep {
// As we collect all files for a single sex_deterrmine run, we DO NOT use the normal input/output tuple
process sexdeterrmine {
- label 'sc_small'
+ label 'mc_small'
publishDir "${params.outdir}/sex_determination", mode: params.publish_dir_mode
input:
@@ -2691,7 +2692,7 @@ if (params.metagenomic_tool == 'malt') {
.set {ch_input_for_metagenomic_kraken}
ch_input_for_metagenomic_malt = Channel.empty()
-} else if ( params.metagenomic_tool == '' ) {
+} else if ( !params.metagenomic_tool ) {
ch_input_for_metagenomic_malt = Channel.empty()
ch_input_for_metagenomic_kraken = Channel.empty()
@@ -2811,7 +2812,7 @@ if (params.run_metagenomic_screening && params.database.endsWith(".tar.gz") && p
"""
}
-} else if (! params.database.endsWith(".tar.gz") && params.run_metagenomic_screening && params.metagenomic_tool == 'kraken') {
+} else if (params.database && ! params.database.endsWith(".tar.gz") && params.run_metagenomic_screening && params.metagenomic_tool == 'kraken') {
ch_krakendb = Channel.fromPath(params.database).first()
} else {
ch_krakendb = Channel.empty()
@@ -2908,7 +2909,7 @@ process output_documentation {
*/
process get_software_versions {
- label 'sc_tiny'
+ label 'mc_small'
publishDir "${params.outdir}/pipeline_info", mode: params.publish_dir_mode,
saveAs: { filename ->
if (filename.indexOf(".csv") > 0) filename
@@ -2936,6 +2937,7 @@ process get_software_versions {
qualimap --version &> v_qualimap.txt 2>&1 || true
preseq &> v_preseq.txt 2>&1 || true
gatk --version 2>&1 | head -n 1 > v_gatk.txt 2>&1 || true
+ gatk3 --version > v_gatk3.txt 2>&1 || true
freebayes --version &> v_freebayes.txt 2>&1 || true
bedtools --version &> v_bedtools.txt 2>&1 || true
damageprofiler --version &> v_damageprofiler.txt 2>&1 || true
@@ -2954,7 +2956,7 @@ process get_software_versions {
pileupCaller --version &> v_sequencetools.txt 2>&1 || true
bowtie2 --version | grep -a 'bowtie2-.* -fdebug' > v_bowtie2.txt || true
eigenstrat_snp_coverage --version | cut -d ' ' -f2 >v_eigenstrat_snp_coverage.txt || true
- mapDamage2 --version > v_mapdamage.txt || true
+ mapDamage --version > v_mapdamage.txt || true
bbduk.sh | grep 'Last modified' | cut -d' ' -f 3-99 > v_bbduk.txt || true
scrape_software_versions.py &> software_versions_mqc.yaml
@@ -3128,6 +3130,7 @@ workflow.onComplete {
if (workflow.success) {
log.info "-${c_purple}[nf-core/eager]${c_green} Pipeline completed successfully${c_reset}-"
+ log.info "-${c_purple}[nf-core/eager]${c_green} MultiQC run report can be found in ${params.outdir}/multiqc${c_reset}-"
} else {
checkHostname()
log.info "-${c_purple}[nf-core/eager]${c_red} Pipeline completed with errors${c_reset}-"
@@ -3157,17 +3160,17 @@ def extract_data(tsvFile) {
checkNumberOfItem(row, 11)
- if ( row.Sample_Name.isEmpty() ) exit 1, "[nf-core/eager] error: the Sample_Name column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.Library_ID.isEmpty() ) exit 1, "[nf-core/eager] error: the Library_ID column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.Lane.isEmpty() ) exit 1, "[nf-core/eager] error: the Lane column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.Colour_Chemistry.isEmpty() ) exit 1, "[nf-core/eager] error: the Colour_Chemistry column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.SeqType.isEmpty() ) exit 1, "[nf-core/eager] error: the SeqType column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.Organism.isEmpty() ) exit 1, "[nf-core/eager] error: the Organism column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.Strandedness.isEmpty() ) exit 1, "[nf-core/eager] error: the Strandedness column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.UDG_Treatment.isEmpty() ) exit 1, "[nf-core/eager] error: the UDG_Treatment column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.R1.isEmpty() ) exit 1, "[nf-core/eager] error: the R1 column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.R2.isEmpty() ) exit 1, "[nf-core/eager] error: the R2 column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
- if ( row.BAM.isEmpty() ) exit 1, "[nf-core/eager] error: the BAM column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.Sample_Name == null || row.Sample_Name.isEmpty() ) exit 1, "[nf-core/eager] error: the Sample_Name column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.Library_ID == null || row.Library_ID.isEmpty() ) exit 1, "[nf-core/eager] error: the Library_ID column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.Lane == null || row.Lane.isEmpty() ) exit 1, "[nf-core/eager] error: the Lane column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.Colour_Chemistry == null || row.Colour_Chemistry.isEmpty() ) exit 1, "[nf-core/eager] error: the Colour_Chemistry column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.SeqType == null || row.SeqType.isEmpty() ) exit 1, "[nf-core/eager] error: the SeqType column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.Organism == null || row.Organism.isEmpty() ) exit 1, "[nf-core/eager] error: the Organism column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.Strandedness == null || row.Strandedness.isEmpty() ) exit 1, "[nf-core/eager] error: the Strandedness column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.UDG_Treatment == null || row.UDG_Treatment.isEmpty() ) exit 1, "[nf-core/eager] error: the UDG_Treatment column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.R1 == null || row.R1.isEmpty() ) exit 1, "[nf-core/eager] error: the R1 column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.R2 == null || row.R2.isEmpty() ) exit 1, "[nf-core/eager] error: the R2 column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
+ if ( row.BAM == null || row.BAM.isEmpty() ) exit 1, "[nf-core/eager] error: the BAM column is empty. Ensure all cells are filled or contain 'NA' for optional fields. Check row:\n ${row}"
def samplename = row.Sample_Name
def libraryid = row.Library_ID
@@ -3319,11 +3322,11 @@ def checkHostname() {
params.hostnames.each { prof, hnames ->
hnames.each { hname ->
if (hostname.contains(hname) && !workflow.profile.contains(prof)) {
- log.error '====================================================\n' +
+ log.error "${c_red}====================================================${c_reset}\n" +
" ${c_red}WARNING!${c_reset} You are running with `-profile $workflow.profile`\n" +
" but your machine hostname is ${c_white}'$hostname'${c_reset}\n" +
" ${c_yellow_bold}It's highly recommended that you use `-profile $prof${c_reset}`\n" +
- '============================================================'
+ "${c_red}====================================================${c_reset}\n"
}
}
}
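The `extract_data` hunk above guards each samplesheet column with a null check before calling `isEmpty()`: in Groovy, `row.Sample_Name` is `null` when the TSV header lacks that column entirely, and `null.isEmpty()` throws a NullPointerException instead of printing the intended error. A minimal Python sketch of the same guard, under assumed column names mirroring the pipeline's (`row.get()` returning `None` plays the role of Groovy's `null`):

```python
# Sketch of the null-then-empty guard from extract_data(): a column missing
# from the header yields None, on which a direct .strip()/.isEmpty() call
# would raise -- so check for None first, exactly as the hunk does.

REQUIRED_COLUMNS = ["Sample_Name", "Library_ID", "Lane", "R1"]  # illustrative subset

def validate_row(row: dict) -> list:
    """Return one error message per column that is absent or empty."""
    errors = []
    for col in REQUIRED_COLUMNS:
        value = row.get(col)  # None when the header lacks the column
        if value is None or value.strip() == "":
            errors.append(f"[nf-core/eager] error: the {col} column is empty. Check row: {row}")
    return errors

# A row missing 'Lane' entirely and with an empty R1 cell:
row = {"Sample_Name": "S1", "Library_ID": "L1", "R1": ""}
print(len(validate_row(row)))  # missing Lane + empty R1 -> 2
```

Without the `None` check, `row["Lane"].strip()` would raise a `KeyError`/`AttributeError` here rather than producing the user-facing message, which is the Groovy failure mode the diff fixes.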
diff --git a/nextflow.config b/nextflow.config
index 2533ea38b..db0a875d2 100644
--- a/nextflow.config
+++ b/nextflow.config
@@ -14,12 +14,12 @@ params {
single_end = false
outdir = './results'
publish_dir_mode = 'copy'
- config_profile_name = ''
+ config_profile_name = null
// aws
- awsqueue = ''
+ awsqueue = null
awsregion = 'eu-west-1'
- awscli = ''
+ awscli = null
//Pipeline options
enable_conda = false
@@ -35,15 +35,15 @@ params {
bam = false
// Optional input information
- snpcapture_bed = ''
+ snpcapture_bed = null
run_convertinputbam = false
//Input reference
- fasta = ''
- bwa_index = ''
- bt2_index = ''
- fasta_index = ''
- seq_dict = ''
+ fasta = null
+ bwa_index = null
+ bt2_index = null
+ fasta_index = null
+ seq_dict = null
large_ref = false
save_reference = false
@@ -80,6 +80,7 @@ params {
bwaalnn = 0.04
bwaalnk = 2
bwaalnl = 1024 // From Schubert et al. 2012 (10.1186/1471-2164-13-178)
+ bwaalno = 1 // leave at bwa default for now
circularextension = 500
circulartarget = 'MT'
circularfilter = false
@@ -117,7 +118,7 @@ params {
run_pmdtools = false
pmdtools_range = 10
pmdtools_threshold = 3
- pmdtools_reference_mask = ''
+ pmdtools_reference_mask = null
pmdtools_max_reads = 10000
pmdtools_platypus = false
@@ -128,7 +129,7 @@ params {
//Bedtools settings
run_bedtools_coverage = false
- anno_file = ''
+ anno_file = null
//bamUtils trimbam settings
run_trim_bam = false
@@ -140,26 +141,26 @@ params {
//Genotyping options
run_genotyping = false
- genotyping_tool = ''
+ genotyping_tool = null
genotyping_source = 'raw'
// gatk options
gatk_call_conf = 30
gatk_ploidy = 2
gatk_downsample = 250
- gatk_dbsnp = ''
+ gatk_dbsnp = null
gatk_hc_out_mode = 'EMIT_VARIANTS_ONLY'
gatk_hc_emitrefconf = 'GVCF'
gatk_ug_genotype_model = 'SNP'
gatk_ug_out_mode = 'EMIT_VARIANTS_ONLY'
gatk_ug_keep_realign_bam = false
- gatk_ug_defaultbasequalities = ''
+ gatk_ug_defaultbasequalities = null
// freebayes options
freebayes_C = 1
freebayes_g = 0
freebayes_p = 2
// Sequencetools pileupCaller
- pileupcaller_snpfile = ''
- pileupcaller_bedfile = ''
+ pileupcaller_snpfile = null
+ pileupcaller_bedfile = null
pileupcaller_method = 'randomHaploid'
pileupcaller_transitions_mode = 'AllSites'
// ANGSD Genotype Likelihoods
@@ -183,7 +184,7 @@ params {
min_base_coverage = 5
min_allele_freq_hom = 0.9
min_allele_freq_het = 0.9
- additional_vcf_files = ''
+ additional_vcf_files = null
reference_gff_annotations = 'NA'
reference_gff_exclude = 'NA'
snp_eff_results = 'NA'
@@ -194,7 +195,7 @@ params {
//Sex.DetERRmine settings
run_sexdeterrmine = false
- sexdeterrmine_bedfile = ''
+ sexdeterrmine_bedfile = null
//Nuclear contamination based on chromosome X heterozygosity.
run_nuclear_contamination = false
@@ -206,8 +207,8 @@ params {
metagenomic_complexity_filter = false
metagenomic_complexity_entropy = 0.3
- metagenomic_tool = ''
- database = ''
+ metagenomic_tool = null
+ database = null
metagenomic_min_support_reads = 1
percent_identity = 85
malt_mode = 'BlastN'
@@ -222,8 +223,8 @@ params {
// maltextract - only including number
// parameters if default documented or duplicate of MALT
run_maltextract = false
- maltextract_taxon_list = ''
- maltextract_ncbifiles = ''
+ maltextract_taxon_list = null
+ maltextract_ncbifiles = null
maltextract_filter = 'def_anc'
maltextract_toppercent = 0.01
maltextract_destackingoff = false
@@ -242,12 +243,13 @@ params {
plaintext_email = false
monochrome_logs = false
help = false
- igenomes_base = 's3://ngi-igenomes/igenomes/'
+ igenomes_base = 's3://ngi-igenomes/igenomes'
tracedir = "${params.outdir}/pipeline_info"
igenomes_ignore = true
custom_config_version = 'master'
custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}"
hostnames = false
config_profile_description = false
config_profile_contact = false
config_profile_url = false
@@ -264,7 +266,7 @@ params {
// Container slug. Stable releases should specify release tag!
// Developmental code should specify :dev
-process.container = 'nfcore/eager:2.3.4'
+process.container = 'nfcore/eager:2.3.5'
// Load base.config by default for all pipelines
includeConfig 'conf/base.config'
@@ -289,7 +291,7 @@ profiles {
singularity.enabled = false
podman.enabled = false
shifter.enabled = false
- charliecloud = false
+ charliecloud.enabled = false
process.conda = "$projectDir/environment.yml"
}
debug { process.beforeScript = 'echo $HOSTNAME' }
@@ -318,7 +320,7 @@ profiles {
docker.enabled = false
podman.enabled = true
shifter.enabled = false
- charliecloud = false
+ charliecloud.enabled = false
}
shifter {
singularity.enabled = false
@@ -369,21 +371,22 @@ env {
// Capture exit codes from upstream processes when piping
process.shell = ['/bin/bash', '-euo', 'pipefail']
+def trace_timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')
timeline {
enabled = true
- file = "${params.tracedir}/execution_timeline.html"
+ file = "${params.tracedir}/execution_timeline_${trace_timestamp}.html"
}
report {
enabled = true
- file = "${params.tracedir}/execution_report.html"
+ file = "${params.tracedir}/execution_report_${trace_timestamp}.html"
}
trace {
enabled = true
- file = "${params.tracedir}/execution_trace.txt"
+ file = "${params.tracedir}/execution_trace_${trace_timestamp}.txt"
}
dag {
enabled = true
- file = "${params.tracedir}/pipeline_dag.svg"
+ file = "${params.tracedir}/pipeline_dag_${trace_timestamp}.svg"
}
manifest {
@@ -392,8 +395,8 @@ manifest {
homePage = 'https://github.com/nf-core/eager'
description = 'A fully reproducible and state-of-the-art ancient DNA analysis pipeline'
mainScript = 'main.nf'
- nextflowVersion = '!>=20.07.1'
- version = '2.3.4'
+ nextflowVersion = '>=20.07.1'
+ version = '2.3.5'
}
// Function to ensure that resource requirements don't go beyond
@@ -427,4 +430,4 @@ def check_max(obj, type) {
return obj
}
}
-}
+}
\ No newline at end of file
diff --git a/nextflow_schema.json b/nextflow_schema.json
index 26a2fbf0f..64814061c 100644
--- a/nextflow_schema.json
+++ b/nextflow_schema.json
@@ -102,7 +102,7 @@
"igenomes_base": {
"type": "string",
"description": "Directory / URL base for iGenomes references.",
- "default": "s3://ngi-igenomes/igenomes/",
+ "default": "s3://ngi-igenomes/igenomes",
"fa_icon": "fas fa-cloud-download-alt",
"hidden": true
},
@@ -296,7 +296,7 @@
"description": "Maximum amount of memory that can be requested for any single job.",
"default": "128.GB",
"fa_icon": "fas fa-memory",
- "pattern": "^[\\d\\.]+\\s*.(K|M|G|T)?B$",
+ "pattern": "^\\d+(\\.\\d+)?\\.?\\s*(K|M|G|T)?B$",
"hidden": true,
"help_text": "Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. `--max_memory '8.GB'`"
},
@@ -305,7 +305,7 @@
"description": "Maximum amount of time that can be requested for any single job.",
"default": "240.h",
"fa_icon": "far fa-clock",
- "pattern": "^(\\d+(\\.\\d+)?(?:\\s*|\\.?)(s|m|h|d)\\s*)+$",
+ "pattern": "^(\\d+\\.?\\s*(s|m|h|day)\\s*)+$",
"hidden": true,
"help_text": "Use to set an upper-limit for the time requirement for each process. Should be a string in the format integer-unit e.g. `--max_time '2.h'`"
}
@@ -562,6 +562,13 @@
"fa_icon": "fas fa-ruler-horizontal",
"help_text": "Configures the length of the seed used in `bwa aln -l`. Default is set to be 'turned off' at the recommendation of Schubert et al. ([2012 _BMC Genomics_](https://doi.org/10.1186/1471-2164-13-178)) for ancient DNA with `1024`.\n\nNote: Despite being recommended, turning off seeding can result in long runtimes!\n\n> Modifies BWA aln parameter: `-l`\n"
},
+ "bwaalno": {
+ "type": "integer",
+ "default": 1,
+ "fa_icon": "fas fa-people-arrows",
+ "description": "Specify the -o parameter for BWA aln i.e. the number of gaps allowed.",
+ "help_text": "Configures the number of gaps used in `bwa aln`. Default is set to `bwa` default.\n\n> Modifies BWA aln parameter: `-o`\n"
+ },
"circularextension": {
"type": "integer",
"default": 500,
@@ -609,6 +616,7 @@
},
"bt2n": {
"type": "integer",
+ "default": 0,
"description": "Specify the -N parameter for bowtie2 (mismatches in seed). This will override defaults from alignmode/sensitivity.",
"fa_icon": "fas fa-sort-numeric-down",
"help_text": "The number of mismatches allowed in the seed during seed-and-extend procedure of Bowtie2. This will override any values set with `--bt2_sensitivity`. Can either be 0 or 1. Default: 0 (i.e. use`--bt2_sensitivity` defaults).\n\n> Modifies Bowtie2 parameters: `-N`",
@@ -619,18 +627,21 @@
},
"bt2l": {
"type": "integer",
+ "default": 0,
"description": "Specify the -L parameter for bowtie2 (length of seed substrings). This will override defaults from alignmode/sensitivity.",
"fa_icon": "fas fa-ruler-horizontal",
"help_text": "The length of the seed sub-string to use during seeding. This will override any values set with `--bt2_sensitivity`. Default: 0 (i.e. use`--bt2_sensitivity` defaults: [20 for local and 22 for end-to-end](http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#command-line).\n\n> Modifies Bowtie2 parameters: `-L`"
},
"bt2_trim5": {
"type": "integer",
+ "default": 0,
"description": "Specify number of bases to trim off from 5' (left) end of read before alignment.",
"fa_icon": "fas fa-cut",
"help_text": "Number of bases to trim at the 5' (left) end of read prior alignment. Maybe useful when left-over sequencing artefacts of in-line barcodes present Default: 0\n\n> Modifies Bowtie2 parameters: `-bt2_trim5`"
},
"bt2_trim3": {
"type": "integer",
+ "default": 0,
"description": "Specify number of bases to trim off from 3' (right) end of read before alignment.",
"fa_icon": "fas fa-cut",
"help_text": "Number of bases to trim at the 3' (right) end of read prior alignment. Maybe useful when left-over sequencing artefacts of in-line barcodes present Default: 0.\n\n> Modifies Bowtie2 parameters: `-bt2_trim3`"
@@ -688,12 +699,14 @@
},
"bam_mapping_quality_threshold": {
"type": "integer",
+ "default": 0,
"description": "Minimum mapping quality for reads filter.",
"fa_icon": "fas fa-greater-than-equal",
"help_text": "Specify a mapping quality threshold for mapped reads to be kept for downstream analysis. By default keeps all reads and is therefore set to `0` (basically doesn't filter anything).\n\n> Modifies samtools view parameter: `-q`"
},
"bam_filter_minreadlength": {
"type": "integer",
+ "default": 0,
"fa_icon": "fas fa-ruler-horizontal",
"description": "Specify minimum read length to be kept after mapping.",
"help_text": "Specify minimum length of mapped reads. This filtering will apply at the same time as mapping quality filtering.\n\nIf used _instead_ of minimum length read filtering at AdapterRemoval, this can be useful to get more realistic endogenous DNA percentages, when most of your reads are very short (e.g. in single-stranded libraries) and would otherwise be discarded by AdapterRemoval (thus making an artificially small denominator for a typical endogenous DNA calculation). Note in this context you should not perform mapping quality filtering nor discarding of unmapped reads to ensure a correct denominator of all reads, for the endogenous DNA calculation.\n\n> Modifies filter_bam_fragment_length.py parameter: `-l`"
@@ -1058,6 +1071,7 @@
},
"freebayes_g": {
"type": "integer",
+ "default": 0,
"description": "Specify to skip over regions of high depth by discarding alignments overlapping positions where total read depth is greater than specified in --freebayes_C.",
"fa_icon": "fab fa-think-peaks",
"help_text": "Specify to skip over regions of high depth by discarding alignments overlapping positions where total read depth is greater than specified C. Not set by default.\n\n> Modifies freebayes parameter: `-g`"