Merged
Changes from 9 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
Expand Up @@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- Removed leftover old DockerHub push CI commands.
- [#627](https://github.com/nf-core/eager/issues/627) Added de Barros Damgaard citation to README
- [#630](https://github.com/nf-core/eager/pull/630) Better handling of Qualimap memory requirements and error strategy.
- [#638](https://github.com/nf-core/eager/issues/638#issuecomment-748877567) Fixed inverted circularfilter filtering (previously filtering would happen by default, not when requested by user as originally recorded in documentation)

### `Dependencies`

Expand Down
228 changes: 0 additions & 228 deletions docs/usage.md
Expand Up @@ -391,203 +391,6 @@ hard drive footprint of the run, so be sure to do this!

## Troubleshooting and FAQs

### My pipeline update doesn't seem to do anything

To download a new version of a pipeline, you can use the following, replacing
`<VERSION>` with the corresponding version.

```bash
nextflow pull nf-core/eager -r <VERSION>
```

However, in very rare cases, minor fixes to a version will be pushed out without
a version number bump. This can slightly confuse Nextflow, as it thinks you
already have the 'broken' version from your original pipeline download.

If you don't see any changes from the fixed version when running the pipeline,
you can try removing your Nextflow nf-core/eager cache, typically stored in
your home directory, with

```bash
rm -r ~/.nextflow/assets/nf-core/eager
```

Then re-pull the pipeline with the command above. This will install a fresh
copy of the version that includes the fixes.

### Input files not found

When using the [direct input](#direct-input-method) method: if no file, only one
input file, or only 'read one' and not 'read two' is picked up then something is
likely wrong with your input file declaration ([`--input`](#--input)):

1. The path must be enclosed in quotes (`'` or `"`)
2. The path must have at least one `*` wildcard character. This applies even if
   you are only running one paired-end sample.
3. When using the pipeline with paired end data, the path must use `{1,2}` or
`{R1,R2}` notation to specify read pairs.
4. If you are running single-end data make sure to specify `--single_end`

**Important**: The pipeline can't take a list of multiple input files when using
the direct input method - it takes a 'glob' expression. If your input files are
scattered in different paths then we recommend that you generate a directory
with symlinked files. If running in paired-end mode please make sure that your
files are sensibly named so that they can be properly paired. See the previous
point.
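As a sketch of the symlink approach (all paths and sample names below are
hypothetical examples, not part of the pipeline itself):

```bash
# Gather scattered FASTQ files into one directory of symlinks so that a
# single glob can find them. Adapt the source paths to your own data.
mkdir -p eager_input
ln -sf /data/runA/sample1_R1.fastq.gz eager_input/sample1_R1.fastq.gz
ln -sf /data/runA/sample1_R2.fastq.gz eager_input/sample1_R2.fastq.gz
ln -sf /data/runB/sample2_R1.fastq.gz eager_input/sample2_R1.fastq.gz
ln -sf /data/runB/sample2_R2.fastq.gz eager_input/sample2_R2.fastq.gz

# The run command would then use one quoted glob, e.g.:
# nextflow run nf-core/eager --input "eager_input/*_{R1,R2}.fastq.gz"
```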

If the pipeline can't find your files then you will get the following error

```bash
ERROR ~ Cannot find any reads matching: *{1,2}.fastq.gz
```

If your sample name is "messy" then you have to be very particular with your
glob specification. A file name like `L1-1-D-2h_S1_L002_R1_001.fastq.gz` can be
difficult enough for a human to read. Specifying `*{1,2}*.gz` won't give
you what you want, whilst `*{R1,R2}*.gz` (i.e. with the addition of the `R`s) will.
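Nextflow's file globbing is not plain shell globbing, but bash brace expansion
illustrates the same ambiguity. A sketch using empty stand-in files:

```bash
# Create empty stand-in files mimicking the "messy" names above.
touch L1-1-D-2h_S1_L002_R1_001.fastq.gz L1-1-D-2h_S1_L002_R2_001.fastq.gz

# Ambiguous: each filename contains both a '1' and a '2' (in 'L1', '2h',
# 'L002', '001'), so both halves of the brace expression match both files.
ls *{1,2}*.gz

# Unambiguous: only the 'R1'/'R2' tags distinguish the read pairs.
ls *{R1,R2}*.gz
```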

If using the [TSV input](#tsv-input-method) method, this likely means there is a
mistake or typo in the path in a given column. Often this is a trailing space at
the end of the path.

### I am only getting output for a single sample although I specified multiple with wildcards

You must specify paths to files in quotes, otherwise your shell, rather than
Nextflow, will expand any wildcards (`*`).

For example

```bash
nextflow run nf-core/eager --input /path/to/sample_*/*.fq.gz
```

Would be evaluated by your shell as

```bash
nextflow run nf-core/eager --input /path/to/sample_1/sample_1.fq.gz /path/to/sample_2/sample_2.fq.gz /path/to/sample_3/sample_3.fq.gz
```

And Nextflow will only take the first path after `--input`, ignoring the others.

On the other hand, encapsulating the path in quotes lets Nextflow itself
evaluate the wildcards.

```bash
nextflow run nf-core/eager --input "/path/to/sample_*/*.fq.gz"
```

### The pipeline crashes almost immediately with an early pipeline step

Sometimes a newly downloaded and set up nf-core/eager pipeline will encounter an
issue where a run almost immediately crashes (e.g. at `fastqc`,
`output_documentation` etc.) saying the tool could not be found or similar.

#### I am running Docker

You may have an outdated container. This happens more often when running on the
`dev` branch of nf-core/eager, because Docker will _not_ update the container on
each new commit, so it may be missing new tools called by the updated pipeline code.

To fix, just re-pull the nf-core/eager Docker container manually with:

```bash
docker pull nfcore/eager:dev
```

#### I am running Singularity

If you're running Singularity, it could be that Nextflow cannot access your
Singularity image properly - often due to missing bind paths.

See
[here](https://nf-co.re/usage/troubleshooting#cannot-find-input-files-when-using-singularity)
for more information.

### The pipeline has crashed with an error but Nextflow is still running

If this happens, you can either wait for all other already-running jobs to
finish safely, or, if Nextflow _still_ does not stop, press `ctrl + c` on your
keyboard (or equivalent) to stop the Nextflow run.

> :warning: if you do this, and do not plan to fix the run, make sure to delete
> the output folder. Otherwise you may end up with a lot of large intermediate
> files left behind! You can clean a Nextflow run of all intermediate files with
> `nextflow clean -f -k` or by deleting the `work/` directory.

### I get an exceeded job memory limit error

While Nextflow tries to make your life easier by automatically retrying jobs
that run out of memory with more resources (up to your specified max limit),
sometimes your data may be so large that you run out of memory even after the
default 3 retries.

To fix this you need to change the default memory requirements for the process
that is breaking. We can do this by making a custom profile, which we then
provide to the Nextflow run command.

For example, let's say it's the `markduplicates` process that is running out of
memory.

First we need to check what the default memory value is. We can do this
by going to the main [nf-core/eager code](https://github.com/nf-core/) and
opening the `main.nf` file. We can then use the browser's find functionality
to search for: `process markduplicates`.

Once found, we then need to check the line called `label`. In this case the
label is `mc_small` (for multi-core small).

Next we need to go back to the main GitHub repository and open
`conf/base.config`. Again using the find functionality, we search for:
`withLabel:'mc_small'`.

We see that the `memory` is set to `4.GB`
(`memory = { check_max( 4.GB * task.attempt, 'memory' ) }`).

Now back on your computer, we need to make a new file called
`custom_resources.conf`. You should save it somewhere centrally so you can
reuse it.

> If you think this would be useful for multiple people in your lab/institute,
> we highly recommend you make an institutional profile at
> [nf-core/configs](https://github.com/nf-core/configs). This will simplify this
> process in the future.

Within this file, you will need to add the following:

```txt
profiles {
  big_data {
    process {
      withName: markduplicates {
        memory = 16.GB
      }
    }
  }
}
```

Where we have increased the default `4.GB` to `16.GB`, replacing the
`check_max()` wrapper with a fixed value.

> Note that with this you will _not_ have the automatic retry mechanism. If
> you want this, re-add the `check_max()` function on the `memory` line, and
> add to the bottom of the entire file (outside the profiles block), the
> block starting `def check_max(obj, type) {`, which is at the end of the
> [nextflow.config file](https://github.com/nf-core/eager/blob/master/nextflow.config)
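If you do want to keep the automatic retry mechanism, the profile might look
something like the following sketch (the `check_max` definition itself must be
copied from the pipeline's `nextflow.config` as described above; it is elided
here):

```txt
profiles {
  big_data {
    process {
      withName: markduplicates {
        // Scales with each retry attempt, capped by check_max.
        memory = { check_max( 16.GB * task.attempt, 'memory' ) }
      }
    }
  }
}

// Paste the `def check_max(obj, type) { ... }` block from the pipeline's
// nextflow.config here, at the bottom of the file, outside the profiles block.
```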

Once saved, we can then modify your original Nextflow run command:

```bash
nextflow run nf-core/eager -r 2.2.0 -c /<path>/<to>/custom_resources.conf -profile big_data,<original>,<profiles> <...>
```

Where we have added `-c` to specify which file to use for the custom profiles,
and then added the `big_data` profile to the original profiles you were using.

:warning: it's important that `big_data` comes first, to ensure it overwrites any
parameters set in the subsequent profiles!

### I get a file name collision error during merging

When using TSV input, nf-core/eager will attempt to merge all `Lanes` of a
Expand All @@ -608,37 +411,6 @@ they are unique (e.g. if one library was sequenced on Lane 8 of two HiSeq runs,
specify lanes as 8 and 16 for each FASTQ file respectively). For library merging
errors, you must modify your `Library_ID`s accordingly, to make them unique.

### I specified a module and it didn't produce the expected output

Possible options:

1. Check if you have a typo in the parameter name. Nextflow _does not_
   check for this
2. Check that any required upstream module was turned on (if a module requires
   the output of a previous module, it will not be activated unless it receives
   that output)

### I get an unable to acquire lock error

Errors like the following

```bash
Unable to acquire lock on session with ID 84333844-66e3-4846-a664-b446d070f775
```

normally suggest that a previous Nextflow run (in the same folder) was not
cleanly killed by a user (e.g. by using `ctrl + z` to hard-kill a crashed run).

To fix this, you must clean the entirety of the output directory (including
output files), e.g. with `rm -r <output_dir>/* <output_dir>/.*`, and re-run
from scratch.

`ctrl + z` is **not** a recommended way of killing a Nextflow job. Runs that take
a long time to fail are often still running because other job submissions are
still ongoing. Nextflow will normally wait for those processes to complete
before cleanly shutting down the run (to allow re-running with
`-resume`). `ctrl + c` is much safer, as it tells Nextflow to stop earlier
but cleanly.

## Tutorials

### Tutorial - How to investigate a failed run
Expand Down
5 changes: 3 additions & 2 deletions main.nf
Expand Up @@ -87,7 +87,7 @@ def helpMessage() {
--bwaalnl [num] Specify the -l parameter for BWA aln, i.e. length of seeds to be used. Set to 1024 for whole read. Default: ${params.bwaalnl}
--circularextension [num] Specify the number of bases to extend reference by (circularmapper only). Default: ${params.circularextension}
--circulartarget [chr] Specify the FASTA header of the target chromosome to extend(circularmapper only). Default: '${params.circulartarget}'
--circularfilter [bool] Turn on to filter off-target reads (circularmapper only).
--circularfilter [bool] Turn on to remove reads that did not map to the circularised genome (circularmapper only).
--bt2_alignmode [str] Specify the bowtie2 alignment mode. Options: 'local', 'end-to-end'. Default: '${params.bt2_alignmode}'
--bt2_sensitivity [str] Specify the level of sensitivity for the bowtie2 alignment mode. Options: 'no-preset', 'very-fast', 'fast', 'sensitive', 'very-sensitive'. Default: '${params.bt2_sensitivity}'
--bt2n [num] Specify the -N parameter for bowtie2 (mismatches in seed). This will override defaults from alignmode/sensitivity. Default: ${params.bt2n}
Expand Down Expand Up @@ -1541,6 +1541,7 @@ process circulargenerator{
else null
}


input:
file fasta from ch_fasta_for_circulargenerator

Expand Down Expand Up @@ -1578,7 +1579,7 @@ process circularmapper{
params.mapper == 'circularmapper'

script:
def filter = params.circularfilter ? '' : '-f true -x false'
def filter = params.circularfilter ? '-f true -x false' : ''
def elongated_root = "${fasta.baseName}_${params.circularextension}.fasta"
def size = params.large_ref ? '-c' : ''

Expand Down
4 changes: 2 additions & 2 deletions nextflow_schema.json
Expand Up @@ -555,7 +555,7 @@
},
"circularfilter": {
"type": "boolean",
"description": "Turn on to filter off-target reads (circularmapper only).",
"description": "Turn on to remove reads that did not map to the circularised genome (circularmapper only).",
"fa_icon": "fas fa-filter",
"help_text": "If you want to filter out reads that don't map to a circular chromosome, turn this on. By default this option is turned off.\n"
},
Expand Down Expand Up @@ -1575,4 +1575,4 @@
"$ref": "#/definitions/metagenomic_authentication"
}
]
}
}