Commit 283b7fe

Merge pull request #175 from nf-core/docs-profile-update
Profile description improvement
2 parents 4bf820b + 889f8ef commit 283b7fe

3 files changed

Lines changed: 42 additions & 26 deletions


docs/configuration/adding_your_own.md

Lines changed: 18 additions & 9 deletions
@@ -1,14 +1,22 @@
 # nf-core/eager: Configuration for other clusters
 
-It is entirely possible to run this pipeline on other clusters, though you will need to set up your own config file so that the pipeline knows how to work with your cluster.
+## Introduction
 
-> If you think that there are other people using the pipeline who would benefit from your configuration (eg. other common cluster setups), please let us know. We can add a new configuration and profile which can used by specifying `-profile <name>` when running the pipeline.
+It is entirely possible to run this pipeline on your own cluster, though you will need to set up your own config file so that the pipeline knows how to work with it.
+
+### Personal Profiles
 
 If you are the only person to be running this pipeline, you can create your config file as `~/.nextflow/config` and it will be applied every time you run Nextflow. Alternatively, save the file anywhere and reference it when running the pipeline with `-c path/to/config` (see the [Nextflow documentation](https://www.nextflow.io/docs/latest/config.html) for more).
 
 A basic configuration comes with the pipeline, which runs by default (the `standard` config profile - see [`conf/base.config`](../conf/base.config)). This means that you only need to configure the specifics for your system and overwrite any defaults that you want to change.
 
-## Cluster Environment
+### Institute Profiles
+
+In contrast, if you think that there are other people using the pipeline who would benefit from your configuration (e.g. other common cluster setups), you can create a config adapted to that cluster, which is centrally stored and maintained at [nf-core/configs](https://github.com/nf-core/configs). You can then specify `-profile <institute_name>` when running the pipeline, without making your own custom config file. Furthermore, the same profile can be used for other nf-core pipelines.
+
+## Creating your own profile
+
+### Cluster Environment
 By default, the pipeline uses the `local` Nextflow executor - in other words, all jobs are run in the login session. If you're using a simple server, this may be fine. If you're using a compute cluster, this is bad as all jobs will run on the head node.
 
 To specify your cluster environment, add the following line to your config file:
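The config line itself falls outside this hunk; as a sketch of the kind of block the paragraph describes (the executor and queue names here are placeholder assumptions - only the `clusterOptions = '-A myproject'` line actually appears in the diff below):

```nextflow
// Sketch only: executor and queue are placeholders for your own scheduler
process {
  executor = 'slurm'               // or 'sge', 'pbs', 'lsf', etc., to match your cluster
  queue = 'my_queue'               // hypothetical queue name
  clusterOptions = '-A myproject'  // extra scheduler flags, as shown in the next hunk
}
```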
@@ -27,11 +35,11 @@ process {
   clusterOptions = '-A myproject'
 }
 ```
-## Software Requirements
+### Software Requirements
 To run the pipeline, several software packages are required. How you satisfy these requirements is essentially up to you and depends on your system. If possible, we _highly_ recommend using either Docker or Singularity.
 Please see the [`installation documentation`](../installation.md) for how to run using the below as a one-off. These instructions are about configuring a config file for repeated use.
 
-### Docker
+#### Docker
 Docker is a great way to run nf-core/eager, as it manages all software installations and allows the pipeline to be run in an identical software environment across a range of systems.
 
 Nextflow has [excellent integration](https://www.nextflow.io/docs/latest/docker.html) with Docker, and beyond installing the two tools, not much else is required - at run time, Nextflow will automatically fetch the [nfcore/eager](https://hub.docker.com/r/nfcore/eager/) image that we have created and hosted on dockerhub.
@@ -46,7 +54,7 @@ process.container = "nfcore/eager"
 Note that the dockerhub organisation name annoyingly can't have a hyphen, so is `nfcore` and not `nf-core`.
 
 
-### Singularity image
+#### Singularity image
 Many HPC environments are not able to run Docker due to security issues.
 [Singularity](http://singularity.lbl.gov/) is a tool designed to run on such HPC systems which is very similar to Docker.
 
@@ -75,15 +83,15 @@ process.container = "/path/to/nf-core-eager.simg"
 By default Nextflow will store a singularity image in the working directory of a job. You can alternatively specify a 'central' singularity cache to keep all singularity containers for all users. This can be
 done by either setting a central environment variable `NXF_SINGULARITY_CACHEDIR` or specifying the location in a nextflow config file with `singularity.cacheDir`.
 
-### Conda
+#### Conda
 If you're not able to use Docker or Singularity, you can instead use conda to manage the software requirements.
 To use conda in your own config file, add the following:
 
 ```nextflow
 process.conda = "$baseDir/environment.yml"
 ```
 
-## Software Caches
+### Software Caches
 
 Each new version of a pipeline that is downloaded and run will pull down a new image (Docker/Singularity) or collection (conda) of all the software required for the pipeline. By default this will be placed in the `work/` directory of an EAGER run. When running lots of pipeline jobs, this can slow down the pipeline (having to download and create a new environment each time) and take up a lot of hard-disk space (as each run has its own duplicate of the environment).
 
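A central cache as described above can be pinned in a config file; a sketch with placeholder paths (the diff's own `conda { }` example continues in the next hunk):

```nextflow
// Sketch only: cache paths are placeholders for a shared location on your system
singularity {
  cacheDir = '/path/to/central/singularity_cache'  // shared image store across runs
}
conda {
  cacheDir = '/path/to/central/conda_envs'         // shared conda environments
}
```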
@@ -104,7 +112,8 @@ conda {
 }
 ```
 
-## Job Resources
+### Job Resources
+
 #### Automatic resubmission
 Each step in the pipeline has a default set of requirements for the number of CPUs, memory and time. For most of the steps in the pipeline, if the job exits with an error code of `143` (exceeded requested resources) it will automatically resubmit with higher requests (2 x original, then 3 x original). If it still fails after three times then the pipeline is stopped.
 
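The resubmission behaviour described above is the kind of thing Nextflow expresses with a dynamic `errorStrategy` closure; the following is a sketch of that general pattern, not the pipeline's actual `conf/base.config` (the base resource values are placeholders):

```nextflow
// Sketch of the retry-with-more-resources pattern; base values are placeholders
process {
  errorStrategy = { task.exitStatus == 143 ? 'retry' : 'finish' }
  maxRetries = 2                      // original attempt plus two resubmissions
  memory = { 8.GB * task.attempt }    // 1x, then 2x, then 3x the base request
  time   = { 2.h * task.attempt }
}
```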
docs/installation.md

Lines changed: 10 additions & 9 deletions
@@ -10,8 +10,7 @@ To start using the nf-core/eager pipeline, follow the steps below:
 3. [Pipeline configuration](#3-pipeline-configuration)
     * [Software deps: Docker and Singularity](#31-software-deps-docker-and-singularity)
     * [Software deps: Bioconda](#32-software-deps-bioconda)
-    * [Configuration profiles](#33-configuration-profiles)
-4. [Reference genomes](#4-reference-genomes)
+4. [Terminal configuration](#4-terminal-configuration)
 5. [Appendices](#appendices)
     * [Running on UPPMAX](#running-on-uppmax)
 
@@ -34,10 +33,10 @@ See [nextflow.io](https://www.nextflow.io/) for further instructions on how to i
 
 ## 2) Install the pipeline
 
-#### 2.1) Automatic
+### 2.1) Automatic
 This pipeline itself needs no installation - Nextflow will automatically fetch it from GitHub if `nf-core/eager` is specified as the pipeline name.
 
-#### 2.2) Offline
+### 2.2) Offline
 The above method requires an internet connection so that Nextflow can download the pipeline files. If you're running on a system that has no internet connection, you'll need to download and transfer the pipeline files manually:
 
 ```bash
@@ -54,7 +53,7 @@ To stop nextflow from looking for updates online, you can tell it to run in offl
 export NXF_OFFLINE='TRUE'
 ```
 
-#### 2.3) Development
+### 2.3) Development
 
 If you would like to make changes to the pipeline, it's best to make a fork on GitHub and then clone the files. Once cloned you can run the pipeline directly as above.
 
@@ -81,13 +80,15 @@ The following software is currently required to be installed:
 * [GATK](https://software.broadinstitute.org/gatk/)
 * [bamUtil](https://genome.sph.umich.edu/wiki/BamUtil)
 * [fastP](https://github.com/OpenGene/fastp)
+* [DamageProfiler](https://github.com/Integrative-Transcriptomics/DamageProfiler)
 
-#### 3.1) Software deps: Docker
+
+### 3.1) Software deps: Docker
 First, install Docker on your system: [Docker Installation Instructions](https://docs.docker.com/engine/installation/)
 
 Then, running the pipeline with the option `-profile standard,docker` tells Nextflow to enable Docker for this run. An image containing all of the software requirements will be automatically fetched from dockerhub (https://hub.docker.com/r/nfcore/eager) and used.
 
-#### 3.1) Software deps: Singularity
+### 3.2) Software deps: Singularity
 If you're not able to use Docker then [Singularity](http://sylabs.io) is a great alternative.
 The process is very similar: running the pipeline with the option `-profile standard,singularity` tells Nextflow to enable Singularity for this run. An image containing all of the software requirements will be automatically fetched from Singularity Hub and used.
 
@@ -106,13 +107,13 @@ nextflow run /path/to/nf-core-eager -with-singularity nf-core-eager.simg
 Remember to pull updated versions of the singularity image if you update the pipeline.
 
 
-#### 3.2) Software deps: conda
+### 3.3) Software deps: conda
 If you're not able to use Docker _or_ Singularity, you can instead use conda to manage the software requirements.
 This is slower and less reproducible than the above, but is still better than having to install all requirements yourself!
 The pipeline ships with a conda environment file and nextflow has built-in support for this.
 To use it, first ensure that you have conda installed (we recommend [miniconda](https://conda.io/miniconda.html)), then follow the same pattern as above and use the flag `-profile standard,conda`
 
-#### 4) Profile configuration
+## 4) Terminal configuration
 Nextflow handles job submissions on SLURM or other environments, and supervises running the jobs. Thus the Nextflow process must run until the pipeline is finished. We recommend that you put the process running in the background through `screen` / `tmux` or a similar tool. Alternatively you can run Nextflow within a cluster job submitted to your job scheduler.
 
 It is recommended to limit the Nextflow Java virtual machine's memory. We recommend adding the following line to your environment (typically in `~/.bashrc` or `~/.bash_profile`):

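The line itself is cut off by the hunk boundary, but it appears verbatim in the block removed from `docs/usage.md` elsewhere in this commit; it caps the JVM heap at 4 GB with a 1 GB initial size:

```bash
# Limit the Nextflow JVM heap (values taken from the removed docs/usage.md block)
export NXF_OPTS='-Xms1g -Xmx4g'
```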
docs/usage.md

Lines changed: 14 additions & 8 deletions
@@ -29,16 +29,16 @@ screen -r eager2
 ```
 To end the screen session while in it, type `exit`.
 
-It is recommended to limit the Nextflow Java virtual machines memory. We recommend adding the following line to your environment (typically in `~/.bashrc` or `~./bash_profile`):
 
-```bash
-NXF_OPTS='-Xms1g -Xmx4g'
-```
 ## Help Message
 To access the nextflow help message run: `nextflow run -help`
 
 ## Running the pipeline
+
+> Before you start, you should change into the directory you wish your results to go in. When you start the Nextflow job, it will place all the 'working' folders in the current directory, and NOT necessarily the directory the output files will end up in.
+
 The typical command for running the pipeline is as follows:
+
 ```bash
 nextflow run nf-core/eager --reads '*_R{1,2}.fastq.gz' --fasta 'some.fasta' -profile standard,docker
 ```
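To illustrate the working-directory note in the blockquote above (the results path is a placeholder, and the launch line is left as a comment since it requires Nextflow and the pipeline to be installed):

```bash
# Change into the directory you want results (and Nextflow's work/ folders) in first
results_dir="$(mktemp -d)/eager_results"   # placeholder results location
mkdir -p "$results_dir" && cd "$results_dir"
# nextflow run nf-core/eager --reads '*_R{1,2}.fastq.gz' --fasta 'some.fasta' -profile standard,docker
pwd   # Nextflow would create its work/ directory here
```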
@@ -75,10 +75,14 @@ This version number will be logged in reports when you run the pipeline, so that
 
 ### `-profile`
 
-Use this parameter to choose a configuration profile. Profiles can give configuration presets for different computing environments. Note that multiple profiles can be loaded, for example: `-profile standard,docker` - the order of arguments is important!
+Use this parameter to choose a configuration profile. Profiles can give configuration presets for different computing environments (e.g. schedulers, software environments, memory limits, etc.). Note that multiple profiles can be loaded, for example: `-profile standard,docker` - the order of arguments is important! The first entry takes precedence over the others, e.g. if a setting is set by both the first and second profile, the first entry will be used and the second entry ignored.
+
+> *Important*: If running EAGER2 on a cluster, ask your system administrator what profile to use.
+
+For more details on how to set up your own private profile, please see [installation](../configuration/adding_your_own.md).
 
 **Basic profiles**
-These are basic profiles which primarily define where you derive the pipeline's software packages from. These are typically the profiles you would use if you are running the pipeline on your own PC (vs. a HPC cluster).
+These are basic profiles which primarily define where you derive the pipeline's software packages from. These are typically the profiles you would use if you are running the pipeline on your **own PC** (vs. an HPC cluster - see below).
 
 * `standard`
     * The default profile, used if `-profile` is not specified at all.
@@ -99,9 +103,9 @@ These are basic profiles which primarily define where you derive the pipeline's
     * Includes links to test data so needs no other parameters
 * `none`
     * No configuration at all. Useful if you want to build your own config from scratch and want to avoid loading in the default `base` config profile (not recommended).
-
+
 **Institution Specific Profiles**
-These are profiles specific to certain clusters, and are centrally maintained at [nf-core/configs](`https://github.com/nf-core/configs`). Those listed below are regular users of EAGER2, if you don't see your own institution here check the [nf-core/configs](`https://github.com/nf-core/configs`) repository.
+These are profiles specific to certain **HPC clusters**, and are centrally maintained at [nf-core/configs](https://github.com/nf-core/configs). Those listed below are regular users of EAGER2; if you don't see your own institution here, check the [nf-core/configs](https://github.com/nf-core/configs) repository.
 
 * `uzh`
     * A profile for the University of Zurich Research Cloud
@@ -113,6 +117,8 @@ These are profiles specific to certain clusters, and are centrally maintained a
     * A profile for the SDAG cluster at the Department of Archaeogenetics of the Max-Planck-Institute for the Science of Human History
     * Loads Singularity and defines appropriate resources for running the pipeline
 
+
+
 ### `--reads`
 Use this to specify the location of your input FastQ files. The files may be from either a single sample or multiple samples. For example:
 
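The doc's own example is cut off at the end of this hunk; as an aside, the `{1,2}` pairing glob used in the typical command above can be demonstrated with dummy files (the filenames here are hypothetical):

```bash
# Dummy paired-end files showing what a '*_R{1,2}.fastq.gz' pattern matches
demo_dir="$(mktemp -d)" && cd "$demo_dir"
touch sampleA_R1.fastq.gz sampleA_R2.fastq.gz sampleB_R1.fastq.gz sampleB_R2.fastq.gz
ls *_R{1,2}.fastq.gz   # brace expansion matches all four; Nextflow groups them per sample
```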