*[AWS batch specific parameters](#aws-batch-specific-parameters)
16
-
*[Other command line parameters](#other-command-line-parameters)
17
-
*[Adjustable parameters for nf-core/eager](#adjustable-parameters-for-nf-coreeager)
13
+
18
14
19
15
## General Nextflow info
20
16
Nextflow handles job submissions on SLURM or other environments, and supervises the running jobs. The Nextflow process must therefore keep running until the pipeline is finished. We recommend that you run the process in the background using `screen`, `tmux` or a similar tool. Alternatively, you can run Nextflow within a cluster job submitted to your job scheduler.
21
17
22
-
It is recommended to limit the Nextflow Java virtual machines memory. We recommend adding the following line to your environment (typically in `~/.bashrc` or `~./bash_profile`):
18
+
To create a screen session:
23
19
24
20
```bash
25
-
NXF_OPTS='-Xms1g -Xmx4g'
21
+
screen -R eager2
22
+
```
23
+
To disconnect, press `ctrl+a` then `d`.
24
+
25
+
To reconnect, type:
26
+
27
+
```bash
28
+
screen -r eager2
26
29
```
30
+
To end the screen session while in it, type `exit`.
27
31
32
+
It is recommended to limit the memory of the Nextflow Java virtual machine. We recommend adding the following line to your environment (typically in `~/.bashrc` or `~/.bash_profile`):
28
33
29
-
## Preamble
34
+
```bash
35
+
export NXF_OPTS='-Xms1g -Xmx4g'
36
+
```
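In the setting above, `-Xms1g` sets the initial Java heap size and `-Xmx4g` the maximum. The variable only has an effect if it is exported, because Nextflow runs as a child process of your shell. A minimal sketch to check this, in which `seen_by_child` is just an illustrative variable name:

```bash
# Export the setting so that child processes (such as Nextflow) inherit it.
export NXF_OPTS='-Xms1g -Xmx4g'

# Start a child shell and print what it sees; an empty result would mean
# the variable was set but not exported.
seen_by_child=$(sh -c 'printf "%s" "$NXF_OPTS"')
echo "child process sees: $seen_by_child"
```

If the printed value is empty in your own environment, check that the line in your `~/.bashrc` starts with `export`.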
37
+
## Help Message
30
38
To access the Nextflow help message, run: `nextflow run -help`
31
39
32
40
## Running the pipeline
33
41
The typical command for running the pipeline is as follows:
34
42
```bash
35
43
nextflow run nf-core/eager --reads '*_R{1,2}.fastq.gz' --fasta 'some.fasta' -profile standard,docker
36
44
```
37
-
38
-
> Note, that you might need to use `-profile standard,singularity` if you installed Singularity and don't want to use Docker. Also make sure, that you specify how much memory is available on your machine by using the `--max_cpus`, `--max_memory` options.
45
+
where all the reads are from libraries of the same pairing (either all single-end or all paired-end).
39
46
40
47
This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles.
41
48
@@ -64,45 +71,61 @@ First, go to the [nf-core/eager releases page](https://github.com/nf-core/eager/
64
71
65
72
This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future.
66
73
67
-
68
74
## Mandatory Arguments
69
75
70
76
### `-profile`
71
77
72
-
Use this parameter to choose a configuration profile. Profiles can give configuration presets for different compute environments. Note that multiple profiles can be loaded, for example: `-profile standard,docker` - the order of arguments is important!
78
+
Use this parameter to choose a configuration profile. Profiles can give configuration presets for different computing environments. Note that multiple profiles can be loaded, for example: `-profile standard,docker` - the order of arguments is important!
79
+
80
+
**Basic profiles**
81
+
These basic profiles primarily define where the pipeline's software packages come from. They are typically the profiles you would use when running the pipeline on your own PC (rather than on an HPC cluster).
73
82
74
83
*`standard`
75
84
* The default profile, used if `-profile` is not specified at all.
76
85
* Runs locally and expects all software to be installed and available on the `PATH`.
77
-
*`uzh`
78
-
* A profile for the University of Zurich Research Cloud
79
-
* Loads Singularity and defines appropriate resources for running the pipeline.
80
86
*`docker`
81
87
* A generic configuration profile to be used with [Docker](http://docker.com/)
82
88
* Pulls software from dockerhub: [`nfcore/eager`](http://hub.docker.com/r/nfcore/eager/)
83
89
*`singularity`
84
90
* A generic configuration profile to be used with [Singularity](http://singularity.lbl.gov/)
85
91
* Pulls software from singularity-hub
86
-
*`binac`
87
-
* A profile for the BinAC cluster at the University of Tuebingen
88
-
* Loads Singularity and defines appropriate resources for running the pipeline
89
92
*`conda`
90
93
* A generic configuration profile to be used with [conda](https://conda.io/docs/)
91
94
* Pulls most software from [Bioconda](https://bioconda.github.io/)
92
-
*`awsbatch`
95
+
*`awsbatch`
93
96
* A generic configuration profile to be used with AWS Batch.
94
97
*`test`
95
98
* A profile with a complete configuration for automated testing
96
99
* Includes links to test data so needs no other parameters
97
100
*`none`
98
101
* No configuration at all. Useful if you want to build your own config from scratch and want to avoid loading in the default `base` config profile (not recommended).
102
+
103
+
**Institution Specific Profiles**
104
+
These are profiles specific to certain clusters, and are centrally maintained at [nf-core/configs](https://github.com/nf-core/configs). The institutions listed below are regular users of EAGER2; if you don't see your own institution here, check the [nf-core/configs](https://github.com/nf-core/configs) repository.
105
+
106
+
*`uzh`
107
+
* A profile for the University of Zurich Research Cloud
108
+
* Loads Singularity and defines appropriate resources for running the pipeline.
109
+
*`binac`
110
+
* A profile for the BinAC cluster at the University of Tuebingen
111
+
* Loads Singularity and defines appropriate resources for running the pipeline
112
+
*`shh`
113
+
* A profile for the SDAG cluster at the Department of Archaeogenetics of the Max-Planck-Institute for the Science of Human History
114
+
* Loads Singularity and defines appropriate resources for running the pipeline
99
115
100
116
### `--reads`
101
-
Use this to specify the location of your input FastQ files. For example:
117
+
Use this to specify the location of your input FastQ files. The files may be from either a single sample or multiple samples. For example:
102
118
103
119
```bash
104
120
--reads 'path/to/data/sample_*_{1,2}.fastq'
105
121
```
122
+
for a single sample, or
123
+
124
+
```bash
125
+
--reads 'path/to/data/*/sample_*_{1,2}.fastq'
126
+
```
127
+
128
+
for multiple samples, where each sample's FASTQs are in its own directory (indicated by the first `*`).
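To preview which files a pattern like the multi-sample one would pick up, you can let bash expand an equivalent pattern on a scratch directory. This is only a sketch: the directory layout and file names below are made up for illustration, and bash's expansion is used as a stand-in for Nextflow's own glob matching.

```bash
# Hypothetical layout: one directory per sample, paired files ending in _1 / _2.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/sampleA" "$demo_dir/sampleB"
touch "$demo_dir/sampleA/sample_A_1.fastq" "$demo_dir/sampleA/sample_A_2.fastq" \
      "$demo_dir/sampleB/sample_B_1.fastq" "$demo_dir/sampleB/sample_B_2.fastq"

# The pattern is left unquoted here so that bash expands it for the preview.
# When passing it to --reads, keep the quotes so that Nextflow expands it instead.
matched_files=( "$demo_dir"/*/sample_*_{1,2}.fastq )

# Print only the file names, without the directory part.
printf '%s\n' "${matched_files[@]##*/}"

# Remove the scratch directory again.
rm -rf "$demo_dir"
```

All four files match: the first `*` covers the per-sample directories and `{1,2}` covers both members of each pair.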
106
129
107
130
Please note the following requirements:
108
131
@@ -112,14 +135,23 @@ Please note the following requirements:
112
135
113
136
If left unspecified, a default pattern is used: `data/*{1,2}.fastq.gz`
114
137
138
+
**Note**: It is not possible to run a mixture of single-end and paired-end files in one run.
139
+
115
140
### `--singleEnd`
116
141
If you have single-end data, you need to specify `--singleEnd` on the command line when you launch the pipeline. A normal glob pattern, enclosed in quotation marks, can then be used for `--reads`. For example:
117
142
118
143
```bash
119
-
--singleEnd --reads '*.fastq'
144
+
--singleEnd --reads 'path/to/data/*.fastq'
120
145
```
146
+
for a single sample, or
121
147
122
-
It is not possible to run a mixture of single-end and paired-end files in one run.
148
+
```bash
149
+
--singleEnd --reads 'path/to/data/*/*.fastq'
150
+
```
151
+
152
+
for multiple samples, where each sample's FASTQs are in its own directory (indicated by the first `*`).
153
+
154
+
**Note**: It is not possible to run a mixture of single-end and paired-end files in one run.
123
155
124
156
### `--pairedEnd`
125
157
If you have paired-end data, you need to specify `--pairedEnd` on the command line when you launch the pipeline.
@@ -196,8 +228,20 @@ If you turn this on, the generated indices will be stored in the `./results/refe
196
228
### `--outdir`
197
229
The output directory where the results will be saved.
198
230
231
+
### `--max_memory`
232
+
Use this to set an upper limit on the default memory requirement of each process.
233
+
Should be a string in the format integer-unit, e.g. `--max_memory '8.GB'`. If not specified, the value is taken from the configuration selected with the `-profile` flag.
234
+
235
+
### `--max_time`
236
+
Use this to set an upper limit on the default time requirement of each process.
237
+
Should be a string in the format integer-unit, e.g. `--max_time '2.h'`. If not specified, the value is taken from the configuration selected with the `-profile` flag.
238
+
239
+
### `--max_cpus`
240
+
Use this to set an upper limit on the default CPU requirement of each process.
241
+
Should be an integer, e.g. `--max_cpus 1`. If not specified, the value is taken from the configuration selected with the `-profile` flag.
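The three `--max_*` options act as ceilings on what each process asks for. The sketch below illustrates that capping idea with plain shell arithmetic. `cap_request` is a hypothetical helper written for this example and is not part of nf-core/eager or Nextflow; all numbers are made up.

```bash
# cap_request REQUESTED LIMIT: print the smaller of the two integers.
cap_request() {
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

max_cpus=4        # as if the pipeline was run with --max_cpus 4
max_memory_gb=8   # as if the pipeline was run with --max_memory '8.GB'

# A process asking for 8 CPUs and 16 GB is trimmed down to the limits ...
capped_cpus=$(cap_request 8 "$max_cpus")
capped_mem=$(cap_request 16 "$max_memory_gb")

# ... while a process asking for only 2 CPUs is left as it is.
small_cpus=$(cap_request 2 "$max_cpus")

echo "cpus=$capped_cpus memory=${capped_mem}GB small_cpus=$small_cpus"
```

Requests below the limit pass through unchanged, so setting generous limits does not make small processes ask for more.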
242
+
199
243
### `--email`
200
-
Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (`~/.nextflow/config`) then you don't need to speicfy this on the command line for every run.
244
+
Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (`~/.nextflow/config`) then you don't need to specify this on the command line for every run.
201
245
202
246
### `-name`
203
247
Name for the pipeline run. If not specified, Nextflow will automatically generate a random mnemonic.
@@ -214,7 +258,7 @@ You can also supply a run name to resume a specific run: `-resume [run-name]`. U
214
258
**NB:** Single hyphen (core Nextflow option)
215
259
216
260
### `-c`
217
-
Specify the path to a specific config file (this is a core NextFlow command).
261
+
Specify the path to a specific Nextflow config file (this is a core Nextflow command).
218
262
219
263
**NB:** Single hyphen (core Nextflow option)
220
264
@@ -223,47 +267,15 @@ Note - you can use this to override defaults. For example, you can specify a con
223
267
```nextflow
224
268
process.$multiqc.module = []
225
269
```
226
-
227
-
### `--max_memory`
228
-
Use to set a top-limit for the default memory requirement for each process.
229
-
Should be a string in the format integer-unit. eg. `--max_memory '8.GB'`
230
-
231
-
### `--max_time`
232
-
Use to set a top-limit for the default time requirement for each process.
233
-
Should be a string in the format integer-unit. eg. `--max_time '2.h'`
234
-
235
-
### `--max_cpus`
236
-
Use to set a top-limit for the default CPU requirement for each process.
237
-
Should be a string in the format integer-unit. eg. `--max_cpus 1`
238
-
239
270
### `--plaintext_email`
240
271
Set to receive plain-text e-mails instead of HTML formatted.
241
272
242
273
### `--multiqc_config`
243
-
Specify a path to a custom MultiQC configuration file.
244
-
274
+
Specify a path to a custom MultiQC configuration file. MultiQC produces final pipeline reports.
245
275
246
276
# Adjustable parameters for nf-core/eager
247
277
248
-
This part of the readme contains a list of user-adjustable parameters in nf-core/eager. You can specify any of these parameters on the command line when calling the pipeline by simply prefixing the respective parameter with a double dash `--`.
249
-
250
-
Example:
251
-
```
252
-
nextflow run nf-core/eager -r 2.0 -profile standard,docker --singleEnd [...]
253
-
```
254
-
This would run the pipeline in single end mode, thus assuming that all entered `FastQ` files are sequenced following a single end sequencing protocol.
255
-
256
-
## General Pipeline Parameters
257
-
258
-
These parameters are required in some cases, e.g. when performing in-solution SNP capture protocols (390K,1240K, ...) for population genetics for example. Make sure to specify the required parameters in such cases.
259
-
260
-
### `--snpcapture` false
261
-
262
-
This is by default set to `false`, but can be turned on to calculate on target metrics automatically for you. Note, that this requires setting `--bedfile` with the target SNPs simultaneously.
263
-
264
-
### `--bedfile`
265
-
266
-
Can be used to set a path to a BED file (3/6 column format) to calculate capture target efficiency on the fly. Will not be used without `--bedfile` set as parameter.
278
+
This part of the documentation contains a list of user-adjustable parameters in nf-core/eager. You can specify any of these parameters on the command line when calling the pipeline by simply prefixing the respective parameter with a double dash `--`.
267
279
268
280
## Step skipping parameters
269
281
@@ -285,15 +297,15 @@ Turns off QualiMap and thus does not compute coverage and other mapping metrics.
285
297
286
298
Turns off duplicate removal methods DeDup and MarkDuplicates respectively. No duplicates will be removed on any data in the pipeline.
287
299
288
-
### `--complexity_filter`
300
+
## Complexity Filtering Options
289
301
290
-
Performs a poly-G complexity filtering step in the beginning of the pipeline if turne on. This can be useful for especially assembly projects where low-complexity regions might dramatically influence the assembly of contigs.
302
+
### `--complexity_filter`
291
303
292
-
## Complexity Filtering Options
304
+
Performs a poly-G tail removal step at the beginning of the pipeline, if turned on. This can be useful for trimming poly-G tails from short fragments sequenced on two-colour Illumina chemistry such as NextSeqs (where no fluorescence is read as a G), which can inflate reported GC content values.
293
305
294
306
### `--complexity_filter_poly_g_min`
295
307
296
-
This option can be used to define the minimum value for the poly-G filtering step in low complexity filtering. By default, this is set to a value of `10` unless the user has chosen something specifically using this option.
308
+
This option can be used to define the minimum length of a poly-G tail at which low-complexity trimming begins. By default, this is set to `10`.
297
309
298
310
## Adapter Clipping and Merging Options
299
311
@@ -404,7 +416,6 @@ Specifies the length of the read start and end to be considered for profile gene
404
416
405
417
Specifies to run PMDTools for damage based read filtering and assessment of DNA damage in sequencing libraries. By default turned off.
406
418
407
-
408
419
### `--udg` false
409
420
410
421
Defines whether Uracil-DNA glycosylase (UDG) treatment was used to repair DNA damage on the sequencing libraries. If set, the parameter is used by downstream tools such as PMDTools to estimate damage only on CpG sites that are left after such a treatment.
@@ -444,3 +455,18 @@ Default set to `1` and clipps off one base of the left or right side of reads. N
444
455
### `--bamutils_softclip`
445
456
446
457
By default, nf-core/eager uses hard clipping and sets clipped bases to `N` with quality `!` in the BAM output. Turn this on to use soft-clipping instead, masking reads at the read ends respectively using the CIGAR string.
458
+
459
+
## Library-Type Parameters
460
+
461
+
These parameters are required in some cases, e.g. when performing in-solution SNP capture protocols (390K, 1240K, ...) for population genetics. Make sure to specify the required parameters in such cases.
462
+
463
+
### `--snpcapture` false
464
+
465
+
This is set to `false` by default, but can be turned on to calculate on-target metrics automatically for you. Note that this requires setting `--bedfile` with the target SNPs at the same time.
466
+
467
+
### `--bedfile`
468
+
469
+
Can be used to set a path to a BED file (3/6 column format) to calculate capture target efficiency on the fly. Will not be used without `--snpcapture` set as parameter.
470
+
471
+
## Automatic Resubmission
472
+
By default, if a pipeline step fails, EAGER2 will resubmit the job with twice the amount of CPU and memory. This will occur two times before failing.
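As a rough illustration of how requests could grow across resubmissions, the sketch below prints the resources for the first attempt and the two resubmissions. The starting values are made up, and it assumes that each resubmission doubles the previous request; the text above does not say whether the doubling compounds, so treat the numbers as an example only.

```bash
# Hypothetical starting request for one pipeline step.
cpus=2
memory_gb=4
resubmissions=2   # the text above says a failed step is resubmitted twice

attempt=0
schedule=""
while [ "$attempt" -le "$resubmissions" ]; do
    # Record the request used for this attempt.
    schedule="${schedule}attempt $((attempt + 1)): ${cpus} CPUs, ${memory_gb} GB
"
    # Assumption: each resubmission doubles the previous request.
    cpus=$((cpus * 2))
    memory_gb=$((memory_gb * 2))
    attempt=$((attempt + 1))
done

printf '%s' "$schedule"
```

Under these assumptions, a step that starts at 2 CPUs and 4 GB would make its final attempt with 8 CPUs and 16 GB, which is worth keeping in mind when choosing `--max_cpus` and `--max_memory`.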