
Commit 88f47fe

Authored by kisarur
Update NCI Gadi config and documentation (#1096)
Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>
Co-authored-by: georgiesamaha <georgiesamaha@gmail.com>
Co-authored-by: Georgie Samaha <73086054+georgiesamaha@users.noreply.github.com>
Co-authored-by: nf-core-bot <core@nf-co.re>
Parent commit: 5f5f48a

3 files changed: 57 additions, 46 deletions

.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions
```diff
@@ -7,6 +7,7 @@
 **/unsw_katana** @jscgh
 **/seadragon** @jiawku
 **/fred_hutch** @derrik-gratz
+**/nci_gadi** @georgiesamaha @kisarur @mattdton
 **/roslin** @sguizard @donalddunbar
 **/lrz_cm4** @nschan
 **/crg** @joseespinosa
```

conf/nci_gadi.config

Lines changed: 8 additions & 18 deletions
```diff
@@ -1,43 +1,33 @@
 // NCI Gadi nf-core configuration profile
 params {
     config_profile_description = 'NCI Gadi HPC profile provided by nf-core/configs'
-    config_profile_contact = 'Georgie Samaha (@georgiesamaha), Matthew Downton (@mattdton)'
+    config_profile_contact = 'Georgie Samaha (@georgiesamaha), Kisaru Liyanage (@kisarur), Matthew Downton (@mattdton)'
     config_profile_url = 'https://opus.nci.org.au/display/Help/Gadi+User+Guide'
-    project = System.getenv("PROJECT")
+    nci_gadi_project = System.getenv("PROJECT")
+    nci_gadi_storage = "gdata/${params.nci_gadi_project}+scratch/${params.nci_gadi_project}"
 }

+validation.ignoreParams = ["nci_gadi_project", "nci_gadi_storage"]
+
 // Enable use of Singularity to run containers
 singularity {
     enabled = true
     autoMounts = true
+    cacheDir = "/scratch/${params.nci_gadi_project}/${System.getenv('USER')}/nxf_singularity_cache"
 }

 // Submit up to 300 concurrent jobs (Gadi exec max)
-// pollInterval and queueStatInterval of every 5 minutes
-// submitRateLimit of 20 per minute
 executor {
     queueSize = 300
-    pollInterval = '5 min'
-    queueStatInterval = '5 min'
-    submitRateLimit = '20 min'
 }

 // Define process resource limits
 process {
     executor = 'pbspro'
-    storage = "scratch/${params.project}"
+    project = "${params.nci_gadi_project}" // The version of Nextflow installed on Gadi has been modified to allow usage of this non standard directive
+    storage = "${params.nci_gadi_storage}" // The version of Nextflow installed on Gadi has been modified to allow usage of this non standard directive
     module = 'singularity'
     cache = 'lenient'
     stageInMode = 'symlink'
     queue = { task.memory < 128.GB ? 'normalbw' : (task.memory >= 128.GB && task.memory <= 190.GB ? 'normal' : (task.memory > 190.GB && task.memory <= 1020.GB ? 'hugemembw' : '')) }
-    beforeScript = 'module load singularity'
-}
-
-// Write custom trace file with outputs required for SU calculation
-def trace_timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss')
-trace {
-    enabled = true
-    overwrite = false
-    file = "./gadi-nf-core-trace-${trace_timestamp}.txt"
-    fields = 'name,status,exit,duration,realtime,cpus,%cpu,memory,%mem,rss'
 }
```
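The `queue` directive in `nci_gadi.config` is a closure that routes each task to a Gadi queue based on its requested memory. As a quick illustration of those thresholds (a Python sketch of the same ternary chain; the function name is hypothetical and this is not part of the profile):

```python
def gadi_queue(mem_gb: float) -> str:
    """Mirror the queue-selection closure in nci_gadi.config.

    Thresholds (in GB) are taken from the config's ternary chain;
    this helper is only an illustration, not part of the profile.
    """
    if mem_gb < 128:
        return "normalbw"   # task.memory < 128.GB
    if mem_gb <= 190:
        return "normal"     # 128.GB <= task.memory <= 190.GB
    if mem_gb <= 1020:
        return "hugemembw"  # 190.GB < task.memory <= 1020.GB
    return ""               # above the hugemembw limit: no queue assigned

# Boundary checks against the closure's conditions
assert gadi_queue(64) == "normalbw"
assert gadi_queue(128) == "normal"
assert gadi_queue(191) == "hugemembw"
```

Note that a task requesting more than 1020 GB falls through to an empty queue string, so such jobs would need a queue assigned manually in a local copy of the config.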

docs/nci_gadi.md

Lines changed: 48 additions & 28 deletions
````diff
@@ -2,71 +2,91 @@

 nf-core pipelines have been successfully configured for use on the [Gadi HPC](https://opus.nci.org.au/display/Help/Gadi+User+Guide) at the National Computational Infrastructure (NCI), Canberra, Australia.

-To run an nf-core pipeline at NCI Gadi, run the pipeline with `-profile singularity,nci_gadi`. This will download and launch the [`nci_gadi.config`](../conf/nci_gadi.config) which has been pre-configured with a setup suitable for the NCI Gadi HPC cluster. Using this profile, a docker image containing all of the required software will be downloaded, and converted to a Singularity image before execution of the pipeline.
+To run an nf-core pipeline at NCI Gadi, run the pipeline with `-profile singularity,nci_gadi`. This will download and launch the [`nci_gadi.config`](https://github.com/nf-core/configs/blob/master/conf/nci_gadi.config) which has been pre-configured with a setup suitable for the NCI Gadi HPC cluster.

 ## Access to NCI Gadi

 Please be aware that you will need to have a user account, be a member of a Gadi project, and have a service unit allocation to your project in order to use this infrastructure. See the [NCI user guide](https://opus.nci.org.au/display/Help/Getting+Started+at+NCI) for details on getting access to Gadi.

 ## Launch an nf-core pipeline on Gadi

-### Prerequisites
-
-Before running the pipeline you will need to load Nextflow and Singularity, both of which are globally installed modules on Gadi. You can do this by running the commands below:
+Before running the pipeline, you will need to load Nextflow and Singularity, both of which are globally installed modules on Gadi (under `/apps`). You can do this by running the commands below:

 ```bash
 module purge
-module load nextflow singularity
+module load nextflow
+module load singularity
 ```

-### Execution command
+:::warning
+The version of Nextflow installed on Gadi has been modified to make it easier to specify resource options for jobs submitted to the cluster through the Nextflow process block (see NCI's [Gadi user guide](https://opus.nci.org.au/display/DAE/Nextflow) for more details).
+:::

-```bash
-module load nextflow
-module load singularity
+You can then run the pipeline using:

+```bash
 nextflow run <nf-core_pipeline>/main.nf \
   -profile singularity,nci_gadi \
   <additional flags>
 ```

 ### Cluster considerations

-Please be aware that as of July 2023, NCI Gadi HPC queues **do not** have external network access. This means you will not be able to pull the workflow code base or containers if you submit your `nextflow run` command as a job on any of the standard job queues. NCI currently recommends you run your Nextflow head job either in a GNU screen or tmux session from the login node or submit it as a job to the [copyq](https://opus.nci.org.au/display/Help/Queue+Structure). See the [nf-core documentation](https://nf-co.re/docs/usage/offline) for instructions on running pipelines offline.
+#### External network access
+
+Please be aware that NCI Gadi HPC compute nodes **do not** have external network access. This means you will not be able to pull the workflow codebase or containers if you submit your `nextflow run` command as a job on any of the standard job queues (see the [nf-core documentation](https://nf-co.re/docs/usage/offline) for instructions on running pipelines offline). NCI currently recommends you run your Nextflow head job either in a GNU screen or tmux session within a [persistent session](https://opus.nci.org.au/spaces/Help/pages/241926895/Persistent+Sessions), or submit it as a job to the [copyq](https://opus.nci.org.au/display/Help/Queue+Structure).
+
+For example, to run Nextflow in a GNU screen session within a persistent session:
+
+```bash
+persistent-sessions start -p <project> <ps_name>
+ssh <ps_name>.<user>.<project>.ps.gadi.nci.org.au
+screen -S <screen_name>
+nextflow run ...
+```
+
+You can detach from the screen session using Ctrl+A, then D, and log out of the persistent session while the pipeline continues to run. Later, you can reconnect to the persistent session using the same `ssh` command and reattach to the screen session with `screen -r <screen_name>`.
+
+#### Downloading containers
+
+This config requires Nextflow to use [Singularity](https://www.nextflow.io/docs/latest/container.html#singularity) to execute processes. Before any process can be executed, the nf-core pipeline will first download the required container image to a local cache. This cache location can be specified using either the `$NXF_SINGULARITY_CACHEDIR` environment variable or the `singularity.cacheDir` setting in the Nextflow config file. `nci_gadi.config` specifies the download and storage location with:
+
+```
+singularity.cacheDir = "/scratch/${params.nci_gadi_project}/${System.getenv('USER')}/nxf_singularity_cache"
+```
+
+See the [project accounting](#project-accounting) section below for details on `params.nci_gadi_project`.
+
+Furthermore, Singularity uses the `$SINGULARITY_CACHEDIR` directory to store intermediate image layers and files during pulls (note that this cache is only used when the required container is not already available in Nextflow's own Singularity cache, specified by `$NXF_SINGULARITY_CACHEDIR` or `singularity.cacheDir`). By default, `$SINGULARITY_CACHEDIR` is set to `$HOME/.singularity/cache`. For pipelines involving a large number and/or large size of first-time container downloads, we recommend setting this environment variable to a scratch location to avoid exceeding your home filesystem quota. For example, before running your `nextflow run` command, you can set the environment variable to a location on the scratch filesystem with:
+
+```
+export SINGULARITY_CACHEDIR=/scratch/$PROJECT/$USER/singularity_cache
+```
+
+#### Gadi queues and job submission

 This config currently determines which Gadi queue to submit your task jobs to based on the amount of memory required. For the sake of resource and cost (service unit) efficiency, the following rules are applied by this config:

 - Tasks requesting **less than 128 Gb** will be submitted to the normalbw queue
 - Tasks requesting **more than 128 Gb and less than 190 Gb** will be submitted to the normal queue
 - Tasks requesting **more than 190 Gb and less than 1020 Gb** will be submitted to the hugemembw queue

-See the NCI Gadi [queue limit documentation](https://opus.nci.org.au/display/Help/Queue+Limits) for details on charge rates for each queue.
+Note that these are only baseline queue settings and may be adjusted depending on the goals of your pipeline run and the most efficient use of the HPC. You can make a local copy of the `nci_gadi.config` and modify the queue assignments as needed for specific processes or process groups. See the NCI Gadi [queue limit documentation](https://opus.nci.org.au/display/Help/Queue+Limits) for more information on the available queues and their associated charge rates.

 ### Project accounting

-This config uses the PBS environmental variable `$PROJECT` to assign a project code to all task job submissions for billing purposes. If you are a member of multiple Gadi projects, you should confirm which project will be charged for your pipeline execution. You can do this using:
+This config uses `params.nci_gadi_project` to assign a project code to all task job submissions for billing purposes. By default, this is set to the environment variable `$PROJECT`. If you are a member of multiple Gadi projects, you can choose which project will be charged for your pipeline execution by setting `params.nci_gadi_project` (`--nci_gadi_project` on the command line) to the desired project code.

-```bash
-echo $PROJECT
-```
+Similarly, `params.nci_gadi_storage` (`--nci_gadi_storage` on the command line) is used to specify the storage locations that the pipeline needs to access. By default, this is set to `gdata/${params.nci_gadi_project}+scratch/${params.nci_gadi_project}`.

-The version of Nextflow installed on Gadi has been modified to make it easier to specify resource options for jobs submitted to the cluster. See NCI's [Gadi user guide](https://opus.nci.org.au/display/DAE/Nextflow) for more details. You can manually override the `$PROJECT` specification by editing your local copy of the `nci_gadi.config` and replacing `$PROJECT` with your project code. For example:
-
-```nextflow
-process {
-    executor = 'pbspro'
-    project = 'aa00'
-    storage = 'scratch/aa00+gdata/aa00'
-    ...
-}
-```
+Note: The version of Nextflow installed on Gadi has been modified to make it easier to specify resource options for jobs submitted to the cluster through the Nextflow process block (see NCI's [Gadi user guide](https://opus.nci.org.au/display/DAE/Nextflow) for more details). The values specified through the parameters above are passed into the process block in the `nci_gadi.config`.

 ## Resource usage

-The NCI Gadi config summarises resource usage in a custom trace file that will be saved to your execution directory. However, for accounting or resource benchmarking purposes you may need to collect per-task service unit (SU) charges. Upon workflow completion, you can run the Sydney Informatics Hub's [gadi_nfcore_report.sh](https://github.com/Sydney-Informatics-Hub/HPC_usage_reports/blob/master/Scripts/gadi_nfcore_report.sh) script in your workflow execution directory with:
+To help monitor the service unit (SU) cost of running workflows on Gadi, a plugin has been developed to generate a report in CSV or JSON format upon workflow completion. The `nf-gadi` plugin is available via the Nextflow plugin registry and can be enabled by adding `-plugins nf-gadi` to your Nextflow run command. See the [plugin project repository](https://github.com/AustralianBioCommons/nf-gadi) for more details.
+
+Additionally, the Sydney Informatics Hub provides a script to collect per-task SU costs. Upon workflow completion, you can run the [gadi_nfcore_report.sh](https://github.com/Sydney-Informatics-Hub/HPC_usage_reports/blob/master/Scripts/gadi_nfcore_report.sh) script in your workflow execution directory to collect resources from the PBS log files printed to each task's `.command.log`. Resource requests and usage for each process are summarised in the output `gadi-nf-core-joblogs.tsv` file. To run it, execute the following in your workflow execution directory:

 ```bash
 bash gadi_nfcore_report.sh
 ```
-
-This script will collect resources from the PBS log files printed to each task's `.command.log`. Resource requests and usage for each process is summarised in the output `gadi-nf-core-joblogs.tsv` file. This is useful for resource benchmarking and SU accounting.
````