[WIP] Update Cambridge config for current CSD3 partitions #1102
**Merged** RaqManzano merged 12 commits into `nf-core:master` from `RaqManzano:update-cambridge-config` on May 5, 2026.
Changes from all commits (12 commits):
- `3439f0b` Update Cambridge config for current CSD3 partitions (RaqManzano)
- `26327a3` Add Cambridge profile owner to CODEOWNERS (RaqManzano)
- `65f2465` Modified Cambridge resource selection to a partition map and refined … (RaqManzano)
- `5528e15` Added Cambridge module setup and SLURM executor tuning (RaqManzano)
- `d4ce305` prettier (RaqManzano)
- `f1eb218` Update Cambridge docs to use scratch paths (RaqManzano)
- `47b2a78` Refined Cambridge beforeScript logic and polished partition documenta… (RaqManzano)
- `31cc0df` Update docs/cambridge.md (RaqManzano)
- `aa15fe2` Updated Cambridge documentation (RaqManzano)
- `88790b5` Refine Cambridge config defaults and expand runtime documentation (RaqManzano)
- `b8ab4c9` Merge branch 'master' into update-cambridge-config (RaqManzano)
- `9325477` Merge branch 'master' into update-cambridge-config (RaqManzano)
**`conf/cambridge.config`**

```diff
@@ -1,28 +1,82 @@
-// Description is overwritten with user specific flags
+// nf-core/configs: Cambridge CSD3 cluster profile
+// Available partitions:
+//   - icelake       : 76 CPUs, 256 GB RAM
+//   - icelake-himem : 76 CPUs, 512 GB RAM
+//   - sapphire      : 112 CPUs, 512 GB RAM
+//
+// The profile defaults to the broadly available `icelake` partition but allows
+// users to override it with `--partition`. Walltime is inferred from the SLURM
+// account name when possible: projects containing `-SL3-` use a 12 h cap,
+// otherwise the profile assumes the common SL1/SL2 36 h limit.
+
 params {
-    config_profile_description = 'Cambridge HPC cluster profile.'
-    // FIXME EmelineFavreau was the last to edit this
-    config_profile_contact     = 'Andries van Tonder (ajv37@cam.ac.uk)'
-    config_profile_url         = "https://docs.hpc.cam.ac.uk/hpc"
-    partition                  = null
-    project                    = null
-    max_memory                 = 192.GB
-    max_cpus                   = 56
-    max_time                   = 12.h
+    config_profile_description = 'Cambridge HPC CSD3 cluster profile.'
+    config_profile_contact     = 'Raquel Manzano (rm889@cam.ac.uk) and Andries van Tonder (ajv37@cam.ac.uk)'
+    config_profile_url         = 'https://docs.hpc.cam.ac.uk/hpc'
+
+    partition = 'icelake'
+    project   = null
+
+    // Compatibility with nf-core schema validation across pipeline versions.
+    schema_ignore_params         = 'partition,project,max_memory,max_cpus,max_time,csd_time,csd_parts,csd_selected,validationSchemaIgnoreParams'
+    validationSchemaIgnoreParams = 'partition,project,max_memory,max_cpus,max_time,csd_time,csd_parts,csd_selected,schema_ignore_params,validationSchemaIgnoreParams'
 }

+validation {
+    ignoreParams = ['partition', 'project', 'max_memory', 'max_cpus', 'max_time', 'csd_time', 'csd_parts', 'csd_selected', 'schema_ignore_params', 'validationSchemaIgnoreParams']
+}
+
+// Description is overwritten with user specific flags
+params.csd_time = {
+    params.project?.toUpperCase()?.contains('-SL3-') ? 12.h : 36.h
+}.call()
+
+params.csd_parts = [
+    icelake        : [memory: 256.GB, cpus: 76, time: params.csd_time],
+    'icelake-himem': [memory: 512.GB, cpus: 76, time: params.csd_time],
+    sapphire       : [memory: 512.GB, cpus: 112, time: params.csd_time]
+]
+
+params.csd_selected = {
+    def selected = params.csd_parts[params.partition]
+
+    if (!selected) {
+        System.err.println("ERROR: cambridge params.partition must be one of 'icelake', 'icelake-himem', or 'sapphire' (got '${params.partition}').")
+        System.exit(1)
+    }
+
+    selected
+}.call()
+
+params.max_memory = params.csd_selected.memory
+params.max_cpus   = params.csd_selected.cpus
+params.max_time   = params.csd_selected.time
+
 singularity {
     enabled    = true
     autoMounts = true
 }

 process {
-    resourceLimits = [
-        memory: 192.GB,
-        cpus: 56,
-        time: 12.h
-    ]
+    resourceLimits = params.csd_selected
+
+    beforeScript = """
+    . /etc/profile.d/modules.sh
+    module purge
+    module load rhel8/default-${params.partition == 'sapphire' ? 'sar' : 'icl'}
+    """
+
     executor       = 'slurm'
-    clusterOptions = "-A ${params.project} -p ${params.partition}"
+    queue          = params.partition
+    clusterOptions = { params.project ? "--account=${params.project}" : '' }
+    cache          = 'lenient'
+    scratch        = true
 }
+
+executor {
+    name              = 'slurm'
+    queueSize         = 200
+    pollInterval      = '5 min'
+    queueStatInterval = '5 min'
+    submitRateLimit   = '10 sec'
+    exitReadTimeout   = '30 min'
+}
```

Review comment attached to the `params.csd_time` block:

> **Member:** Can this not go in the main
>
> **RaqManzano (Contributor, Author):** I prefer to keep it separate from the main block as it is not really a param for the users.
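The partition-map and walltime-inference logic in the Groovy config above can be sketched in Python for illustration. The function names and dictionary here are hypothetical; the real logic lives in `conf/cambridge.config`:

```python
# Illustrative sketch of the profile's partition map and SL3 walltime rule.
# Figures are copied from the partition table in the config header.
CSD_PARTS = {
    "icelake":       {"memory_gb": 256, "cpus": 76},
    "icelake-himem": {"memory_gb": 512, "cpus": 76},
    "sapphire":      {"memory_gb": 512, "cpus": 112},
}

def csd_time_hours(project):
    """Projects containing '-SL3-' get a 12 h cap, otherwise 36 h (SL1/SL2)."""
    if project and "-SL3-" in project.upper():
        return 12
    return 36

def csd_selected(partition):
    """Fail fast on an unknown partition, mirroring the Groovy closure."""
    if partition not in CSD_PARTS:
        raise ValueError(
            f"partition must be one of {sorted(CSD_PARTS)} (got '{partition}')"
        )
    return CSD_PARTS[partition]

print(csd_time_hours("NAME-SL3-CPU"))    # 12
print(csd_time_hours("NAME-SL2-CPU"))    # 36
print(csd_selected("sapphire")["cpus"])  # 112
```

Note how the old `cclake` partition would now be rejected with a clear error rather than silently submitted to SLURM.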
**`docs/cambridge.md`**

````diff
@@ -1,87 +1,162 @@
 # nf-core/configs: Cambridge HPC Configuration

-All nf-core pipelines have been successfully configured for use on the Cambridge HPC cluster at the [The University of Cambridge](https://www.cam.ac.uk/).
-To use, run the pipeline with `-profile cambridge`. This will download and launch the [`cambridge.config`](../conf/cambridge.config) which has been pre-configured
-with a setup suitable for the Cambridge HPC cluster. Using this profile, either a docker image containing all of the required software will be downloaded,
-and converted to a Singularity image or a Singularity image downloaded directly before execution of the pipeline.
+All nf-core pipelines can be run on the [Cambridge HPC cluster](https://docs.hpc.cam.ac.uk/hpc/index.html) at the University of Cambridge using `-profile cambridge`.
+This will download and use the [`cambridge.config`](../conf/cambridge.config)
+institutional profile, which is configured for running pipelines on CSD3 with
+Singularity containers.

 ### Install Nextflow

-The latest version of Nextflow is not installed by default on the Cambridge HPC cluster CSD3. You can install it with conda:
+The latest version of Nextflow is not installed by default on CSD3.
+
+The recommended option is the standard Nextflow self-installing package:

 ```
-module load miniconda/3
+# Check that Java 17+ is available
+java -version
+
+# Download Nextflow
+curl -s https://get.nextflow.io | bash

-# set up Bioconda according to the Bioconda documentation, notably setting up channels
-conda config --add channels defaults
-conda config --add channels bioconda
-conda config --add channels conda-forge
+# Make it executable
+chmod +x nextflow

-# create the environment env_nf, and install the tool nextflow
-conda create --name env_nf nextflow
+# Move it to a personal bin directory in hpc-work
+mkdir -p $HOME/rds/hpc-work/bin
+mv nextflow $HOME/rds/hpc-work/bin/

-# activate the environment containing nextflow
-conda activate env_nf
+# Add that directory to your PATH if needed
+export PATH="$HOME/rds/hpc-work/bin:$PATH"

-# once done with the environment, deactivate
-conda deactivate
+# Confirm the installation
+nextflow info
 ```

-Alternatively, you can install Nextflow into a directory you have write access to.
-Follow [these instructions](https://www.nextflow.io/docs/latest/getstarted.html#) from the Nextflow documentation. This alternative method requires also to update java.
+To make the `PATH` change persistent across sessions, add the `export PATH=...`
+line to your `~/.bashrc` or equivalent shell startup file.

-```
-# move to desired directory on HPC
-cd /home/<username>/path/to/dir
+See the official installation guide for the latest details and Java
+requirements:

-# get the newest version
-wget -qO- https://get.nextflow.io | bash
+- [nf-core / Nextflow installation guide](https://nf-co.re/docs/usage/getting_started/installation)

-# update java version to the latest
-wget https://download.oracle.com/java/20/latest/jdk-20_linux-x64_bin.tar.gz
-tar xvfz jdk-20_linux-x64_bin.tar.gz
+If you prefer a user-managed package manager, a simple option is to install
+`micromamba` and then follow the nf-core conda-style instructions for creating
+an environment with `nextflow`:

-# if all tools are compatible with the java version you chose, add these lines to .bashrc
-export JAVA_HOME=/home/<username>/path/to/dir/jdk-20.0.1
-export PATH=/home/<username>/path/to/dir/jdk-20.0.1/bin:$PATH
+- [micromamba installation guide](https://mamba.readthedocs.io/en/stable/installation/micromamba-installation.html)
+- [nf-core conda installation instructions](https://nf-co.re/docs/usage/getting_started/installation#conda-installation)

-# Once above is done `java --version` should return `java 20.0.1 2023-04-18`
-java --version
-```
+`pixi` may also work well for personal environment management; see the
+[pixi documentation](https://pixi.prefix.dev/latest/). However, this profile
+does not currently provide tested `pixi` instructions, so `micromamba` is the
+more conservative recommendation here.

+### Set up Singularity cache
+
+Singularity allows the use of containers and will use a caching strategy. First,
+you might want to set the `NXF_SINGULARITY_CACHEDIR` bash environment variable,
+pointing at a directory with sufficient space. If not, it will be
+automatically assigned to the current directory.
+
+```
+# do this once per login, or add this line to .bashrc
+export NXF_SINGULARITY_CACHEDIR=$HOME/rds/hpc-work/nxf-singularity-cache
+```
+
-### Set up Singularity cache
+On CSD3, Singularity is available by default, so no additional module loading
+should be required.

-Singularity allows the use of containers and will use a caching strategy. First, you might want to set the `NXF_SINGULARITY_CACHEDIR` bash environment variable, pointing at your hpc-work location. If not, it will be automatically assigned to the current directory.
+### Run Nextflow

-Here is an example with the nf-core pipeline sarek ([read documentation here](https://nf-co.re/sarek/3.3.2)).
+The profile defaults to the `icelake` partition, but users can switch to
+`icelake-himem` or `sapphire` with `--partition`. The user should also provide
+their SLURM project / account with `--project`.

+#### Choosing a partition
+
+As a rough guide, `icelake` is the default general-purpose choice for most
+workflows. `icelake-himem` is the better option when processes need more memory
+per CPU, for example memory-hungry tasks or jobs using only a small number of
+CPUs but requiring substantial RAM. `sapphire` provides newer Sapphire Rapids
+nodes with 112 CPUs and about 4.5 GiB RAM per CPU (512 GB per node), so it may
+be a better fit for higher-CPU jobs than `icelake`.
+
+#### Example
+
 ```
-# do this once per login, or add these lines to .bashrc
-export NXF_SINGULARITY_CACHEDIR=/home/<username>/rds/hpc-work/path/to/cache/dir
+# Launch the nf-core pipeline for a test database
+# with the Cambridge profile
+nextflow run nf-core/sarek -profile test,cambridge --partition icelake --project NAME-SL2-CPU --outdir nf-sarek-test
 ```

-Once done, and ready to use Nextflow, you can check that the Singularity module is loaded by default when logging on the cluster.
+If the project name contains `-SL3-`, the profile applies a 12 h walltime cap.
+Otherwise it assumes the standard SL1 / SL2 36 h limit.

+#### Running Nextflow on CSD3
+
+We recommend starting Nextflow inside a `screen` or `tmux`
+session so that the Nextflow manager process keeps running after you disconnect
+your SSH session.

 ```
-module list
+# Start a tmux session
+tmux new -s nextflow
+
+# Or start a screen session
+screen -S nextflow

-# If singularity is not loaded:
-module load singularity
+# Re-attach later if needed
+tmux attach -t nextflow
+screen -r nextflow
 ```

-### Run Nextflow
+Detaching from `tmux` leaves the workflow running in the background with
+`Ctrl-b` then `d`. For `screen`, use `Ctrl-a` then `d`.

-Here is an example with the nf-core pipeline sarek ([read documentation here](https://nf-co.re/sarek/3.3.2)).
-The user includes the project name and the node.
+You can then logout of the HPC and reattach to the session later.
+Before logging out, make sure to **note the node you’re on**.
+Suppose your login node was called `login-p-3`, you can later log back into this specific node as follows:

+```bash
+ssh username@login-p-3.hpc.cam.ac.uk
+```
-# Launch the nf-core pipeline for a test database
-# with the Cambridge profile
-nextflow run nf-core/sarek -profile test,cambridge.config --partition "cclake" --project "NAME-SL3-CPU" --outdir nf-sarek-test

+Then, you can re-attach to the `tmux`/`screen` session:
+
+```bash
+tmux attach -t nextflow
+screen -r nextflow
+```
+
-All of the intermediate files required to run the pipeline will be stored in the `work/` directory. It is recommended to delete this directory after the pipeline
-has finished successfully because it can get quite large, and all of the main output files will be saved in the `results/` directory anyway.
+#### Limit Nextflow JVM memory (recommended)
+
+If needed, you can limit the memory used by the Nextflow manager process by
+setting:
+
+```bash
+export NXF_JVM_ARGS='-Xms2g -Xmx4g'
+```
+
+This is a conservative example that should work for most runs. If the Nextflow
+manager process still runs into memory errors, increase `-Xmx` accordingly.
+This must be set **before** launching `nextflow run ...`. If you want to use
+this setting by default, you can add the export line to your `~/.bashrc`.
+
+#### Large runs
+
+For large runs, for example workflows with many samples or many tasks, the
+Nextflow manager process can itself use substantial memory. In those cases, it
+is better to launch `nextflow run ...` inside an interactive `srun` session or
+submit it via `sbatch`, rather than running it directly on a login node.
+
+#### `work` directory
+
+All of the intermediate files required to run the pipeline will be stored in
+the `work/` directory. It is recommended to **delete** this directory after the
+pipeline has finished successfully because it can get quite large, and all of
+the main output files will be saved in the `--outdir` directory anyway.

 > NB: You will need an account to use the Cambridge HPC cluster in order to run the pipeline. If in doubt contact IT.
-> NB: Nextflow will need to submit the jobs via SLURM to the Cambridge HPC cluster and as such the commands above will have to be executed on one of the login
-> nodes. If in doubt contact IT.
+> NB: Nextflow will need to submit the jobs via SLURM to the Cambridge HPC cluster and as such the commands above will have to be executed on one of the login nodes. If in doubt contact IT.
````
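The partition guidance in the docs above follows from dividing node RAM by CPU count. A quick sanity check, using the figures from the partition table in this PR:

```python
# Memory per CPU for each CSD3 partition listed in this PR.
partitions = {
    "icelake":       (256, 76),   # (RAM in GB, CPUs)
    "icelake-himem": (512, 76),
    "sapphire":      (512, 112),
}

for name, (mem_gb, cpus) in partitions.items():
    print(f"{name}: {mem_gb / cpus:.2f} GB per CPU")
```

This confirms the roughly 4.5 GB per CPU quoted for `sapphire`, and shows why `icelake-himem` (about 6.7 GB per CPU) suits memory-hungry tasks.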
Review discussion on the schema-compatibility parameters:

> **Reviewer:** Are these two really needed, line 26 seems to cover these already?
>
> **RaqManzano (Author):** We could, but this is to be more compatible with older pipelines that still look for `schema_ignore_params` and `validationSchemaIgnoreParams`.
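The old-style comma-separated strings and the newer `validation.ignoreParams` list are meant to cover the same parameter names. A hypothetical consistency check (strings copied from the diff above) sketches how one could verify that:

```python
# Hypothetical check that the legacy comma-separated ignore string is a
# subset of the newer validation.ignoreParams list from the Cambridge config.
schema_ignore_params = (
    "partition,project,max_memory,max_cpus,max_time,"
    "csd_time,csd_parts,csd_selected,validationSchemaIgnoreParams"
)
validation_ignore = [
    "partition", "project", "max_memory", "max_cpus", "max_time",
    "csd_time", "csd_parts", "csd_selected",
    "schema_ignore_params", "validationSchemaIgnoreParams",
]

# Every name in the legacy string should also appear in the list.
missing = set(schema_ignore_params.split(",")) - set(validation_ignore)
print(missing)  # set()
```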