enable hwloc, cuFFTMp, and HeFFTe support in GROMACS easyblock#3531
bedroge wants to merge 10 commits into easybuilders:develop
This looks good and matches what I can find in the docs. My only concern is that there is no version checking for the new options. I think the hwloc one has been there since 2016, but the HeFFTe one is more recent (I can't quite figure it out, but I think it is 2023; see https://gitlab.com/gromacs/gromacs/-/issues/4090). For our own use case we could set the check to a more recent version than the one that first supported it.
EDIT: Indeed, heffte seems to first appear in 2023.1: https://manual.gromacs.org/2023.1/install-guide/index.html
EDIT: The option for hwloc is first documented in 2016.4: https://manual.gromacs.org/2016.4/install-guide/index.html
Thanks! I added the version checks; I hadn't seen your edits. But I also see hwloc mentioned in the 2016.1 docs, and it's also in the code. HeFFTe is mentioned in the 2023 docs (https://manual.gromacs.org/current/release-notes/2023/major/performance.html#pme-decomposition-support-with-cuda-and-sycl-backends), and also in the CMake file for 2023 (https://gitlab.com/gromacs/gromacs/-/blob/v2023/CMakeLists.txt?ref_type=tags#L741) and 2023.1 (https://gitlab.com/gromacs/gromacs/-/blob/v2023.1/CMakeLists.txt?ref_type=tags#L749).
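The version gating discussed above could look roughly like the sketch below. It is not the code from this PR: the helper names and the `MIN_VERSIONS` table are hypothetical, the thresholds are simply the ones reported in this thread (hwloc from 2016, HeFFTe and cuFFTMp from 2023), and the real easyblock would presumably rely on EasyBuild's own version comparison helpers rather than a hand-rolled parser.

```python
import re

# Hypothetical table: first GROMACS release series supporting each feature,
# as reported in this thread (not taken from the easyblock itself).
MIN_VERSIONS = {
    "hwloc": "2016",
    "heffte": "2023",
    "cufftmp": "2023",
}


def parse_version(version):
    """Convert a version string such as '2023.1' into a tuple like (2023, 1)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def is_supported(feature, gromacs_version):
    """Return True if the given GROMACS version is new enough for the feature."""
    return parse_version(gromacs_version) >= parse_version(MIN_VERSIONS[feature])


def supported_features(gromacs_version):
    """List (alphabetically) the features that may be enabled for this version."""
    return sorted(f for f in MIN_VERSIONS if is_supported(f, gromacs_version))
```

Comparing integer tuples avoids the string-comparison trap where `"5.1.4" > "2016"`; with this sketch, a 5.x release is correctly treated as older than 2016.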
One thing I was a little bit worried about is that the HeFFTe installation requires a GPU (for the tests), so simply installing GROMACS and its dependencies will also require a GPU if we enable this by default. Should we make it optional in some way (commenting out the HeFFTe dependency or disabling the tests)? For EESSI it would already cause an issue, as we build on nodes without GPUs.
Hi! Don't want to derail the discussion here, but, while I don't have any recent numbers, the situation has not changed much from what NVIDIA reports in their blog:
HeFFTe has the benefit of supporting AMD and Intel GPUs, but it's not the best choice for CUDA installations. cuFFTMp has its own share of issues, as @bedroge outlined in the PR description, but I think the performance difference is relevant for evaluating which effort is more worthwhile. Regarding versioning, I can confirm that HeFFTe (and cuFFTMp) support was added in 2023, and hwloc support was added in 2016.
Thanks for your input, it's definitely a fair point. I initially added only HeFFTe support in this PR, as it seemed like a more logical default option (e.g. no additional requirements on the hardware like with cuFFTMp), and a first attempt at adding cuFFTMp support failed miserably 😅 But I can have another look at it. Ultimately, it would be nice if the easyblock supported both, so people can choose between the two.
Now that we have an easyconfig for cuFFTMp (https://github.com/easybuilders/easybuild-easyconfigs/blob/develop/easybuild/easyconfigs/c/cuFFTMp/cuFFTMp-11.4.0-gompi-2025b-CUDA-12.9.1.eb), it's trivial to add support for it. I've done that in 23eda47. Initially it didn't compile because it was picking up the I couldn't really force it to use the header provided by cuFFTMp first (moving it up or down in the deps list didn't work), but adding
Assuming you cannot select the multi-GPU FFT library at runtime, we would need to select it at build time. We could do that with an easyconfig parameter that controls this, or with separate easyconfigs that carry a corresponding version suffix.
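A build-time switch of the kind suggested above might be sketched as follows. Everything here is an assumption made for illustration: the `multi_gpu_fft` parameter name and the `multi_gpu_fft_flags` helper are hypothetical, and the CMake option names are written as I recall them from the cuFFTMp and HeFFTe sections of the GROMACS install guide, so they should be checked against the guide before use.

```python
# Hypothetical mapping from the value of an illustrative 'multi_gpu_fft'
# easyconfig parameter to (dependency name, enable option, root option).
# The CMake option names are assumptions to be verified against the
# GROMACS install guide.
FFT_BACKENDS = {
    "heffte": ("HeFFTe", "-DGMX_USE_HEFFTE=ON", "-DHeffte_ROOT=%s"),
    "cufftmp": ("cuFFTMp", "-DGMX_USE_CUFFTMP=ON", "-DcuFFTMp_ROOT=%s"),
}


def multi_gpu_fft_flags(choice, dep_roots):
    """Return the CMake options for the selected multi-GPU FFT backend.

    choice: None (no multi-GPU FFT), 'heffte', or 'cufftmp'.
    dep_roots: dict mapping dependency names to their installation prefixes.
    """
    if choice is None:
        return []
    if choice not in FFT_BACKENDS:
        raise ValueError("unknown multi-GPU FFT backend: %s" % choice)
    dep_name, enable_flag, root_flag = FFT_BACKENDS[choice]
    if dep_name not in dep_roots:
        # Selecting a backend without listing it as a dependency is an error.
        raise ValueError("%s selected but not listed as a dependency" % dep_name)
    return [enable_flag, root_flag % dep_roots[dep_name]]
```

Because the parameter takes a single value, exactly one backend can be enabled per build, which matches the assumption that the library cannot be switched at runtime. The version-suffix alternative would use the same logic, with the choice derived from the dependencies present in each easyconfig.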
@boegelbot please test @ jsc-zen3-a100
@boegelbot please test @ jsc-zen3
Test report by @boegelbot
Build succeeded for 1 out of 1 (total: 28 mins 39 secs) (1 easyconfigs in total)
Test report by @boegelbot
Build succeeded for 2 out of 2 (total: 1 hour 11 mins 15 secs) (2 easyconfigs in total)
Hmm, I forgot to actually include that fix, but I just pushed it (cfb6f93).
@boegelbot please test @ jsc-zen3-a100
@boegelbot please test @ jsc-zen3-a100
Test report by @boegelbot
Build succeeded for 1 out of 1 (total: 28 mins 4 secs) (1 easyconfigs in total)
In EESSI we noticed that GROMACS builds currently show the following with gmx -version:
Hwloc is part of the foss toolchain and can be easily enabled.
For multi-GPU FFT support, either cuFFTMp (https://manual.gromacs.org/documentation/current/install-guide/index.html#using-cufftmp) or HeFFTe (https://manual.gromacs.org/documentation/current/install-guide/index.html#using-heffte) is required. I was trying to add support for both, but cuFFTMp is part of NVHPC, and simply adding that as a dependency makes GROMACS pick up other things from that installation (e.g. OpenMP libraries). Since cuFFTMp also imposes some additional requirements (see https://docs.nvidia.com/hpc-sdk/cufftmp/usage/requirements.html), I've only added HeFFTe support for now. I've also just opened an easyconfigs PR for HeFFTe with CUDA support: easybuilders/easybuild-easyconfigs#22024. Once that's merged, I'll open another PR to add it as a dependency to CUDA versions of GROMACS.