
{perf}[gompi/2025b] HPCToolkit v2025.0.1 w/ CUDA 12.9.1 #23830

Merged
Micket merged 18 commits into easybuilders:develop from Thyre:20250909144251_new_pr_HPCToolkit202500
Sep 16, 2025

Conversation

@Thyre
Collaborator

@Thyre Thyre commented Sep 9, 2025

(created using eb --new-pr)

Requires:

TODO:

  • Test on aarch64
  • Enable tests and run them
  • Check if we can move some of the dependencies to builddependencies

HPCViewer would also be interesting to have, but it needs to wait until we have GTK3/GTK4, which is currently blocked by a few more missing dependencies.

@Thyre Thyre added the new label Sep 9, 2025
@Thyre Thyre marked this pull request as draft September 9, 2025 12:43
@Thyre Thyre added the 2025b issues & PRs related to 2025b common toolchains label Sep 9, 2025
@github-actions github-actions Bot added update and removed new labels Sep 9, 2025
@github-actions

github-actions Bot commented Sep 9, 2025

Updated software HPCToolkit-2025.0.1-gompi-2025b-CUDA-12.9.1.eb

Diff against HPCToolkit-2025.0.1-gompi-2025b.eb

easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b.eb

diff --git a/easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b.eb b/easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b-CUDA-12.9.1.eb
index ce71f009ef..46a7b253f5 100644
--- a/easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b.eb
+++ b/easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b-CUDA-12.9.1.eb
@@ -2,6 +2,7 @@ easyblock = 'MesonNinja'
 
 name = 'HPCToolkit'
 version = '2025.0.1'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://hpctoolkit.org/'
 description = """
@@ -41,6 +42,7 @@ builddependencies = [
 
 dependencies = [
     ('Boost', '1.88.0'),
+    ('CUDA', '12.9.1', '', SYSTEM),
     ('Dyninst', '13.0.0'),
     ('PAPI', '7.2.0'),
     ('Xerces-C++', '3.3.0'),
@@ -54,7 +56,7 @@ if ARCH == 'x86_64':
     dependencies.append(('intel-XED', '2025.06.08'))
 
 configopts = "-Dtests=enabled -Dmanpages=disabled -Dmanual=disabled -Dhpcprof_mpi=enabled "
-configopts += "-Dpapi=enabled -Dcuda=disabled -Dgtpin=disabled -Drocm=disabled -Dopencl=disabled "
+configopts += "-Dpapi=enabled -Dcuda=enabled -Dgtpin=disabled -Drocm=disabled -Dopencl=disabled "
 configopts += "-Dvalgrind_annotations=false -Dpython=disabled "
 
 # Explicitly ensure that no environment variables related to jobs are picked up, as this
@@ -71,6 +73,12 @@ local_env_to_unset = [
 pretestopts = ""
 for local_env in local_env_to_unset:
     pretestopts += f"unset {local_env} && "
+
+build_info_msg = """CUDA related tests can fail due to insufficient permissions for CUPTI PC Sampling.
+The final installation is still usable, but with reduced functionality.
+See: https://hpctoolkit.gitlab.io/hpctoolkit/2025.0.1/users/faq.html
+in section 'Ensuring permission to use GPU performance counters'
+Building with '--ignore-test-failure' still results in a working installation"""
 runtest = "meson test"
 
 local_preload_libs = [
@@ -81,11 +89,10 @@ sanity_check_paths = {
     'files':
         ['bin/hpcrun', 'include/hpctoolkit.h'] +
         ['lib/libhpcrun_preload_%s.%s' % (a, SHLIB_EXT) for a in local_preload_libs] +
-        [f'lib/libhpcrun.{SHLIB_EXT}', f'lib/libhpctoolkit.{SHLIB_EXT}'],
+        [f'lib/libhpcrun.{SHLIB_EXT}', f'lib/libhpctoolkit.{SHLIB_EXT}', f'lib/libhpcrun_dlopen_nvidia.{SHLIB_EXT}'],
     'dirs': [],
 }
 
-# hpcstructs version command exits with 1, so grep for the version instead
 sanity_check_commands = [
     'hpcrun --version',
     'hpcstruct --version | grep %(version)s',
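
For readers unfamiliar with EasyBuild templates, here is a minimal sketch of how the `versionsuffix` added in the diff above ends up in the easyconfig filename. It mimics the substitution with plain `%` formatting; EasyBuild resolves `%(cudaver)s` internally from the CUDA dependency, so the `templates` dictionary and the `toolchain` string below are illustrative assumptions, not EasyBuild API.

```python
name = "HPCToolkit"
version = "2025.0.1"
toolchain = "gompi-2025b"  # assumption: toolchain name and version joined by '-'

# Stand-in for EasyBuild's template resolution (hypothetical dictionary):
# 'cudaver' is taken here from the ('CUDA', '12.9.1', '', SYSTEM) dependency.
templates = {"cudaver": "12.9.1"}
versionsuffix = "-CUDA-%(cudaver)s" % templates

easyconfig_filename = f"{name}-{version}-{toolchain}{versionsuffix}.eb"
print(easyconfig_filename)
```

The result matches the filename of the CUDA variant reported by the bot.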

Updated software HPCToolkit-2025.0.1-gompi-2025b.eb

Diff against HPCToolkit-2025.0.1-gompi-2025b-CUDA-12.9.1.eb

easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b-CUDA-12.9.1.eb

diff --git a/easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b-CUDA-12.9.1.eb b/easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b.eb
index 46a7b253f5..ce71f009ef 100644
--- a/easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b-CUDA-12.9.1.eb
+++ b/easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b.eb
@@ -2,7 +2,6 @@ easyblock = 'MesonNinja'
 
 name = 'HPCToolkit'
 version = '2025.0.1'
-versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://hpctoolkit.org/'
 description = """
@@ -42,7 +41,6 @@ builddependencies = [
 
 dependencies = [
     ('Boost', '1.88.0'),
-    ('CUDA', '12.9.1', '', SYSTEM),
     ('Dyninst', '13.0.0'),
     ('PAPI', '7.2.0'),
     ('Xerces-C++', '3.3.0'),
@@ -56,7 +54,7 @@ if ARCH == 'x86_64':
     dependencies.append(('intel-XED', '2025.06.08'))
 
 configopts = "-Dtests=enabled -Dmanpages=disabled -Dmanual=disabled -Dhpcprof_mpi=enabled "
-configopts += "-Dpapi=enabled -Dcuda=enabled -Dgtpin=disabled -Drocm=disabled -Dopencl=disabled "
+configopts += "-Dpapi=enabled -Dcuda=disabled -Dgtpin=disabled -Drocm=disabled -Dopencl=disabled "
 configopts += "-Dvalgrind_annotations=false -Dpython=disabled "
 
 # Explicitly ensure that no environment variables related to jobs are picked up, as this
@@ -73,12 +71,6 @@ local_env_to_unset = [
 pretestopts = ""
 for local_env in local_env_to_unset:
     pretestopts += f"unset {local_env} && "
-
-build_info_msg = """CUDA related tests can fail due to insufficient permissions for CUPTI PC Sampling.
-The final installation is still usable, but with reduced functionality.
-See: https://hpctoolkit.gitlab.io/hpctoolkit/2025.0.1/users/faq.html
-in section 'Ensuring permission to use GPU performance counters'
-Building with '--ignore-test-failure' still results in a working installation"""
 runtest = "meson test"
 
 local_preload_libs = [
@@ -89,10 +81,11 @@ sanity_check_paths = {
     'files':
         ['bin/hpcrun', 'include/hpctoolkit.h'] +
         ['lib/libhpcrun_preload_%s.%s' % (a, SHLIB_EXT) for a in local_preload_libs] +
-        [f'lib/libhpcrun.{SHLIB_EXT}', f'lib/libhpctoolkit.{SHLIB_EXT}', f'lib/libhpcrun_dlopen_nvidia.{SHLIB_EXT}'],
+        [f'lib/libhpcrun.{SHLIB_EXT}', f'lib/libhpctoolkit.{SHLIB_EXT}'],
     'dirs': [],
 }
 
+# hpcstructs version command exits with 1, so grep for the version instead
 sanity_check_commands = [
     'hpcrun --version',
     'hpcstruct --version | grep %(version)s',
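
The `pretestopts` loop that appears as context in both diffs builds a shell prefix that unsets job-related environment variables before the tests run. A minimal standalone sketch, assuming hypothetical variable names (the real contents of `local_env_to_unset` are not shown in the diff) and assuming the prefix is simply prepended to the test command:

```python
# Hypothetical entries: the actual list is elided in the diff context.
local_env_to_unset = ["SLURM_JOB_ID", "SLURM_NTASKS"]

# Same loop as in the easyconfig: chain one 'unset' per variable with '&&'.
pretestopts = ""
for local_env in local_env_to_unset:
    pretestopts += f"unset {local_env} && "

runtest = "meson test"

# Assumption: the easyblock runs the prefix and the test command as one shell line.
full_test_command = pretestopts + runtest
print(full_test_command)
```

Because each `unset` is joined with `&&`, the test command only starts once every variable has been cleared.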

Comment thread easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.0-gompi-2025b.eb Outdated
Thyre and others added 6 commits September 9, 2025 14:51
Fail tests due to incorrect values

Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
May pick up system OpenCL headers.

Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
@Thyre
Collaborator Author

Thyre commented Sep 9, 2025

Test report by @Thyre
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
ZAM054 - Linux Zorin OS 17, x86_64, 12th Gen Intel(R) Core(TM) i7-1260P, 1 x NVIDIA NVIDIA GeForce MX550, 580.65.06, Python 3.10.12
See https://gist.github.com/Thyre/4de99d300e14b77888b5cacbbc8cbf50 for a full test report.

@Thyre
Collaborator Author

Thyre commented Sep 9, 2025

hpctoolkit 2025.0.0

  Subprojects
    hpctesttool-wheels  : YES
    md5-g72cf2cd        : YES
    valgrind-headers    : YES

  User defined options
    b_ndebug            : true
    cuda                : enabled
    gtpin               : disabled
    hpcprof_mpi         : enabled
    libdir              : lib
    manpages            : disabled
    manual              : disabled
    opencl              : disabled
    optimization        : 2
    papi                : enabled
    prefix              : /opt/EasyBuild/apps/software/HPCToolkit/2025.0.0-gompi-2025b-CUDA-12.9.1
    python              : disabled
    rocm                : disabled
    tests               : enabled
    valgrind_annotations: false

All three subprojects listed above are part of the HPCToolkit sources, so fortunately the build shouldn't try to download anything.

@Thyre
Collaborator Author

Thyre commented Sep 9, 2025

Test report by @Thyre
FAILED
Build succeeded for 1 out of 3 (2 easyconfigs in total)
jrc0900.jureca - Linux Rocky Linux 9.6, AArch64, ARM UNKNOWN, 1 x NVIDIA NVIDIA GH200 480GB, 580.65.06, Python 3.9.21
See https://gist.github.com/Thyre/8f816e53070a7ffd6aac94a191ed93cb for a full test report.

Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
@Thyre
Collaborator Author

Thyre commented Sep 9, 2025

> Test report by @Thyre
> FAILED
> Build succeeded for 1 out of 3 (2 easyconfigs in total)
> jrc0900.jureca - Linux Rocky Linux 9.6, AArch64, ARM UNKNOWN, 1 x NVIDIA NVIDIA GH200 480GB, 580.65.06, Python 3.9.21
> See https://gist.github.com/Thyre/8f816e53070a7ffd6aac94a191ed93cb for a full test report.

Tests here failed because I was building on a compute node; HPCToolkit picked this up, and the check does not handle it correctly.
I also tried on a login node of JUPITER. There, tests failed because of insufficient permissions for PC sampling.
I would propose keeping the tests enabled, but adding a comment noting which failures one can expect to see.

@Thyre Thyre marked this pull request as ready for review September 9, 2025 14:35
@Thyre
Collaborator Author

Thyre commented Sep 9, 2025

@boegelbot please test @ jsc-zen3-a100

@boegelbot
Collaborator

@Thyre: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=23830 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_23830 --ntasks=8 --partition=jsczen3g --gres=gpu:1 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 7898

Test results coming soon (I hope)...

Details

- notification for comment with ID 3271028326 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@Thyre
Collaborator Author

Thyre commented Sep 9, 2025

HPCToolkit-2025.0.0-gompi-2025b.eb

Test report by @Thyre
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
jpbl-s01-04 - Linux RHEL 9.5, AArch64, ARM UNKNOWN (neoverse_v2), 1 x NVIDIA NVIDIA GH200 480GB, 570.133.20, Python 3.9.21
See https://gist.github.com/Thyre/acd8bb52177cbf4e51e17fc5ee2e6e54 for a full test report.

@Thyre
Collaborator Author

Thyre commented Sep 9, 2025

Test report by @Thyre
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
ZAM054 - Linux Zorin OS 17, x86_64, 12th Gen Intel(R) Core(TM) i7-1260P, 1 x NVIDIA NVIDIA GeForce MX550, 580.65.06, Python 3.10.12
See https://gist.github.com/Thyre/e21462e1cd3d43e946a8eb033cb00b7d for a full test report.

@Thyre
Collaborator Author

Thyre commented Sep 9, 2025

HPCToolkit-2025.0.0-gompi-2025b-CUDA-12.9.1.eb
Missing permissions for CUPTI PC Sampling

Test report by @Thyre
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
jpbl-s01-04 - Linux RHEL 9.5, AArch64, ARM UNKNOWN (neoverse_v2), 1 x NVIDIA NVIDIA GH200 480GB, 570.133.20, Python 3.9.21
See https://gist.github.com/Thyre/f5d7ebda39b0ea68f1502a94613a1fbf for a full test report.

Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
@boegelbot
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in total)
jsczen3g1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.6, x86_64, AMD EPYC-Milan Processor (zen3), 1 x NVIDIA NVIDIA A100 80GB PCIe, 575.57.08, Python 3.9.21
See https://gist.github.com/boegelbot/7e4342c56c227bf4df6e966833e9bbbb for a full test report.

@Thyre
Collaborator Author

Thyre commented Sep 9, 2025

> Test report by boegelbot
> FAILED
> Build succeeded for 0 out of 2 (2 easyconfigs in total)
> jsczen3g1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.6, x86_64, AMD EPYC-Milan Processor (zen3), 1 x NVIDIA NVIDIA A100 80GB PCIe, 575.57.08, Python 3.9.21
> See https://gist.github.com/boegelbot/7e4342c56c227bf4df6e966833e9bbbb for a full test report.

Yeah... that's HPCToolkit picking up that we're running on a compute node. The CUDA tests also fail with insufficient permissions.

Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
@Thyre Thyre changed the title {perf}[gompi/2025b] HPCToolkit v2025.0.0 w/ CUDA 12.9.1 {perf}[gompi/2025b] HPCToolkit v2025.0.1 w/ CUDA 12.9.1 Sep 10, 2025
@Thyre
Collaborator Author

Thyre commented Sep 10, 2025

Test report by @Thyre
FAILED
Build succeeded for 0 out of 2 (2 easyconfigs in total)
jrc0900.jureca - Linux Rocky Linux 9.6, AArch64, ARM UNKNOWN, 1 x NVIDIA NVIDIA GH200 480GB, 580.65.06, Python 3.9.21
See https://gist.github.com/Thyre/14366173be5b4a5eee08bf7a96c13d65 for a full test report.

stripped

See https://gitlab.com/hpctoolkit/hpctoolkit/-/merge_requests/1310 for
more information on why this is done.

Signed-off-by: Jan André Reuter <j.reuter@fz-juelich.de>
@Thyre
Collaborator Author

Thyre commented Sep 10, 2025

Test report by @Thyre
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
ZAM054 - Linux Zorin OS 17, x86_64, 12th Gen Intel(R) Core(TM) i7-1260P, 1 x NVIDIA NVIDIA GeForce MX550, 580.65.06, Python 3.10.12
See https://gist.github.com/Thyre/7bc822c132f5f3581fdfd88afa6a1c8f for a full test report.

@Thyre
Collaborator Author

Thyre commented Sep 10, 2025

Test report by @Thyre
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
jrc0900.jureca - Linux Rocky Linux 9.6, AArch64, ARM UNKNOWN, 1 x NVIDIA NVIDIA GH200 480GB, 580.65.06, Python 3.9.21
See https://gist.github.com/Thyre/ce73e0ce1c6578b00795ec7657a63d75 for a full test report.

@Thyre
Collaborator Author

Thyre commented Sep 10, 2025

@boegelbot please test @ jsc-zen3-a100

@boegelbot
Collaborator

@Thyre: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=23830 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_23830 --ntasks=8 --partition=jsczen3g --gres=gpu:1 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 7920

Test results coming soon (I hope)...

Details

- notification for comment with ID 3275022689 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 1 out of 2 (2 easyconfigs in total)
jsczen3g1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.6, x86_64, AMD EPYC-Milan Processor (zen3), 1 x NVIDIA NVIDIA A100 80GB PCIe, 575.57.08, Python 3.9.21
See https://gist.github.com/boegelbot/5c3002200a2328ef0f0b611f6cbd60fd for a full test report.

Comment thread easybuild/easyconfigs/h/HPCToolkit/HPCToolkit-2025.0.1-gompi-2025b-CUDA-12.9.1.eb Outdated
Co-authored-by: Simon Branford <4967+branfosj@users.noreply.github.com>
@branfosj
Member

Test report by @branfosj
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0105u03a - Linux RHEL 8.10, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/branfosj/8cd9c47c4504fdc60fcef782bda01190 for a full test report.

@Thyre
Collaborator Author

Thyre commented Sep 11, 2025

I created easybuilders/easybuild-framework#4996 to keep track of how we could improve the rpath sanity check.

@branfosj
Member

Test report by @branfosj
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
bear-pg0208u23a - Linux RHEL 8.10, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), 1 x NVIDIA NVIDIA A100-SXM4-40GB, 560.35.05, Python 3.6.8
See https://gist.github.com/branfosj/0ff607d4f7e35428e8d00ee6aefbf118 for a full test report.

@branfosj
Member

branfosj commented Sep 11, 2025

Recording the test failures when performance counters are not available here, so that others can see them without having to search the logs.

jsc-zen3 has 6 failures:

  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC sampling produces profiles (gpu=nvidia)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC + tracing produces profiles (gpu=nvidia)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC + boosted tracing produces profiles (gpu=nvidia)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC sampling produces profiles (gpu=cuda)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC + tracing produces profiles (gpu=cuda)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC + boosted tracing produces profiles (gpu=cuda)

I have 12 - the above 6 and also:

  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ tracing produces profiles (gpu=cuda)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ boosted tracing produces profiles (gpu=cuda)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ boosted tracing produces profiles (gpu=nvidia)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda produces profiles (gpu=nvidia)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda produces profiles (gpu=cuda)
  • hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ tracing produces profiles (gpu=nvidia)

The ordering of the tests changes, but these pass on jsc-zen3 (tests 57-63). Not sure what difference causes the extra 6 to fail. I know my system has the relevant performance counters disabled.
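
The two lists above split cleanly by whether the test variant uses PC sampling. A small sketch that rebuilds the 12 reported test names and partitions them; the helper names `make_name` and `needs_pc_sampling` are made up for illustration, and the substring check is only a heuristic based on the names as reported here.

```python
PREFIX = "hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda"

# Variants copied from the two lists above; the empty string is the plain run.
variants = [
    "w/ PC sampling",
    "w/ PC + tracing",
    "w/ PC + boosted tracing",
    "w/ tracing",
    "w/ boosted tracing",
    "",
]
gpus = ["nvidia", "cuda"]


def make_name(variant, gpu):
    """Rebuild a test name in the format shown in the test report."""
    middle = f" {variant}" if variant else ""
    return f"{PREFIX}{middle} produces profiles (gpu={gpu})"


def needs_pc_sampling(name):
    """Heuristic: PC sampling variants contain ' w/ PC ' in their name."""
    return " w/ PC " in name


failed = [make_name(v, g) for v in variants for g in gpus]
pc_sampling = [n for n in failed if needs_pc_sampling(n)]
other = [n for n in failed if not needs_pc_sampling(n)]

print(len(pc_sampling), "PC sampling failures (seen on both systems)")
print(len(other), "additional failures (not seen on jsc-zen3)")
```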

@Thyre
Collaborator Author

Thyre commented Sep 11, 2025

> Recording the test failures when performance counters are not available here, so that others can see them without having to search the logs.
>
> jsc-zen3 has 6 failures:
>
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC sampling produces profiles (gpu=nvidia)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC + tracing produces profiles (gpu=nvidia)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC + boosted tracing produces profiles (gpu=nvidia)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC sampling produces profiles (gpu=cuda)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC + tracing produces profiles (gpu=cuda)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ PC + boosted tracing produces profiles (gpu=cuda)`
>
> I have 12 - the above 6 and also:
>
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ tracing produces profiles (gpu=cuda)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ boosted tracing produces profiles (gpu=cuda)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ boosted tracing produces profiles (gpu=nvidia)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda produces profiles (gpu=nvidia)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda produces profiles (gpu=cuda)`
> * `hpctoolkit:hpcrun+cuda / Measurement of tstexe-vecadd-cuda w/ tracing produces profiles (gpu=nvidia)`
>
> The ordering of the tests changes, but these pass on jsc-zen3 (tests 57-63). Not sure what difference causes the extra 6 to fail. I know my system has the relevant performance counters disabled.

The other 6 are very likely caused by the CUDA driver. We also see that with 570 on JUPITER, but don't see it on our other GH200 nodes with 580.

@branfosj
Member

> The other 6 are very likely caused by the CUDA driver. We also see that with 570 on JUPITER, but don't see it on our other GH200 nodes with 580.

We have 560.35.05 so that fits the pattern.
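
To make the pattern explicit, here is a rough sketch that tabulates the driver versions mentioned in this thread against whether the 6 extra (non-PC-sampling) failures were seen. It assumes the driver really is the cause, which the thread suggests but does not prove, and the data structure is purely illustrative.

```python
# (driver version as reported in this thread, extra failures observed?)
observations = [
    ("560.35.05", True),   # branfosj's A100 node: 12 failures
    ("570", True),         # JUPITER, as reported by Thyre
    ("575.57.08", False),  # jsc-zen3: only the 6 PC sampling failures
    ("580", False),        # other GH200 nodes, as reported by Thyre
]


def major(driver_version):
    """Major driver version, e.g. '560.35.05' -> 560."""
    return int(driver_version.split(".")[0])


newest_with_extra = max(major(v) for v, extra in observations if extra)
oldest_without_extra = min(major(v) for v, extra in observations if not extra)

print(f"extra failures seen up to driver {newest_with_extra}, "
      f"not seen from driver {oldest_without_extra}")
```

With only four data points this is a consistency check on the observations, not a confirmed threshold.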

@Micket
Contributor

Micket commented Sep 15, 2025

Test report by @Micket
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
vera-r07-03 - Linux Rocky Linux 9.6, x86_64, Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz, 4 x NVIDIA NVIDIA A40, 580.65.06, Python 3.9.21
See https://gist.github.com/Micket/a6b70f0150495b9f148cf4d8c327afb6 for a full test report.

@Micket
Contributor

Micket commented Sep 15, 2025

Test report by @Micket
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
vera-r07-03 - Linux Rocky Linux 9.6, x86_64, Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz, 4 x NVIDIA NVIDIA A40, 580.65.06, Python 3.9.21
See https://gist.github.com/Micket/66c37f3c04d631ad3ac1aef3438be6a2 for a full test report.

@Micket
Contributor

Micket commented Sep 15, 2025

I'd like to get this merged.
While we could spend time writing a custom easyblock to exclude the tests that fail, maybe they should just fail? That would tell people that, in order to fully use this software, they should update their drivers or consider enabling performance counters. If not, they can always opt to accept the limitations and ignore the failing tests.

Does anyone strongly disagree?

@Thyre
Collaborator Author

Thyre commented Sep 15, 2025

> I'd like to get this merged. While we could spend time writing a custom easyblock to exclude the tests that fail, maybe they should just fail? That would tell people that, in order to fully use this software, they should update their drivers or consider enabling performance counters. If not, they can always opt to accept the limitations and ignore the failing tests.
>
> Does anyone strongly disagree?

I think the build message that now appears gives users enough information to either work around the problem or install with --ignore-test-failure, knowing that some features might not work fully. We can certainly clean this up a bit more, but the EasyConfig is still readable enough that I'd say we can merge this as an EasyConfig only for now.
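
As a rough illustration of the second option, the sketch below assembles an `eb` command line with and without `--ignore-test-failure`. The helper `eb_command` is hypothetical, and the use of `--robot` for dependency resolution is an assumption about how a site would invoke the build; the snippet only prints the command and does not run EasyBuild.

```python
import shlex

easyconfig = "HPCToolkit-2025.0.1-gompi-2025b-CUDA-12.9.1.eb"


def eb_command(easyconfig, ignore_test_failure=False):
    """Build the argument list for an 'eb' invocation (illustrative helper)."""
    cmd = ["eb", easyconfig, "--robot"]
    if ignore_test_failure:
        # Accept failing tests, e.g. when CUPTI PC sampling is not permitted.
        cmd.append("--ignore-test-failure")
    return cmd


print(shlex.join(eb_command(easyconfig)))
print(shlex.join(eb_command(easyconfig, ignore_test_failure=True)))
```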

@Micket Micket added this to the release after 5.1.2 (5.2.0?) milestone Sep 16, 2025
Contributor

@Micket Micket left a comment


lgtm

@Micket
Contributor

Micket commented Sep 16, 2025

Going in, thanks @Thyre!

@Micket Micket merged commit 9c082a7 into easybuilders:develop Sep 16, 2025
8 checks passed

Labels

2025b (issues & PRs related to 2025b common toolchains), update


6 participants