{lib}[GCCcore/10.2.0] OpenMPI v4.0.5, libevent v2.1.12, libfabric v1.11.0, PMIx 3.1.5 #11333
|
@boegelbot please test @ generoso |
|
@boegel: Request for testing this PR well received on generoso PR test command '
Test results coming soon (I hope)... Details- notification for comment with ID 697295413 processed Message to humans: this is just bookkeeping information for me, |
|
Test report by @boegelbot |
|
Test report by @boegel |
|
Test report by @boegel |
|
Test report by @boegel |
…asyconfigs into 20200923113619_new_pr_OpenMPI405
|
Test report by @lexming |
|
Test report by @lexming |
lexming left a comment
This OpenMPI is not working well on my side. A simple MPI hello world program fails to initialise OpenFabrics:
$ mpirun ./test
[node379.hydra.os:24944] [[51950,0],0] ORTE_ERROR_LOG: Out of resource in file util/show_help.c at line 501
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: node378
Local device: mlx5_0
--------------------------------------------------------------------------
Hello world from processor node379.hydra.os, rank 0 out of 2 processors
Hello world from processor node378.hydra.os, rank 1 out of 2 processors
OSU-Micro-Benchmarks has the same issue:
# OSU MPI Latency Test v5.6.3
# Size Latency (us)
1024 2.08
2048 2.83
4096 3.72
8192 5.46
16384 7.56
32768 9.83
65536 14.34
131072 22.34
262144 32.28
524288 54.46
1048576 97.91
2097152 181.33
4194304 354.37
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: node378
Local device: mlx5_0
--------------------------------------------------------------------------
[node379.hydra.os:15539] [[38701,0],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/show_help.c at line 501
The execution completes in both cases, but those errors are not good.
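Since both runs exit successfully, a check on the exit status alone would miss these messages. A minimal sketch, assuming Python, of how captured mpirun output could be scanned for the two diagnostics quoted above. The function name `find_mpi_warnings` and the pattern list are illustrative only; they are not part of EasyBuild or Open MPI, and the patterns cover just the messages seen in this report.

```python
import re

# Patterns taken from the output quoted above; this list is illustrative,
# not an exhaustive catalogue of Open MPI diagnostics.
WARNING_PATTERNS = [
    re.compile(r"ORTE_ERROR_LOG: (?P<detail>.+)"),
    re.compile(
        r"WARNING: (?P<detail>There was an error initializing an OpenFabrics device\.)"
    ),
]


def find_mpi_warnings(output):
    """Return the diagnostic messages found in captured mpirun output."""
    found = []
    for line in output.splitlines():
        for pattern in WARNING_PATTERNS:
            match = pattern.search(line)
            if match:
                found.append(match.group("detail").strip())
                break  # at most one diagnostic per line
    return found


# Sample input copied from the hello world run reported above.
sample = """\
[node379.hydra.os:24944] [[51950,0],0] ORTE_ERROR_LOG: Out of resource in file util/show_help.c at line 501
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: node378
Local device: mlx5_0
--------------------------------------------------------------------------
Hello world from processor node379.hydra.os, rank 0 out of 2 processors
Hello world from processor node378.hydra.os, rank 1 out of 2 processors
"""

warnings = find_mpi_warnings(sample)
for message in warnings:
    print(message)
```

A test harness could treat a non-empty result as a failure even when the MPI program itself completes.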
|
Started a test build on a "clean" arm box. It'll take a bit. It started building M4... The box has no toolchains. :) |
|
Test report by @terjekv |
The problem here is that we should be configuring OpenMPI with Please try again with the updated OpenMPI easyblock from easybuilders/easybuild-easyblocks#2188 . |
|
Test report by @lexming |
|
Test report by @lexming |
|
@boegel thanks a lot, that was indeed the issue. We have already been disabling verbs in our production system, but I was totally misled by the |
|
@lexming So let's merge? Or do you want to see more tests? |
|
Test report by @boegel |
|
Going in, thanks @boegel ! |
(created using eb --new-pr)
requires easybuilders/easybuild-easyblocks#2184 + #11320 (UCX) + #11332 (hwloc)