Fix env fail timeout #78

Open
itkovian wants to merge 50 commits into hpcugent:24.11.ug from itkovian:fix-env-fail-timeout

@itkovian
Member

  • raise timeout to 600s
  • bump to 24.11.7

agilmor and others added 30 commits August 21, 2025 14:53
This is a continuation of commit a2cb1bd.

Ticket: 23495
Cherry-picked: 965bba8
See merge request SchedMD/dev/slurm!1943
Issue: 50186
Issue: 50420
Cherry-picked: d492dcc
See merge request SchedMD/dev/slurm!1947
This is a continuation of commits a2cb1bd and 965bba8.

Ticket: 23495
Cherry-picked: 0b518b7
See merge request SchedMD/dev/slurm!1962
This should be useful to check version that C code will link against.

Issue: 50420
Cherry-picked: a1d164e
This is a continuation of commit d492dcc.

Issue: 50420
Cherry-picked: fefa264
See merge request SchedMD/dev/slurm!1966
Ticket: 23355
Issue: 50637
Cherry-picked: 3bcba22
See merge request SchedMD/dev/slurm!1973
Sometimes squeue may print the components of a het job in a different
order, so we should not rely on that order.

Ticket: 14554
Cherry-picked: 1eb4b81
See merge request SchedMD/dev/slurm!1983
The HetJobId is not present in all jobs, so when the test is
run after other tests we may have jobs in the system without it.

Ticket: 14554
Cherry-picked: c1d5c90
See merge request SchedMD/dev/slurm!1994
See merge request SchedMD/dev/slurm!2008
srun jobs would occasionally surpass the 60 second timeout defined in
test_srun_ports_in_range, causing the following tests to fail due to
nodes still being in use. Extending the timeout to 120 seconds should
prevent the test from failing.

We are also updating the number of configured ports to improve the
test log messages.

Issue: 50548
Ticket: 19089
Cherry-picked: 7590dc6
Documentation was updated in commit fde3dfb.

Ticket: 17619
Cherry-picked: 13d46af
See merge request SchedMD/dev/slurm!2011
The issue was fixed in 24.11.1.

Ticket: 10366
Cherry-picked: e81159e
This is a continuation of af04e3e.

Ticket: 23231
Cherry-picked: 03ca47e
See merge request SchedMD/dev/slurm!2036
See merge request SchedMD/dev/slurm!2021
See merge request SchedMD/dev/slurm!2041
agilmor and others added 20 commits September 4, 2025 08:04
See merge request SchedMD/dev/slurm!2059
See merge request SchedMD/dev/slurm!2073
See merge request SchedMD/dev/slurm!2081
See merge request SchedMD/dev/slurm!2097
See merge request SchedMD/dev/slurm!2124
Previously, when a host range was deleted, hostlist_shift_iterators() would
set i->idx to the host range before the deleted one and would not reset
i->depth to -1. For example, given the hostlist "a[1-2],b1,c1", if
hostlist_remove() was called while the hostlist iterator was on b1, the
next call to hostlist_next() would return a2 when it should return c1.

With this change, if the host range that i->idx referenced is deleted,
i->idx is set to the index of the host range after the deleted one, and
i->depth is reset to -1.

This issue could cause memory corruption in _get_alias_addrs() when writing
to forward->alias_addrs.node_addrs[addr_index] because the hostlist
iterator loop could repeat host ranges leading to addr_index going out of
bounds.

Ticket: 23676
Co-authored-by: Megan Dahl <megan@schedmd.com>
Changelog: Prevent potential memory corruption while forwarding messages
 that require addresses to be packed.
Cherry-picked: c2c01c5
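
The "a[1-2],b1,c1" example from this commit message can be reproduced with a small model. The sketch below is not Slurm's C code: the class and function names, the tuple layout, and the `fixed` flag are all assumptions made for illustration. It only models an iterator that tracks a range index (`idx`) and an offset within that range (`depth`), and it only covers removing a range that is not the first one.

```python
# Simplified model of a hostlist iterator; names are hypothetical and do
# not mirror Slurm's actual data structures or function signatures.

class HostIterator:
    def __init__(self, ranges):
        self.ranges = ranges  # list of (prefix, lo, hi) tuples
        self.idx = 0          # index of the current host range
        self.depth = -1       # offset inside the range, -1 = not started

    def next(self):
        if self.idx >= len(self.ranges):
            return None
        self.depth += 1
        prefix, lo, hi = self.ranges[self.idx]
        if self.depth > hi - lo:
            # Current range is exhausted, move on to the next one.
            self.depth = 0
            self.idx += 1
            if self.idx >= len(self.ranges):
                return None
            prefix, lo, hi = self.ranges[self.idx]
        return f"{prefix}{lo + self.depth}"


def remove_range(it, n, fixed):
    """Delete range n and shift the iterator (old or fixed behaviour)."""
    del it.ranges[n]
    if it.idx == n:
        if fixed:
            # idx now references the range after the deleted one.
            it.depth = -1
        else:
            # Old behaviour: step back one range and keep the stale depth.
            it.idx = n - 1
    elif it.idx > n:
        it.idx -= 1


def next_after_removing_b1(fixed):
    it = HostIterator([("a", 1, 2), ("b", 1, 1), ("c", 1, 1)])
    seen = [it.next(), it.next(), it.next()]  # a1, a2, b1
    assert seen == ["a1", "a2", "b1"]
    remove_range(it, 1, fixed)                # remove b1 while on b1
    return it.next()


print(next_after_removing_b1(fixed=False))  # a2 (host repeated)
print(next_after_removing_b1(fixed=True))   # c1
```

In this model the old shifting logic revisits a2, which matches the repeated host ranges that the commit message ties to addr_index going out of bounds.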
Test iterating through hostlist iterators to verify every host is seen and
not repeated. This also tests iterating while deleting hosts from the list.

Patch includes run of 'autoreconf -i'.

Ticket: 23676
Cherry-picked: 8319188
See merge request SchedMD/dev/slurm!2339
Ticket: 24012
Changelog: slurmctld - Prevent a fatal when min_exempt_priority is not
 the last option listed in PreemptParameters.
Cherry-picked: 4db2cf6
See merge request SchedMD/dev/slurm!2531
Update slurm.spec and debian/changelog as well.