When a cluster uses OpenZFS for the $HOME of users, the SSH keys should be stored on that directory and not Lustre.
The `lifecycle_script.py` file always creates the SSH key pair under the Lustre file system and only creates a symbolic link to OpenZFS. The problem with this approach is that ZFS is the actual $HOME of the users, not Lustre. If you delete the cluster but keep the $HOME directories, those keys are deleted with Lustre. If you want to give users access to two different clusters while keeping the same $HOME, and then you want to delete the first cluster created, you can't, because the keys are stored on the original Lustre.
The behaviour should be:
- If there is no ZFS, create the `.ssh` directory natively on Lustre and generate the key pair there.
- If there is a ZFS mounted as the $HOME directory, create the `.ssh` directory natively on ZFS, not on Lustre.
- Since ZFS is mounted on all compute nodes, Slurm doesn't also need the keys on Lustre to be able to SSH into the compute nodes.
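The decision above could be sketched as follows. This is a hypothetical helper, not the actual `lifecycle_script.py` code: the function names, the `/fsx/<user>` Lustre layout, and the assumption that FSx for OpenZFS shows up as an `nfs4` mount are all illustrative.

```python
def fs_type_of(mounts: str, path: str) -> str:
    """Return the fstype of the longest mount point containing `path`,
    given /proc/mounts-style text (hypothetical helper)."""
    best, best_fs = "", "unknown"
    for line in mounts.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        mnt, fstype = parts[1], parts[2]
        if (path == mnt or path.startswith(mnt.rstrip("/") + "/")) and len(mnt) > len(best):
            best, best_fs = mnt, fstype
    return best_fs

def ssh_dir_for(mounts: str, user: str) -> str:
    """Pick where .ssh should live: natively on the user's real $HOME if it
    is an OpenZFS (NFS) mount, otherwise on Lustre (assumed /fsx layout)."""
    home = f"/home/{user}"
    if fs_type_of(mounts, home) == "nfs4":  # FSx for OpenZFS mounts via NFS
        return f"{home}/.ssh"
    return f"/fsx/{user}/.ssh"  # no ZFS: keys go natively on Lustre
```

Either way the keys are created in place, with no symlink from Lustre into ZFS, so deleting one cluster's Lustre file system never removes keys that live in the users' real $HOME.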
The issue can be found here.