disk-deactivate: wipe raid arrays before stopping them #1248
kmein wants to merge 2 commits into nix-community:master
Conversation
Previously, the `disk-deactivate` script stopped mdadm arrays before wiping the underlying physical disks. This caused filesystems with backup superblocks (like BTRFS at 64 MiB and 256 GiB offsets) to survive the wipe, as the backups were striped across the physical disks and missed by `wipefs` on the raw block devices. When the array was reassembled during reprovisioning, the filesystem superblocks realigned. `disko` would detect the old filesystem via `blkid` and silently skip the `mkfs` step, leaving stale data intact. This commit adds a `wipefs --all` command against the assembled RAID device *before* stopping it, ensuring all filesystem signatures and backup superblocks are cleanly destroyed.
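The corrected ordering can be sketched as a jq filter over `lsblk --json` output, in the same spirit as the commands `disk-deactivate` generates. The sample JSON and the device `/dev/md127` are illustrative, and disko's `shellquote` jq helper is omitted for simplicity:

```shell
# Illustrative sketch: emit teardown commands for RAID devices, wiping the
# assembled array *before* stopping it. The JSON below stands in for real
# `lsblk --json -o NAME,PATH,TYPE` output; disko's shellquote helper is omitted.
lsblk_json='{"blockdevices":[{"name":"md127","path":"/dev/md127","type":"raid1"}]}'
echo "$lsblk_json" | jq -r '
  .blockdevices[]
  | if (.type | contains("raid")) then
      "wipefs --all -f \(.path)",
      "mdadm --stop \(.path)"
    else empty end'
# Prints:
#   wipefs --all -f /dev/md127
#   mdadm --stop /dev/md127
```

Because `wipefs` runs against the assembled array, it sees the filesystem as the kernel does, so backup superblocks striped across member disks are reachable and destroyed.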
Pull request overview
This PR addresses #1247 by ensuring disk-deactivate wipes filesystem/partition signatures on active mdadm RAID devices before stopping the array, preventing stale on-array signatures (e.g., btrfs backup superblocks) from surviving teardown and causing later reprovisioning to skip mkfs.
Changes:
- For `lsblk` devices with types containing `"raid"`, run `wipefs --all` on the RAID device before `mdadm --stop`.
```
elif (.type | contains("raid")) then
  [
    "wipefs --all -f \(.path | shellquote)",
    "mdadm --stop \(.name | shellquote)"
```
`mdadm --stop` is invoked with `.name`, but lsblk's `name` field is typically not an absolute device path. This can make the stop command fail (and it also differs from other places in the repo that stop arrays via `/dev/md/<name>`). Prefer stopping the array via `.path` (or otherwise ensure an absolute `/dev/...` path is passed).
```diff
-    "mdadm --stop \(.name | shellquote)"
+    "mdadm --stop \(.path | shellquote)"
```
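The distinction the reviewer is pointing at can be shown directly: lsblk's `name` is the bare kernel name, while `path` is the absolute device node. The sample JSON here is illustrative, standing in for real `lsblk --json -o NAME,PATH,TYPE` output:

```shell
# Illustrative: lsblk's "name" lacks the /dev/ prefix; "path" is absolute,
# so only "path" is safe to pass to mdadm --stop from any working directory.
sample='{"blockdevices":[{"name":"md127","path":"/dev/md127","type":"raid1"}]}'
echo "$sample" | jq -r '.blockdevices[].name'   # prints: md127
echo "$sample" | jq -r '.blockdevices[].path'   # prints: /dev/md127
```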
This change fixes a subtle teardown ordering bug (wipe md device signatures before stopping the array), but there is no automated regression test ensuring `destroyFormatMount` actually forces a re-`mkfs` for filesystems on mdadm arrays (especially btrfs with backup superblocks). Consider adding a NixOS test that provisions btrfs-on-mdadm, runs destroy+recreate, and asserts the old btrfs signature/data is gone (i.e., `mkfs` is not skipped).
I've added a NixOS test that ensures the expected behaviour. It does not use `diskoLib.testLib.makeDiskoTest` because that function does not test the unhappy path of files being left over from previous iterations.
Adds a NixOS integration test to verify that the `disk-deactivate` script properly destroys BTRFS filesystems residing on mdadm RAID arrays.
Closes #1247
How to test

```shell
cd "$(mktemp -d)"
wget https://gist.githubusercontent.com/kmein/9644e65ab8218a4b57c35cb9290b86ad/raw/3ec76f6a142abb1f4ea3764563a8843984b300d4/flake.nix
nix build
# output: Exception: The canary file survived the Disko wipe process!
sed -i 's#github:nix-community/disko#github:kmein/disko?ref=fix/mdadm-destroy#' flake.nix
nix build
```