Skip to content

Crash under IO load when using both encryption and deduplication #12931

@gpakosz

Description

@gpakosz

System information

Type Version/Name
Distribution Name Debian
Distribution Version Buster
Kernel Version 4.19.0-18-amd64 and 5.10.0-0.bpo.9-amd64
Architecture 64bit
OpenZFS Version 0.8.5-3~bpo10+1 and 2.0.3-9~bpo10+1

Describe the problem you're observing

Crash under IO load with a dataset for which both encryption and deduplication are used.

Describe how to reproduce the problem

We don't have a simple repro (yet).

  • We start with the empty /somewhere/data dataset for which both deduplication and encryption is enabled
  • We populate /somewhere/data/_files with various data files (JSON, images and proprietary binary files)
    • There are 16427 files in there
  • To engage a bit of deduplication we cp -pR /somewhere/data/_files /somewhere/data/_files-copy
  • Then we launch a Python multiprocess tool that does the following for each worker:
    • Create a temporary directory within /somewhere/data/_tmp with tempfile.TemporaryDirectory
    • Copy files from /somewhere/data/_files to /somewhere/data/_tmp/in
      • Files in /somewhere/data/_tmp/in are in various sub-folders
    • Create the /somewhere/data/_tmp/intermediate1 and /somewhere/data/_tmp/intermediate2 directories
    • Spawn another tool programmed in C that does the following:
      • Copy some files from /somewhere/data/_tmp/in to /somewhere/data/_tmp/intermediate1
      • Create some files in /somewhere/data/_tmp/in and /somewhere/data/_tmp/intermediate2
      • Create files in /somewhere/data/_files /somewhere/data/_out

That Python multiprocess tool doesn't run long before ZFS crashes. There are at most 12 workers and we managed to crash the filesystem with 4 workers.

We ran that second native tool through valgrind and found no memory bug. We doubt a user land program can crash ZFS anyway.

When the crash happens,

  • The DDT is using about 10GB of memory (in ARC)
  • The ARC cache is about 100GB (its optimal size). According to our graphs
  • Wwe are not running out of memory (192GB in total)

We're able to reproduce the crash on a fresh and smaller pool on the same machine with both kernel versions.

Kernel version ZFS version Configuration Result
4.19.0-18-amd64 0.8.5-3~bpo10+1 encryption on / deduplication on 💥
4.19.0-18-amd64 2.0.3-9~bpo10+1 encryption on / deduplication on 💥
4.19.0-18-amd64 2.0.3-9~bpo10+1 encryption off / deduplication off
4.19.0-18-amd64 2.0.3-9~bpo10+1 encryption on / deduplication off
4.19.0-18-amd64 2.0.3-9~bpo10+1 encryption off / deduplication on
4.19.0-18-amd64 2.0.3-9~bpo10+1 encryption on / deduplication on 💥
5.10.0-0.bpo.9-amd64 2.0.3-9~bpo10+1 encryption on / deduplication on 💥
5.10.0-0.bpo.9-amd64 2.0.3-9~bpo10+1 encryption on / deduplication off

Include any warning/errors/backtraces from the system logs

Kernel 4.19.0-18-amd64:

# dmesg -T
...
[Fri Dec 31 12:57:55 2021] BUG: unable to handle kernel paging request at ffffae7de8a00000
[Fri Dec 31 12:57:55 2021] PGD 2fbf529067 P4D 2fbf529067 PUD 2f77a0a067 PMD 2e94fa6067 PTE 0
[Fri Dec 31 12:57:55 2021] Oops: 0000 [#1] SMP PTI
[Fri Dec 31 12:57:55 2021] CPU: 3 PID: 1195 Comm: z_wr_iss Tainted: P           OE     4.19.0-18-amd64 #1 Debian 4.19.208-1
[Fri Dec 31 12:57:55 2021] Hardware name: Dell Inc. PowerEdge T630/0W9WXC, BIOS 2.11.0 12/23/2019
[Fri Dec 31 12:57:55 2021] RIP: 0010:SHA512TransformBlocks+0x83/0x1340 [icp]
[Fri Dec 31 12:57:55 2021] Code: 48 8b 4f 10 48 8b 57 18 4c 8b 47 20 4c 8b 4f 28 4c 8b 57 30 4c 8b 5f 38 eb 0d 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 31 ff <4c> 8b 26 49 0f cc 4d 89 c5 4d 89 c6 4d 89 cf 49
 c1 cd 0e 49 c1 ce
[Fri Dec 31 12:57:55 2021] RSP: 0018:ffffae7da03e3680 EFLAGS: 00010246
[Fri Dec 31 12:57:55 2021] RAX: c31b301a02095284 RBX: c2c0fb90c3a277f3 RCX: 305038bc0a334cbd
[Fri Dec 31 12:57:55 2021] RDX: 14742eeb7f10abd8 RSI: ffffae7de8a00000 RDI: 0000000000000000
[Fri Dec 31 12:57:55 2021] RBP: ffffffffc0d51140 R08: b4b1ad3f75e8aa89 R09: 7ec5620da1b0de77
[Fri Dec 31 12:57:55 2021] R10: 59f9a675d9df9382 R11: 864d3213f20a28e4 R12: 8fa555c4bb64780e
[Fri Dec 31 12:57:55 2021] R13: de32ad31e7eb88a1 R14: 37d450af0dd6dceb R15: 01045023014618c3
[Fri Dec 31 12:57:55 2021] FS:  0000000000000000(0000) GS:ffff9c2dbf8c0000(0000) knlGS:0000000000000000
[Fri Dec 31 12:57:55 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Dec 31 12:57:55 2021] CR2: ffffae7de8a00000 CR3: 0000001ccea0a002 CR4: 00000000001606e0
[Fri Dec 31 12:57:55 2021] Call Trace:
[Fri Dec 31 12:57:55 2021]  ? swiotlb_map_sg_attrs+0x4c/0x120
[Fri Dec 31 12:57:55 2021]  ? SHA2Update+0x19f/0x1e0 [icp]
[Fri Dec 31 12:57:55 2021]  ? sha2_mac_atomic+0x213/0x320 [icp]
[Fri Dec 31 12:57:55 2021]  ? blk_mq_sched_dispatch_requests+0x11e/0x170
[Fri Dec 31 12:57:55 2021]  ? _cond_resched+0x15/0x30
[Fri Dec 31 12:57:55 2021]  ? mutex_lock+0xe/0x30
[Fri Dec 31 12:57:55 2021]  ? crypto_mac+0x292/0x390 [icp]
[Fri Dec 31 12:57:55 2021]  ? i_mod_hash_find_nosync+0x43/0x90 [icp]
[Fri Dec 31 12:57:55 2021]  ? mod_hash_idkey_cmp+0x10/0x10 [icp]
[Fri Dec 31 12:57:55 2021]  ? i_mod_hash_find_nosync+0x62/0x90 [icp]
[Fri Dec 31 12:57:55 2021]  ? zio_crypt_do_hmac+0xc9/0x150 [zfs]
[Fri Dec 31 12:57:55 2021]  ? spa_crypto_key_compare+0x30/0x30 [zfs]
[Fri Dec 31 12:57:55 2021]  ? avl_find+0x58/0x90 [zavl]
[Fri Dec 31 12:57:55 2021]  ? zio_crypt_generate_iv_salt_dedup+0x2f/0x70 [zfs]
[Fri Dec 31 12:57:55 2021]  ? load_balance+0x193/0xa00
[Fri Dec 31 12:57:55 2021]  ? abd_borrow_buf+0x1f/0x40 [zfs]
[Fri Dec 31 12:57:55 2021]  ? abd_borrow_buf_copy+0x28/0x70 [zfs]
[Fri Dec 31 12:57:55 2021]  ? spa_do_crypt_abd+0x118/0x370 [zfs]
[Fri Dec 31 12:57:55 2021]  ? zio_encrypt+0x437/0x5f0 [zfs]
[Fri Dec 31 12:57:55 2021]  ? zio_write_compress+0x42/0x780 [zfs]
[Fri Dec 31 12:57:55 2021]  ? __schedule+0x2a7/0x840
[Fri Dec 31 12:57:55 2021]  ? __wake_up_common_lock+0x89/0xc0
[Fri Dec 31 12:57:55 2021]  ? zio_execute+0x90/0xf0 [zfs]
[Fri Dec 31 12:57:55 2021]  ? taskq_thread+0x2e7/0x530 [spl]
[Fri Dec 31 12:57:55 2021]  ? wake_up_q+0x70/0x70
[Fri Dec 31 12:57:55 2021]  ? zio_taskq_member.isra.10.constprop.16+0x60/0x60 [zfs]
[Fri Dec 31 12:57:55 2021]  ? taskq_thread_spawn+0x50/0x50 [spl]
[Fri Dec 31 12:57:55 2021]  ? kthread+0x112/0x130
[Fri Dec 31 12:57:55 2021]  ? kthread_bind+0x30/0x30
[Fri Dec 31 12:57:55 2021]  ? ret_from_fork+0x35/0x40
[Fri Dec 31 12:57:55 2021] Modules linked in: arc4 md4 sha512_ssse3 sha512_generic cmac nls_utf8 cifs ccm dns_resolver fscache dell_rbu binfmt_misc nls_ascii nls_cp437 vfat fat xfs libcrc32c intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass mxm_wmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel dcdbas intel_cstate mgag200 intel_uncore ttm efi_pstore intel_rapl_perf efivars pcspkr drm_kms_helper drm iTCO_wdt evdev joydev iTCO_vendor_support mei_me mei zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) ipmi_ssif zcommon(POE) znvpair(POE) spl(OE) sg ipmi_si ipmi_devintf ipmi_msghandler wmi pcc_cpufreq acpi_power_meter button efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb dm_mod hid_generic usbhid hid sd_mod crc32c_intel xhci_pci
[Fri Dec 31 12:57:55 2021]  ahci xhci_hcd libahci aesni_intel ehci_pci libata ehci_hcd aes_x86_64 crypto_simd cryptd megaraid_sas glue_helper usbcore igb scsi_mod lpc_ich mfd_core i2c_algo_bit dca usb_common
[Fri Dec 31 12:57:55 2021] CR2: ffffae7de8a00000
[Fri Dec 31 12:57:55 2021] ---[ end trace 26c4fbf30e1b8ea1 ]---
[Fri Dec 31 12:57:55 2021] RIP: 0010:SHA512TransformBlocks+0x83/0x1340 [icp]
[Fri Dec 31 12:57:55 2021] Code: 48 8b 4f 10 48 8b 57 18 4c 8b 47 20 4c 8b 4f 28 4c 8b 57 30 4c 8b 5f 38 eb 0d 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 31 ff <4c> 8b 26 49 0f cc 4d 89 c5 4d 89 c6 4d 89 cf 49 c1 cd 0e 49 c1 ce
[Fri Dec 31 12:57:55 2021] RSP: 0018:ffffae7da03e3680 EFLAGS: 00010246
[Fri Dec 31 12:57:55 2021] RAX: c31b301a02095284 RBX: c2c0fb90c3a277f3 RCX: 305038bc0a334cbd
[Fri Dec 31 12:57:55 2021] RDX: 14742eeb7f10abd8 RSI: ffffae7de8a00000 RDI: 0000000000000000
[Fri Dec 31 12:57:55 2021] RBP: ffffffffc0d51140 R08: b4b1ad3f75e8aa89 R09: 7ec5620da1b0de77
[Fri Dec 31 12:57:55 2021] R10: 59f9a675d9df9382 R11: 864d3213f20a28e4 R12: 8fa555c4bb64780e
[Fri Dec 31 12:57:55 2021] R13: de32ad31e7eb88a1 R14: 37d450af0dd6dceb R15: 01045023014618c3
[Fri Dec 31 12:57:55 2021] FS:  0000000000000000(0000) GS:ffff9c2dbf8c0000(0000) knlGS:0000000000000000
[Fri Dec 31 12:57:55 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Dec 31 12:57:55 2021] CR2: ffffae7de8a00000 CR3: 0000001ccea0a002 CR4: 00000000001606e0

Kernel 5.10.0-0.bpo.9-amd64

# dmesg -T
...
[Fri Dec 31 17:17:53 2021] BUG: unable to handle page fault for address: ffffa227abbb8000
[Fri Dec 31 17:17:53 2021] #PF: supervisor read access in kernel mode
[Fri Dec 31 17:17:53 2021] #PF: error_code(0x0000) - not-present page
[Fri Dec 31 17:17:53 2021] PGD 100000067 P4D 100000067 PUD 3e5956067 PMD ab96c4067 PTE 0
[Fri Dec 31 17:17:53 2021] Oops: 0000 [#1] SMP PTI
[Fri Dec 31 17:17:53 2021] CPU: 2 PID: 1173 Comm: z_wr_iss Tainted: P           OE     5.10.0-0.bpo.9-amd64 #1 Debian 5.10.70-1~bpo10+1
[Fri Dec 31 17:17:53 2021] Hardware name: Dell Inc. PowerEdge T630/0W9WXC, BIOS 2.11.0 12/23/2019
[Fri Dec 31 17:17:53 2021] RIP: 0010:SHA512TransformBlocks+0x83/0x1340 [icp]
[Fri Dec 31 17:17:53 2021] Code: 48 8b 4f 10 48 8b 57 18 4c 8b 47 20 4c 8b 4f 28 4c 8b 57 30 4c 8b 5f 38 eb 0d 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 31 ff <4c> 8b 26 49 0f cc 4d 89 c5 4d 89 c6 4d 89 cf 49 c1 cd 0e 49 c1 ce
[Fri Dec 31 17:17:53 2021] RSP: 0018:ffffa2274489b680 EFLAGS: 00010246
[Fri Dec 31 17:17:53 2021] RAX: 63db297cd38fdb11 RBX: a49ea88c578a8401 RCX: e602f3b255d62954
[Fri Dec 31 17:17:53 2021] RDX: 6bed86cd681311b0 RSI: ffffa227abbb8000 RDI: 0000000000000000
[Fri Dec 31 17:17:53 2021] RBP: ffffffffc0fb7140 R08: 1ccdfbdba4713d98 R09: 43eb6b8080c52cf5
[Fri Dec 31 17:17:53 2021] R10: 868f90a009af8ef6 R11: b510ec77fbe15dfe R12: 18f6bee4b6c17038
[Fri Dec 31 17:17:53 2021] R13: ac1a3279de1c8752 R14: 18f32b4697183ceb R15: 08432106000034ca
[Fri Dec 31 17:17:53 2021] FS:  0000000000000000(0000) GS:ffff96d7bfa80000(0000) knlGS:0000000000000000
[Fri Dec 31 17:17:53 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Dec 31 17:17:53 2021] CR2: ffffa227abbb8000 CR3: 00000006cdf94003 CR4: 00000000001706e0
[Fri Dec 31 17:17:53 2021] Call Trace:
[Fri Dec 31 17:17:53 2021]  ? SHA2Update+0x19f/0x1e0 [icp]
[Fri Dec 31 17:17:53 2021]  ? sha2_mac_atomic+0x213/0x320 [icp]
[Fri Dec 31 17:17:53 2021]  ? zap_leaf_array_match+0x14e/0x1c0 [zfs]
[Fri Dec 31 17:17:53 2021]  ? _cond_resched+0x15/0x30
[Fri Dec 31 17:17:53 2021]  ? mutex_lock+0xe/0x30
[Fri Dec 31 17:17:53 2021]  ? crypto_mac+0x292/0x390 [icp]
[Fri Dec 31 17:17:53 2021]  ? prep_new_page+0x89/0x100
[Fri Dec 31 17:17:53 2021]  ? get_page_from_freelist+0x11eb/0x1330
[Fri Dec 31 17:17:53 2021]  ? mod_hash_idkey_cmp+0x10/0x10 [icp]
[Fri Dec 31 17:17:53 2021]  ? i_mod_hash_find_nosync+0x62/0x90 [icp]
[Fri Dec 31 17:17:53 2021]  ? zio_crypt_do_hmac+0xc9/0x150 [zfs]
[Fri Dec 31 17:17:53 2021]  ? _cond_resched+0x15/0x30
[Fri Dec 31 17:17:53 2021]  ? mutex_lock+0xe/0x30
[Fri Dec 31 17:17:53 2021]  ? spa_crypto_key_compare+0x30/0x30 [zfs]
[Fri Dec 31 17:17:53 2021]  ? avl_find+0x58/0x90 [zavl]
[Fri Dec 31 17:17:53 2021]  ? zio_crypt_generate_iv_salt_dedup+0x2f/0x70 [zfs]
[Fri Dec 31 17:17:53 2021]  ? abd_borrow_buf+0x1f/0x40 [zfs]
[Fri Dec 31 17:17:53 2021]  ? abd_borrow_buf_copy+0x28/0x70 [zfs]
[Fri Dec 31 17:17:53 2021]  ? spa_do_crypt_abd+0x118/0x370 [zfs]
[Fri Dec 31 17:17:53 2021]  ? zio_encrypt+0x437/0x5f0 [zfs]
[Fri Dec 31 17:17:53 2021]  ? zio_write_compress+0x42/0x780 [zfs]
[Fri Dec 31 17:17:53 2021]  ? __wake_up_common_lock+0x87/0xc0
[Fri Dec 31 17:17:53 2021]  ? zio_execute+0x90/0xf0 [zfs]
[Fri Dec 31 17:17:53 2021]  ? taskq_thread+0x2e7/0x530 [spl]
[Fri Dec 31 17:17:53 2021]  ? wake_up_q+0xa0/0xa0
[Fri Dec 31 17:17:53 2021]  ? zio_execute+0x90/0xf0 [zfs]
[Fri Dec 31 17:17:53 2021]  ? taskq_thread+0x2e7/0x530 [spl]
[Fri Dec 31 17:17:53 2021]  ? wake_up_q+0xa0/0xa0
[Fri Dec 31 17:17:53 2021]  ? zio_taskq_member.isra.11.constprop.17+0x60/0x60 [zfs]
[Fri Dec 31 17:17:53 2021]  ? taskq_thread_spawn+0x50/0x50 [spl]
[Fri Dec 31 17:17:53 2021]  ? kthread+0x116/0x130
[Fri Dec 31 17:17:53 2021]  ? __kthread_cancel_work+0x40/0x40
[Fri Dec 31 17:17:53 2021]  ? ret_from_fork+0x22/0x30
[Fri Dec 31 17:17:53 2021] Modules linked in: md4 sha512_ssse3 sha512_generic cmac nls_utf8 cifs libarc4 dns_resolver fscache libdes dell_rbu binfmt_misc intel_rapl_msr intel_rapl_common xfs nls_ascii libcrc3
2c sb_edac nls_cp437 vfat x86_pkg_temp_thermal intel_powerclamp fat coretemp kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel libaes crypto_simd cryptd dcdbas glue_helper mgag200 rapl inte
l_cstate intel_uncore drm_kms_helper cec iTCO_wdt intel_pmc_bxt iTCO_vendor_support efi_pstore pcspkr mxm_wmi watchdog drm joydev evdev mei_me mei zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE)
zcommon(POE) znvpair(POE) spl(OE) ipmi_ssif sg ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter button efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic dm_mod hid_generic usbhid
 hid sd_mod t10_pi crc_t10dif crct10dif_generic xhci_pci ahci xhci_hcd libahci libata ehci_pci megaraid_sas ehci_hcd crct10dif_pclmul crct10dif_common crc32c_intel igb usbcore scsi_mod
[Fri Dec 31 17:17:53 2021]  i2c_algo_bit lpc_ich dca ptp usb_common pps_core wmi
[Fri Dec 31 17:17:53 2021] CR2: ffffa227abbb8000
[Fri Dec 31 17:17:53 2021] ---[ end trace 28f3f76f9b905b05 ]---
[Fri Dec 31 17:17:53 2021] RIP: 0010:SHA512TransformBlocks+0x83/0x1340 [icp]
[Fri Dec 31 17:17:53 2021] Code: 48 8b 4f 10 48 8b 57 18 4c 8b 47 20 4c 8b 4f 28 4c 8b 57 30 4c 8b 5f 38 eb 0d 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 31 ff <4c> 8b 26 49 0f cc 4d 89 c5 4d 89 c6 4d 89 cf 49
 c1 cd 0e 49 c1 ce
[Fri Dec 31 17:17:53 2021] RSP: 0018:ffffa2274489b680 EFLAGS: 00010246
[Fri Dec 31 17:17:53 2021] RAX: 63db297cd38fdb11 RBX: a49ea88c578a8401 RCX: e602f3b255d62954
[Fri Dec 31 17:17:53 2021] RDX: 6bed86cd681311b0 RSI: ffffa227abbb8000 RDI: 0000000000000000
[Fri Dec 31 17:17:53 2021] RBP: ffffffffc0fb7140 R08: 1ccdfbdba4713d98 R09: 43eb6b8080c52cf5
[Fri Dec 31 17:17:53 2021] R10: 868f90a009af8ef6 R11: b510ec77fbe15dfe R12: 18f6bee4b6c17038
[Fri Dec 31 17:17:53 2021] R13: ac1a3279de1c8752 R14: 18f32b4697183ceb R15: 08432106000034ca
[Fri Dec 31 17:17:53 2021] FS:  0000000000000000(0000) GS:ffff96d7bfa80000(0000) knlGS:0000000000000000
[Fri Dec 31 17:17:53 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Dec 31 17:17:53 2021] CR2: ffffa227abbb8000 CR3: 00000006cdf94003 CR4: 00000000001706e0

Metadata

Metadata

Assignees

No one assigned

    Labels

    Status: StaleNo recent activity for issueType: DefectIncorrect behavior (e.g. crash, hang)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions