Title: ra_server_proc crash with "not owner" error on snapshot deletion after upgrade to 4.1.8 #626
Replies: 3 comments
Hi, I have never seen a "not owner" error here. I have a feeling this is very environment-specific. Did you change the rabbit user as part of the upgrade? Is this on Windows? This issue may be better raised as a discussion in the rabbitmq-server repo.
Hi Regarding the Q)
We do not have any evidence of a Ra bug, and RabbitMQ 4.1 and 3.13 are out of community support, so this is discussion material.
Describe the bug
After upgrading from RabbitMQ 3.13.7 to 4.1.8 we are seeing repeated errors of the form:
This causes ra_server_proc to crash. The error only appears on nodes running 4.1.8 — nodes still on 3.13.7 are unaffected.
Reproduction steps
The errors started appearing after we upgraded to RabbitMQ 4.1.8.
Expected behavior
The error does not appear to cause any harm: when we check later, the file has eventually been deleted. The error messages are confusing, though, and we would expect the deletion to succeed without them.
Additional context
Looking at ra_lib:recursive_delete/1, is_dir/1 returns false for any error from prim_file:read_file_info/1—not just when the path is a regular file. When that happens, delete(Dir, regular) is called, which invokes unlink() on what may still be a directory. Both ext4 and NFSv4 return EPERM in this case.
is_dir/1 — https://github.com/rabbitmq/ra/blob/main/src/ra_lib.erl#L485
recursive_delete/1 — https://github.com/rabbitmq/ra/blob/main/src/ra_lib.erl#L197
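To make the suspected path concrete, here is a minimal Python model of the pattern described above. It is not the Ra code: the function names, the injected `stat_fn` parameter, the `snapshot` directory name, and the simulated EIO failure are all assumptions made for illustration. It only shows that when a directory check collapses every stat error into "false", the delete routine falls through to unlink() on a real directory.

```python
import errno
import os
import stat
import tempfile


def is_dir_lossy(path, stat_fn=os.stat):
    """Return False on ANY stat error, modelling the pattern described above."""
    try:
        return stat.S_ISDIR(stat_fn(path).st_mode)
    except OSError:
        return False  # the actual reason (EIO, EACCES, ...) is discarded here


def recursive_delete(path, stat_fn=os.stat):
    """Delete a tree; anything not recognised as a directory is unlinked."""
    if is_dir_lossy(path, stat_fn):
        for name in os.listdir(path):
            recursive_delete(os.path.join(path, name), stat_fn)
        os.rmdir(path)
    else:
        os.unlink(path)  # fails when path is in fact a directory


def demo():
    """Simulate a failed stat on a real directory and report the unlink errno."""
    root = tempfile.mkdtemp()
    snapshot_dir = os.path.join(root, "snapshot")  # hypothetical name
    os.mkdir(snapshot_dir)

    def failing_stat(path):
        # Hypothetical failure injected for the demo; the real trigger is unknown.
        raise OSError(errno.EIO, os.strerror(errno.EIO), path)

    try:
        recursive_delete(snapshot_dir, stat_fn=failing_stat)
        code = None
    except OSError as err:
        code = err.errno
    still_there = os.path.isdir(snapshot_dir)
    # Clean up with the real stat so the temporary tree does not leak.
    recursive_delete(root)
    return code, still_there


if __name__ == "__main__":
    print(demo())
```

The errno reported by the failed unlink depends on the platform: POSIX specifies EPERM for unlink() on a directory, while Linux itself reports EISDIR, so a runtime may translate the code before logging it. In this model the directory is left in place after the failed call, which would be consistent with a later retry succeeding once the stat call works again.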
We are not sure if this is the actual cause of what we are observing, but we wanted to highlight it in case it is useful.
Please feel free to close this if it is not an issue. I am reporting it only because it looked like a bug to me.