Multiple Child Processes Restoration Delay #7

@n-tk11

Hello,
Our team has been using FastFreeze and ran into a performance issue related to PID control. We would like to share our fix here for future FastFreeze users, and we suggest adding it to the README.

Issue Description
We've noticed that restoring an application with multiple child processes inside a Docker container takes an unusually long time.
We found that CRIU does not restore processes in PID order, so when ns_last_pid cannot be written, restoration falls into the set_ns_last_pid fork hack code path.
Adding CAP_CHECKPOINT_RESTORE alone does not solve the problem, because Docker's default security settings still prevent writing to files in the /proc filesystem.
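
To give a rough sense of why the restore order matters, here is a simplified model, not FastFreeze's or CRIU's actual code. It assumes the fork fallback reaches a target PID by forking throwaway children until the kernel's PID counter gets there, that PIDs are handed out in increasing order, and that the counter wraps at pid_max (taken as 32768 here). The function name and parameters are hypothetical, and real kernels also skip reserved and in-use PIDs.

```python
def forks_needed(target_pids, start_last_pid=1, pid_max=32768):
    """Count throwaway forks in a simplified model of the fork fallback.

    Assumptions (not taken from FastFreeze or CRIU sources):
    - the next PID handed out is always last_pid + 1, wrapping at pid_max
    - reaching a target PID means burning every PID in between
    """
    last = start_last_pid
    wasted = 0
    for target in target_pids:
        # PIDs strictly between `last` and `target` must be consumed first.
        # A target below `last` forces a wrap around the whole PID space.
        wasted += (target - last - 1) % pid_max
        last = target
    return wasted


if __name__ == "__main__":
    unsorted_order = [10, 5, 20, 15]
    print("unsorted:", forks_needed(unsorted_order))
    print("sorted:  ", forks_needed(sorted(unsorted_order)))
```

In this model, every target PID that is lower than the previously restored one costs roughly a full pass over the PID space, which is consistent with the long restore times we saw when ns_last_pid was not writable.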

To bypass this, add the following options to docker run:
docker run ... --security-opt systempaths=unconfined --security-opt apparmor=unconfined ...
Alternatively, use a custom AppArmor profile that only allows writing to the ns_last_pid file:

...snip...

deny @{PROC}/sys/kernel/{?,??,[^s][^h][^m]**}-@{PROC}/sys/kernel/ns_last_pid  w,  # deny everything except shm* and ns*(ns_last_pid) in /proc/sys/kernel/

...snip...
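
After applying either option, one way to sanity-check the container is to try opening ns_last_pid for writing from inside it. The sketch below is only a probe under stated assumptions: the helper name is hypothetical, it opens the file without writing anything, and a successful open is necessary but may not be sufficient, since the kernel can still reject the write itself if the process lacks the required capability.

```python
import os

# Path taken from the AppArmor rule above.
NS_LAST_PID = "/proc/sys/kernel/ns_last_pid"


def can_open_for_write(path=NS_LAST_PID):
    """Return True if `path` can be opened write-only, False otherwise.

    Nothing is written; the file descriptor is closed immediately.
    A read-only mount or an AppArmor deny rule is expected to make
    the open itself fail with an OSError.
    """
    try:
        fd = os.open(path, os.O_WRONLY)
    except OSError:
        return False
    os.close(fd)
    return True


if __name__ == "__main__":
    if can_open_for_write():
        print("ns_last_pid can be opened for writing")
    else:
        print("ns_last_pid is not writable; restore may fall back to forking")
```

Running this inside the container before and after changing the security options should show whether the Docker defaults are still blocking access.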

We've also written a blog post detailing our findings and the steps we took. You can find the blog post here: Link to Blog Post.
If you have any questions, please feel free to contact us:
sorawit.man@ku.th
thanawat.chanikaphon1@louisiana.edu
