Docker: Checkpoint and Restore

Before I do a deeper dive into Firecracker, I wanted to run a quick experiment and check whether it’s possible to take a snapshot of a Docker container and restore it at some point. I’m far from being the first to think about this, and there’s a fantastic team working on CRIU: https://criu.org/. CRIU is a project that implements checkpoint/restore functionality for Linux. As the description says, it doesn’t work on Windows or Mac, so here I go again, dusting off my Thinkpad X220.

It’s been a while since I did anything on Linux and Ubuntu, so I discovered there’s a new package manager called “Snap”. I tried installing Docker with it, but something messed up the permissions on the docker.sock file: whenever I started the Docker service, the socket file was created as root:root. Perhaps I’ll try Snap again next time, but for now I’m back to apt.

Install Docker

Install Docker by following the instructions from https://docs.docker.com/engine/install/ubuntu/:

  1. Add Docker’s GPG key:
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
  1. Verify that you now have the key with the fingerprint 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88, by searching for the last 8 characters of the fingerprint:
$ sudo apt-key fingerprint 0EBFCD88

pub   rsa4096 2017-02-22 [SCEA]
      9DC8 5822 9FC7 DD38 854A  E2D8 8D81 803C 0EBF CD88
uid           [ unknown] Docker Release (CE deb) <docker@docker.com>
sub   rsa4096 2017-02-22 [S]
  1. Set up the stable repository:
$ sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
  1. Update the package index and install Docker:
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io
  1. Add your user to the docker group:
$ sudo usermod -aG docker ${USER}
  1. Log out and log back in so that the new group membership takes effect.
  1. Validate that Docker runs:
$ docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
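
If docker ps fails with a permission error instead, the usual suspect is the group step above. Here is a minimal sketch of a check for that; the function names in_group and docker_access_hint are my own, not part of Docker, and the group list is passed as a parameter so the logic can be tried without touching the real account:

```shell
#!/bin/sh
# Succeeds if group $1 appears in the space-separated list $2.
# When $2 is omitted, the current user's groups from `id -nG` are used.
in_group() {
    wanted=$1
    groups_list=${2-$(id -nG)}
    for g in $groups_list; do
        # Compare whole names so "dockerx" does not count as "docker".
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Prints a hint about the most likely cause of a permission error.
# Takes the same optional group list as in_group (hypothetical helper).
docker_access_hint() {
    if in_group docker "${1-$(id -nG)}"; then
        echo "ok: user is in the docker group"
    else
        echo "missing: run 'sudo usermod -aG docker \$USER' and log in again"
    fi
}
```

Calling docker_access_hint with no arguments checks the current session. Since group changes only apply to new logins, a "missing" result right after usermod usually just means the session is stale. Running ls -l /var/run/docker.sock is also worth a look: with the apt packages I would expect the socket to belong to the docker group rather than root:root.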

Install CRIU

Follow the instructions from https://criu.org/Docker:

  1. Install CRIU:
$ sudo add-apt-repository ppa:criu/ppa
$ sudo apt install criu
  1. Enable experimental features in Docker (note that this overwrites any existing /etc/docker/daemon.json):
$ echo '{"experimental": true}' | sudo tee /etc/docker/daemon.json
$ sudo systemctl restart docker
  1. Run a test container that increments a counter every second:
$ docker run -d --name looper --security-opt seccomp:unconfined busybox  \
         /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
  1. Verify that logs are showing up:
$ docker logs looper
0
1
2
3
  1. Create a checkpoint, which will stop the container:
$ docker checkpoint create looper checkpoint1
checkpoint1
  1. Restore the container:
$ docker start --checkpoint checkpoint1 looper
  1. Check logs one more time:
$ docker logs looper
0
1
2
3
0
1
2
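
One thing to be careful with in the steps above is the daemon.json edit: writing to it blindly can clobber or corrupt a configuration that already exists. A more cautious version might look like the sketch below. The function name enable_experimental is hypothetical, and the config path is a parameter so it can be tried on a scratch file first:

```shell
#!/bin/sh
# Writes {"experimental": true} to the given config path only when that is safe.
# Prints what it did; returns non-zero if the file needs a manual merge.
enable_experimental() {
    conf=$1
    if [ ! -s "$conf" ]; then
        # Missing or empty file: safe to create it from scratch.
        printf '{"experimental": true}\n' > "$conf"
        echo "written"
    elif grep -q '"experimental"[[:space:]]*:[[:space:]]*true' "$conf"; then
        # The setting is already present, so leave the file alone.
        echo "already enabled"
    else
        # Other settings exist: adding a second JSON object would corrupt the file.
        echo "merge manually" >&2
        return 1
    fi
}
```

On a real machine this would have to run as root against /etc/docker/daemon.json, followed by sudo systemctl restart docker. The grep is only a rough text match, not a JSON parser, so treat "already enabled" as a hint rather than proof.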

Something went wrong in the last step, as you may notice: the test container restarts the counter from 0 rather than resuming from the checkpoint. I’m not sure why this happens. I’m running Docker 20.10.3 and CRIU 3.15 on Linux 5.8.15, which satisfies the CRIU requirements.
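
To spot this kind of reset without eyeballing the logs, a small filter helps. This is a sketch under the assumption that the log contains one integer per line, as the looper container prints; first_reset is a name I made up:

```shell
#!/bin/sh
# Reads one integer per line on stdin and prints the 1-based line number of
# the first value that is lower than its predecessor. Prints nothing if the
# sequence never goes backwards.
first_reset() {
    awk 'NR > 1 && $1 + 0 < prev { print NR; exit } { prev = $1 + 0 }'
}
```

Piping the output of docker logs looper into first_reset would print 5 for the log above, because the fifth line drops from 3 back to 0. A correctly restored container should produce no output. For digging further, criu check reports whether the kernel supports what CRIU needs, and docker checkpoint ls looper shows whether the checkpoint was actually recorded.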

At the same time, if I try something different, such as running an alpine container, creating a folder, and taking a checkpoint, it works as expected:

$ docker run -it alpine sh
/ # mkdir ~/hello
$ docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED         STATUS        PORTS     NAMES
a82ccad8fdfa   alpine    "sh"      2 minutes ago   Up 1 second             clever_kapitsa
$ docker checkpoint create a82ccad8fdfa checkpoint1
checkpoint1
$ docker ps
$ docker start --checkpoint checkpoint1 a82ccad8fdfa
$ docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED         STATUS        PORTS     NAMES
a82ccad8fdfa   alpine    "sh"      2 minutes ago   Up 1 second             clever_kapitsa
$ docker exec -it a82ccad8fdfa sh
/ # cd
~ # ls
hello

One caveat: the hello folder lives in the container’s writable layer, which also survives a plain docker stop and docker start, so this test shows that the checkpoint and restore commands complete, but it doesn’t by itself prove that process state was restored.

I think CRIU will be useful at some point with Firecracker as well. Resource-wise, in some cases it doesn’t make sense to keep user containers running at all times if no one is using them.
