Docker containers are ephemeral workloads. Any data you store on a container’s filesystem is lost once the container is removed. The data lives on disk during the container’s life cycle but does not persist beyond it. In practice, most real-world applications are stateful: they need their data to outlive the container.
So, how do we deal with that? Docker provides several ways to store data. By default, all data is stored on the writable container layer, which is ephemeral. The writable container layer interacts with the host filesystem via a storage driver. Because of this extra abstraction, writing files to the container layer is slower than writing directly to the host filesystem.
To solve that problem, Docker provides volumes, bind mounts, and tmpfs mounts. Volumes and bind mounts give you persistent storage, while tmpfs mounts hold temporary data in memory. All three bypass the storage driver and interact directly with the host filesystem (or memory, in the case of tmpfs), which reduces I/O overhead and improves performance. While this section focuses on storage drivers that cater to the container filesystem, it is worth discussing the data storage options within Docker to provide some background.
Docker data storage options
Every option has a use case and a trade-off. Let’s look at each option and when you should use it.
Volumes
Docker volumes store data directly in the host’s filesystem. They do not go through the storage driver layer, so writing to volumes is faster than writing to the container layer. They are the best way to persist data. On a Linux host, Docker stores volumes in /var/lib/docker/volumes by default and assumes that no process other than the Docker daemon modifies the data in them.
As a result, volumes provide the following features:
- Volumes provide some isolation from the host filesystem. If you don’t want other processes to interact with the data, then a volume should be your choice.
- You can share a volume with multiple containers.
- Volumes can either be named or anonymous. Docker stores anonymous volumes in a directory with a unique random name.
- Volume drivers let you store data remotely or with a cloud provider. This helps a lot when multiple containers share the same volume in a multi-instance active-active configuration.
- The data in the volume persists even when the containers are deleted.
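To make the sharing and persistence points concrete, here is a minimal sketch in Python that only builds the Docker CLI commands involved and prints them. The volume name app-data, the container names writer and reader, the nginx image, and the /data mount path are all hypothetical choices for illustration. Both helper functions are also made up for this sketch; only the docker volume create and docker run --mount syntax comes from the Docker CLI.

```python
import shlex
from typing import List


def volume_create_cmd(volume: str) -> List[str]:
    """Build the argv for `docker volume create <volume>`."""
    return ["docker", "volume", "create", volume]


def run_with_volume_cmd(name: str, image: str, volume: str, target: str) -> List[str]:
    """Build the argv for a detached container with a named volume mounted at `target`."""
    mount = f"type=volume,source={volume},target={target}"
    return ["docker", "run", "-d", "--name", name, "--mount", mount, image]


# Hypothetical names: one named volume shared by two containers.
commands = [
    volume_create_cmd("app-data"),
    run_with_volume_cmd("writer", "nginx", "app-data", "/data"),
    run_with_volume_cmd("reader", "nginx", "app-data", "/data"),
]

for argv in commands:
    # Printing only; pass argv to subprocess.run(argv, check=True) to execute it.
    print(shlex.join(argv))
```

Because both containers use the same mount specification, a file written under /data by one is visible to the other, and the volume remains after either container is removed. Nothing is executed here, so the sketch can be run on a machine without Docker installed.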
Now, let’s look at another storage option – bind mounts.
Bind mounts
Bind mounts are very similar to volumes but with a significant difference: they allow you to mount an existing host file or directory into the container. This lets you share important files with the Docker container, such as /etc/resolv.conf.
Bind mounts also allow processes outside Docker to modify the data. So, if you are sharing your container data with another application that is not running in Docker, bind mounts are the way to go.
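As a rough illustration of how a bind mount is requested, the following Python sketch assembles the value passed to docker run --mount for a bind mount. It assumes a Linux host, so the source is checked as a POSIX absolute path. The helper names, the host path /srv/shared/site, the container path, and the nginx image are hypothetical. The read-only flag is optional and is shown only because shared host data is often mounted that way.

```python
from pathlib import PurePosixPath
from typing import List


def bind_mount_arg(source: str, target: str, read_only: bool = False) -> str:
    """Build the `--mount` value for a bind mount.

    Assumes a Linux host; `source` should be an absolute path that already exists there.
    """
    if not PurePosixPath(source).is_absolute():
        raise ValueError(f"bind mount source must be absolute: {source!r}")
    spec = f"type=bind,source={source},target={target}"
    return (spec + ",readonly") if read_only else spec


def run_with_bind_cmd(image: str, source: str, target: str, read_only: bool = False) -> List[str]:
    """Build the argv for a throwaway container with the bind mount attached."""
    return ["docker", "run", "--rm", "--mount", bind_mount_arg(source, target, read_only), image]


# Hypothetical paths: expose a host directory to the container, read-only.
argv = run_with_bind_cmd("nginx", "/srv/shared/site", "/usr/share/nginx/html", read_only=True)
print(" ".join(argv))
```

Unlike a volume, the source here is a path you choose on the host, so any other host process with access to that path can read or change the same files.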
tmpfs mounts
tmpfs mounts store data in memory. They do not store any data on disk, neither on the container filesystem nor on the host filesystem. You can use them to store sensitive information and non-persistent state during the lifetime of your container.
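A tmpfs mount can be requested in the same way. The Python sketch below builds the --mount value for one, optionally capping its size and setting its file mode. The helper name, the /run/app-secrets path, the 64 MiB limit, the 0700 mode, and the nginx image are assumptions chosen for illustration, and tmpfs mounts are assumed to be available only when Docker runs on Linux.

```python
from typing import List


def tmpfs_mount_arg(target: str, size_bytes: int = 0, mode: str = "") -> str:
    """Build the `--mount` value for a tmpfs mount.

    `size_bytes` and `mode` are optional; when omitted, Docker's defaults apply.
    """
    parts = ["type=tmpfs", f"destination={target}"]
    if size_bytes > 0:
        parts.append(f"tmpfs-size={size_bytes}")  # size limit in bytes
    if mode:
        parts.append(f"tmpfs-mode={mode}")  # octal file mode, for example 0700
    return ",".join(parts)


# Hypothetical example: a 64 MiB in-memory directory readable only by its owner.
mount = tmpfs_mount_arg("/run/app-secrets", size_bytes=64 * 1024 * 1024, mode="0700")
argv: List[str] = ["docker", "run", "--rm", "--mount", mount, "nginx"]
print(" ".join(argv))
```

Anything the container writes under the mounted path stays in memory and disappears when the container stops, which is why this option suits secrets and scratch state rather than data you need to keep.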