Overlay filesystems are a type of union filesystem; on Linux, the in-kernel implementation is called overlayfs. They layer two or more directories on a single host and present them as a single directory. The source directories are called layers, and the mount point where their combined contents appear is referred to as the merged directory. Overlay filesystems are a fundamental component of containerization and orchestration, providing a lightweight, efficient way to package application dependencies and to give each container its own root filesystem, separate from the host's.
Understanding overlay filesystems is useful for software engineers working with containerized applications and orchestration tools like Kubernetes. This glossary entry covers what overlay filesystems are, their history, how they work, their role in containerization and orchestration, and specific examples of their usage.
Definition of Overlay Filesystems
An overlay filesystem is a Linux filesystem that implements a union mount over other filesystems. It allows the user to overlay one directory tree on top of another. Changes are recorded in the upper filesystem, while the lower filesystem remains unaltered. The result is a fully functional, writable filesystem that does not require copying the lower filesystem: only the changes consume additional storage.
An overlay mount combines two kinds of directories, or layers: one or more lower (or bottom) layers and a single upper (or top) layer. The lower layers are treated as read-only, while the upper layer is read-write. Any changes are written to the upper layer, preserving the integrity of the lower layers. The combined view is presented at a mount point called the merged directory. On Linux, a writable overlay mount also needs an empty work directory on the same filesystem as the upper layer, which overlayfs uses internally.
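As a minimal sketch of how these pieces fit together, the Python snippet below only builds the argument list for a Linux overlay mount; it does not run it, since mounting normally requires root privileges. The helper name overlay_mount_args and the /tmp/demo paths are illustrative assumptions. The lowerdir, upperdir, and workdir option names are the ones overlayfs accepts, and multiple lower directories are separated by colons with the topmost listed first.

```python
import shlex


def overlay_mount_args(lower_dirs, upper_dir, work_dir, merged_dir):
    """Build the argument list for an overlay mount.

    lower_dirs is ordered topmost-first, matching the lowerdir option syntax.
    """
    if not lower_dirs:
        raise ValueError("at least one lower directory is required")
    options = ",".join([
        "lowerdir=" + ":".join(lower_dirs),
        "upperdir=" + upper_dir,
        "workdir=" + work_dir,
    ])
    return ["mount", "-t", "overlay", "overlay", "-o", options, merged_dir]


# Hypothetical demo paths; nothing is mounted, the command is only printed.
args = overlay_mount_args(
    ["/tmp/demo/lower"], "/tmp/demo/upper", "/tmp/demo/work", "/tmp/demo/merged"
)
print(shlex.join(args))
```

Running the printed command as root, after creating the four directories, would make the combined contents of the lower and upper directories visible under the merged directory.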
Lower Layer
The lower layer in an overlay filesystem is the base layer. It contains the original data, which is typically read-only. This layer can be a single directory or a stack of directories unified into one. The lower layer provides the base filesystem image for a container. In the context of Docker, for instance, the lower layer often contains a base image like Ubuntu or Alpine.
When a process reads a file through the merged directory, the overlay filesystem first checks the upper layer. If the file doesn't exist there, it checks the lower layers, from the topmost down. This way, the lower layer's data appears in the merged view alongside the upper layer's data, even though the lower layer itself remains unaltered.
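The lookup order described above can be sketched in user space with ordinary directories. This is only a simulation of the rule for regular files, not how the kernel implements it: it ignores deletions, directory merging, and permissions, and the function names resolve and demo are assumptions made for illustration.

```python
import os
import tempfile


def resolve(relative_path, upper_dir, lower_dirs):
    """Return the backing path a merged view would read, or None if absent.

    The upper directory is checked first, then each lower directory,
    topmost first.
    """
    for layer in [upper_dir, *lower_dirs]:
        candidate = os.path.join(layer, relative_path)
        if os.path.lexists(candidate):
            return candidate
    return None


def demo():
    """Create throwaway layers and report which content each lookup sees."""
    with tempfile.TemporaryDirectory() as root:
        upper = os.path.join(root, "upper")
        lower = os.path.join(root, "lower")
        os.makedirs(upper)
        os.makedirs(lower)
        files = [
            (lower, "a.txt", "lower a"),
            (lower, "b.txt", "lower b"),
            (upper, "b.txt", "upper b"),  # shadows the lower copy
        ]
        for layer, name, text in files:
            with open(os.path.join(layer, name), "w") as handle:
                handle.write(text)

        results = {}
        for name in ["a.txt", "b.txt", "c.txt"]:
            found = resolve(name, upper, [lower])
            if found is None:
                results[name] = None
            else:
                with open(found) as handle:
                    results[name] = handle.read()
        return results


print(demo())
```

In the demo, a.txt is served from the lower directory, b.txt from the upper directory because the upper copy shadows the lower one, and c.txt is not found in any layer.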
Upper Layer
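The upper layer in an overlay filesystem is where changes are made. This layer is writable and records every change made through the merged directory, including file modifications, deletions, and creations. When a file that exists only in the lower layer is modified, the overlay filesystem first copies it into the upper layer, a form of copy-on-write (CoW) that the overlayfs documentation calls copy-up, and then applies the change to that copy. Deleting a lower-layer file does not touch the lower layer either: overlayfs creates a whiteout entry in the upper layer that hides the file from the merged view.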
Any changes made in the upper layer only affect that layer; the lower layer remains unchanged. This separation is what allows overlay filesystems to provide a unified view of the filesystem without modifying the original data.
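The write behavior can be modeled with a small in-memory toy. The ToyOverlay class below is a hypothetical illustration, not overlayfs itself: files are dictionary entries, a modification copies the lower content into the upper dictionary before changing it, and a deletion of a lower file is recorded as a marker in the upper dictionary so the lower dictionary is never mutated.

```python
class ToyOverlay:
    """In-memory model of one read-only lower layer and one writable upper layer."""

    _WHITEOUT = object()  # marker meaning "hidden from the merged view"

    def __init__(self, lower):
        self.lower = lower  # never mutated by this class
        self.upper = {}

    def read(self, path):
        """Return file content as seen through the merged view."""
        if path in self.upper:
            value = self.upper[path]
            if value is self._WHITEOUT:
                raise FileNotFoundError(path)
            return value
        if path in self.lower:
            return self.lower[path]
        raise FileNotFoundError(path)

    def append(self, path, text):
        """Modify a file; lower-only files are copied into upper first."""
        try:
            current = self.read(path)
        except FileNotFoundError:
            current = ""  # a new file is created directly in the upper layer
        self.upper[path] = current + text

    def delete(self, path):
        """Remove a file from the merged view without touching the lower layer."""
        self.read(path)  # raises FileNotFoundError if the file is not visible
        if path in self.lower:
            self.upper[path] = self._WHITEOUT
        else:
            del self.upper[path]


lower = {"/etc/motd": "hello\n", "/etc/hosts": "127.0.0.1 localhost\n"}
fs = ToyOverlay(lower)
fs.append("/etc/motd", "world\n")  # triggers the copy into upper
fs.delete("/etc/hosts")            # recorded as a marker in upper

print(fs.read("/etc/motd"))
print(sorted(lower))
```

After these operations the merged view shows the modified /etc/motd and no /etc/hosts, while the lower dictionary still holds both original entries.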
History of Overlay Filesystems
The concept of overlay filesystems has been around for several decades, but the implementation in Linux, known as overlayfs, was not merged into the Linux kernel until version 3.18 in December 2014. The development of overlayfs was driven by the need for a lightweight, efficient filesystem for use with containers.
Before the introduction of overlayfs, union filesystems like AUFS were used. However, these filesystems were complex and not part of the mainline Linux kernel. Overlayfs was designed to be simpler and more efficient than these earlier union filesystems.
Development of OverlayFS
OverlayFS was developed by Miklos Szeredi, a Linux kernel developer. It was initially released as a standalone filesystem in 2010, but it wasn't merged into the Linux kernel until 2014. The delay was due to the complexity of integrating a new filesystem into the kernel and the need for extensive testing and review.
Since its integration into the Linux kernel, OverlayFS has become the default storage backend for many container runtimes, including Docker. Its simplicity, efficiency, and availability in the mainline kernel make it a natural choice for containerized applications.
Adoption by Container Technologies
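Container technologies have adopted overlay filesystems as the default way to assemble container filesystems from images. Docker uses overlayfs directly through its storage driver, while Kubernetes relies on the container runtime on each node, such as containerd or CRI-O, which typically uses overlayfs as well. In both cases, overlayfs provides a lightweight, efficient way to stack the filesystem layers that make up a container image.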
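For instance, Docker has long used overlay2 as its default storage driver on Linux. Despite the name, overlay2 is not a newer version of overlayfs; it is a Docker storage driver built on overlayfs that uses the kernel's support for multiple lower directories. When Docker pulls an image, it downloads each layer and unpacks it into its own directory on the host. Because each layer is stored only once, images and containers that share layers reuse the same directories, reducing storage space and speeding up pulls.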
Overlay Filesystems in Containerization
Overlay filesystems play a crucial role in containerization. They provide the mechanism for creating and managing the multiple layers that make up a container image. Each layer in a container image corresponds to a layer in an overlay filesystem.
When a container is launched from an image, a new overlay filesystem is created with the image layers as the lower, read-only layers, and a new, empty layer as the upper, writable layer. Any changes made inside the container are written to this upper layer, leaving the image layers unchanged.
Image Layers
Container images are made up of layers. Each layer corresponds to a set of changes, or a diff, in the filesystem. When an image is built, each Dockerfile instruction that modifies the filesystem, such as RUN, COPY, or ADD, creates a new layer; other instructions only update image metadata. These layers are stacked on top of each other to form the final image.
Overlay filesystems are how these layers are combined at run time. Each layer is not an overlay filesystem of its own; it is stored as a plain directory holding that layer's files. When the image is pulled, the layers are downloaded separately and unpacked into separate directories, which are later passed to a single overlay mount as its stack of lower layers to form the final filesystem.
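Stacking diffs can be illustrated with a toy model in which each layer is a dictionary mapping paths to contents. The flatten function and the DELETED marker below are assumptions made for illustration; real image layers are archives of files, and removals are recorded with special whiteout entries rather than a Python value.

```python
DELETED = None  # toy marker: this layer removes the path


def flatten(layers):
    """Apply layer diffs from bottom to top and return the resulting tree."""
    tree = {}
    for diff in layers:
        for path, content in diff.items():
            if content is DELETED:
                tree.pop(path, None)
            else:
                tree[path] = content
    return tree


# Three hypothetical layers, listed bottom-first.
base = {"/bin/sh": "shell", "/etc/os-release": "demo-os 1.0"}
install = {"/usr/bin/app": "app v1", "/tmp/cache": "scratch"}
cleanup = {"/tmp/cache": DELETED, "/etc/os-release": "demo-os 1.1"}

image = flatten([base, install, cleanup])
print(image)
```

Note that the cleanup layer hides /tmp/cache from the final tree, but the install layer still contains it, which is why deleting files in a later layer does not shrink an image.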
Container Layers
When a container is launched from an image, a new overlay filesystem is created for the container. The image layers form the lower, read-only layers of the overlay filesystem, and a new, empty layer is created as the upper, writable layer. Any changes made inside the container are written to this upper layer.
This separation of layers allows containers to share image layers, reducing storage space and improving performance. It also provides isolation between containers, as changes made in one container do not affect other containers.
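Python's collections.ChainMap happens to behave much like this arrangement: lookups search a list of mappings in order, while writes only ever touch the first mapping. The sketch below uses it as an analogy, with hypothetical file paths and a made-up start_container helper, to show two containers sharing the same image layers while keeping their changes private.

```python
from collections import ChainMap

# Hypothetical image layers, topmost first; shared by every container.
image_layers = [
    {"/app/config": "debug=false"},
    {"/bin/sh": "shell"},
]


def start_container(layers):
    """Return a merged view with a fresh, empty writable mapping on top."""
    return ChainMap({}, *layers)


web = start_container(image_layers)
worker = start_container(image_layers)

# Writes land only in web's own top mapping.
web["/app/config"] = "debug=true"
web["/var/log/web.log"] = "started"

print(web["/app/config"])     # web sees its own change
print(worker["/app/config"])  # worker still sees the image's version
print(web.maps[1] is worker.maps[1])  # the lower layer object is shared
```

The image layer dictionaries are stored once and referenced by both views, which mirrors how containers started from the same image reuse the same read-only layer directories.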
Overlay Filesystems in Orchestration
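In orchestration, overlay filesystems are the node-level mechanism that turns distributed container images into running filesystems. Overlayfs itself works only on a single host and plays no part in moving data between hosts. Instead, an orchestration tool like Kubernetes asks the container runtime on each node to pull image layers from a registry, and the runtime uses an overlay filesystem to assemble them into a complete filesystem for each container.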
Overlay filesystems also contribute to the filesystem isolation needed in multi-tenant environments, as described in more detail below.
Distribution of Image Layers
Orchestration tools distribute images across hosts through registries, and overlay filesystems make that distribution efficient on each host. When a new container is scheduled onto a node, the container runtime pulls the image layers that the node does not already have and assembles them, together with any cached layers, into an overlay filesystem for the container.
This distribution of image layers allows for efficient use of storage and network resources. Only the necessary layers are pulled, and shared layers are cached across multiple containers. This reduces the amount of data that needs to be transferred and stored, improving performance and reducing costs.
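The saving from layer caching can be sketched as a simple comparison between an image's layer list and the layers already present on a node. The plan_pull function and the digest strings below are placeholders invented for illustration; real runtimes identify layers by content digests and track them in their own stores.

```python
def plan_pull(image_layers, local_cache):
    """Split an image's layer digests into those to download and those to reuse."""
    to_pull = [digest for digest in image_layers if digest not in local_cache]
    reused = [digest for digest in image_layers if digest in local_cache]
    return to_pull, reused


# Placeholder digests; two hypothetical images share their first two layers.
app_a = ["sha256:base", "sha256:python", "sha256:app-a"]
app_b = ["sha256:base", "sha256:python", "sha256:app-b"]

cache = set()

pulled_a, reused_a = plan_pull(app_a, cache)
cache.update(pulled_a)

pulled_b, reused_b = plan_pull(app_b, cache)
cache.update(pulled_b)

print(pulled_a, reused_a)
print(pulled_b, reused_b)
```

On an empty node the first image requires all three layers, while the second image only needs its one unique layer because the two shared layers are already cached.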
Isolation in Multi-Tenant Environments
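Each container has its own overlay mount with its own writable upper layer, so changes made in one container do not affect other containers, even when they share the same image layers. This filesystem separation is one building block of the isolation that multi-tenant environments need. It is not a complete security boundary on its own; it works alongside other kernel features such as namespaces and cgroups to keep orchestrated applications secure and stable.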
For instance, in a Kubernetes cluster, each container in a pod has its own overlay filesystem as its root filesystem. Containers in the same pod share a network namespace and can share volumes, but their root filesystems are separate, which prevents interference between containers and between pods.
Examples of Overlay Filesystems Usage
Overlay filesystems are used extensively in container technologies like Docker and Kubernetes. Here are some specific examples of how overlay filesystems are used in these technologies.
Docker
Docker's overlay2 storage driver, long the default on Linux, is built on overlayfs. When Docker pulls an image, it unpacks each layer into its own directory, by default under /var/lib/docker/overlay2. Layers are identified by their content, so an image that shares layers with one already on the host only downloads and stores the layers it is missing.
When a container is launched, Docker creates a new overlay filesystem with the image layers as the lower, read-only layers, and a new, empty layer as the upper, writable layer. Any changes made inside the container are written to this upper layer, leaving the image layers unchanged.
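One way to observe this arrangement on a Linux host is to look at the overlay entries in /proc/mounts, where each mounted overlay lists its lowerdir, upperdir, and workdir options. The parser below is a minimal sketch: the function name parse_overlay_mount and the sample line with /demo paths are assumptions, and it does not handle paths containing commas or the escaped spaces that /proc/mounts can contain.

```python
def parse_overlay_mount(line):
    """Parse one /proc/mounts-style line; return overlay details or None."""
    fields = line.split()
    if len(fields) < 4 or fields[2] != "overlay":
        return None

    options = {}
    for item in fields[3].split(","):
        key, _, value = item.partition("=")
        options[key] = value

    lowerdir = options.get("lowerdir", "")
    return {
        "merged": fields[1],
        # lowerdir lists layers topmost first, separated by colons
        "lower": lowerdir.split(":") if lowerdir else [],
        "upper": options.get("upperdir"),
        "work": options.get("workdir"),
    }


# Hypothetical sample line, not captured from a real host.
sample = (
    "overlay /demo/merged overlay "
    "rw,relatime,lowerdir=/demo/l2:/demo/l1,upperdir=/demo/upper,workdir=/demo/work 0 0"
)
info = parse_overlay_mount(sample)
print(info)
```

On a real host, the same function could be applied to each line read from /proc/mounts; for a running container, the lower list would correspond to the image layers and the upper directory to the container's writable layer.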
Kubernetes
Kubernetes does not mount overlay filesystems itself. When a new pod is scheduled, the kubelet on the chosen node asks the container runtime, such as containerd or CRI-O, to pull any missing image layers from a registry, and the runtime assembles them into an overlay filesystem for each container in the pod.
Each container in a pod therefore has its own root filesystem with its own writable layer. This prevents one container's changes from interfering with another's, both within a pod and across pods, and forms part of the isolated environment each application runs in.
Conclusion
Overlay filesystems are a fundamental component of containerization and orchestration. They provide a lightweight, efficient mechanism for managing application dependencies as shareable layers, giving each container its own writable filesystem separate from the host and from other containers, and making image distribution across multiple hosts cheaper through layer reuse.
Understanding overlay filesystems is crucial for software engineers working with containerized applications and orchestration tools. As container technologies continue to evolve, the role of overlay filesystems is likely to become even more important.