The Kubernetes container runtime jungle

Last updated: September 8, 2021

This blog post was originally published on Medium


Docker, containerd, and CRI-O are all about containers, but what are the differences, the pros and cons, and the point of having several "Docker implementations"? And are they all Kubernetes compatible?


The multiplicity of container engines and of tools around containers (crictl, podman, buildah, skopeo, kaniko, jib, …) shows the need for evolution and adaptability in the Kubernetes world.


The Cloud Native Computing Foundation (CNCF), which hosts the Kubernetes project, has created a landscape map with a container runtime section: https://landscape.cncf.io/images/landscape.png


It is not an exhaustive list, but it shows the growing number of container runtimes for Kubernetes.


Docker has been the de facto standard for a long time. Originally aimed at extending the capabilities of Linux Containers (LXC), Docker was created as an open-source project in 2013, and the company's solution is now the leading software containerisation platform on the market. Building on LXC, Docker acts as a portable container engine for packaging applications and their dependencies into containers easily deployable on any system.


Before Docker 1.11 (May 2016), the Docker Engine daemon was in charge of downloading container images, launching container processes, exposing an API to interact with the daemon, and more, all within a single central process running with root privileges.


This convenient approach is useful for deployment, but having a monolithic container runtime that manages both images and containers doesn't follow best practices regarding process separation and Unix privileges.


OCI: Open Container Initiative


Established in June 2015 by Docker and other leaders in the container industry (and donated to the Linux Foundation), the OCI currently contains two specifications:

  • Runtime Specification, with runc as the reference implementation (also donated by Docker). runc is a standalone binary that takes an OCI container bundle and runs it. To put it simply, runc is basically a small command-line tool that leverages libcontainer (a library interfacing with Linux kernel facilities like cgroups and namespaces) directly, without going through the Docker Engine.

  • Image Specification: The goal of this specification is to enable the creation of interoperable tools for building, transporting, and preparing a container image to run.

But runc is a low-level container runtime with no user-facing API (the same goes for gVisor, Kata Containers, and Nabla Containers).
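
To make the bundle format concrete, here is a minimal sketch that uses the official OCI runtime-spec Go types (github.com/opencontainers/runtime-spec/specs-go) to emit the config.json that runc consumes. The rootfs path, hostname, and command below are placeholder values, not anything runc mandates:

```go
package main

import (
	"encoding/json"
	"os"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

func main() {
	// A minimal OCI runtime configuration. runc expects this as
	// config.json inside a bundle directory, next to a root
	// filesystem folder ("rootfs" here is a placeholder name).
	spec := specs.Spec{
		Version: specs.Version, // OCI runtime-spec version string
		Root:    &specs.Root{Path: "rootfs", Readonly: true},
		Process: &specs.Process{
			Cwd:  "/",
			Args: []string{"/bin/sh"}, // the container's entrypoint
		},
		Hostname: "demo",
	}

	// Write the config to stdout; redirect it into the bundle.
	enc := json.NewEncoder(os.Stdout)
	enc.SetIndent("", "  ")
	if err := enc.Encode(&spec); err != nil {
		panic(err)
	}
}
```

Given a bundle directory containing such a config.json next to a root filesystem, `runc run <container-id>` starts the container directly; `runc spec` can also generate a more complete default configuration.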


For developers and sysadmins, we need a high-level container runtime providing an API and a CLI that make it easy to interact with images: this is Docker.
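
To illustrate what that high-level API looks like, here is a hedged sketch using Docker's official Go SDK (github.com/docker/docker/client); the image and command are arbitrary examples, and error handling is kept minimal:

```go
package main

import (
	"context"
	"fmt"
	"io"
	"os"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/container"
	"github.com/docker/docker/client"
)

func main() {
	ctx := context.Background()

	// Connect to the Docker daemon using the usual DOCKER_HOST settings.
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}

	// Pull the image through the daemon.
	reader, err := cli.ImagePull(ctx, "docker.io/library/alpine:latest", types.ImagePullOptions{})
	if err != nil {
		panic(err)
	}
	io.Copy(os.Stdout, reader) // stream pull progress

	// Create and start a container from that image.
	resp, err := cli.ContainerCreate(ctx, &container.Config{
		Image: "alpine:latest",
		Cmd:   []string{"echo", "hello from the Docker API"},
	}, nil, nil, nil, "")
	if err != nil {
		panic(err)
	}
	if err := cli.ContainerStart(ctx, resp.ID, types.ContainerStartOptions{}); err != nil {
		panic(err)
	}
	fmt.Println("started container", resp.ID)
}
```

The same operations are of course available through the docker CLI (docker pull, docker run), which talks to the same Engine API.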


Containerd


But to bring more modularity to the Docker architecture, and more neutrality towards the other actors in the industry, Docker introduced containerd, which acts as an API facade to container runtimes (runc in this case).


containerd has a smaller scope than Docker (for instance, it cannot build images), provides a client API, and is more focused on being embeddable.
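
To give an idea of that client API, here is a minimal sketch based on containerd's Go client (github.com/containerd/containerd), close to the project's getting-started example; the socket path, namespace, and image reference are illustrative defaults:

```go
package main

import (
	"context"
	"fmt"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/cio"
	"github.com/containerd/containerd/namespaces"
	"github.com/containerd/containerd/oci"
)

func main() {
	// Connect to the containerd daemon over its gRPC socket.
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		panic(err)
	}
	defer client.Close()

	// containerd scopes all resources by namespace ("example" is arbitrary).
	ctx := namespaces.WithNamespace(context.Background(), "example")

	// Pull and unpack an image (pull, not build: containerd cannot build).
	image, err := client.Pull(ctx, "docker.io/library/redis:alpine", containerd.WithPullUnpack)
	if err != nil {
		panic(err)
	}

	// Create a container with an OCI spec derived from the image config...
	cont, err := client.NewContainer(ctx, "redis-demo",
		containerd.WithImage(image),
		containerd.WithNewSnapshot("redis-demo-snapshot", image),
		containerd.WithNewSpec(oci.WithImageConfig(image)),
	)
	if err != nil {
		panic(err)
	}
	defer cont.Delete(ctx, containerd.WithSnapshotCleanup)

	// ...and a task: the actual running process, started through a shim.
	task, err := cont.NewTask(ctx, cio.NewCreator(cio.WithStdio))
	if err != nil {
		panic(err)
	}
	defer task.Delete(ctx)

	if err := task.Start(ctx); err != nil {
		panic(err)
	}
	fmt.Println("container running with PID", task.Pid())
}
```

When the task starts, containerd hands the OCI spec to a shim, which invokes runc: exactly the chain described next.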


To run a container, the Docker Engine creates the image and passes it to containerd. containerd calls "containerd-shim", which uses runc to run the container.


The shim allows for daemonless containers. It basically sits as the parent of the container's process to facilitate a few things, like keeping STDIO open and letting runc exit once the container has started, so there is no long-running runtime process per container.


So since Docker 1.11, Docker has been built upon runc and containerd, and it was the first OCI-compliant runtime release.


Kubernetes


But it was not enough for Kubernetes.


Each container runtime has its own strengths, and it was mandatory for Kubernetes to support more runtimes.


That is why Kubernetes 1.5 introduced the Container Runtime Interface (CRI), which enables the kubelet (the Kubernetes component installed on each worker node and in charge of the container lifecycle) to use a wide variety of container runtimes, without the need to recompile.


Supporting interchangeable container runtimes was not a new concept in Kubernetes, but at the beginning both Docker and rkt were integrated directly and deeply into the kubelet source code through an internal and volatile interface.


Kubelet communicates with the container runtime (or a CRI shim for the runtime) over Unix sockets using the gRPC framework, where kubelet acts as a client and the CRI shim as the server.


The protocol buffers API includes two gRPC services: ImageService and RuntimeService. The ImageService provides RPCs to pull an image from a repository, inspect it, and remove it. The RuntimeService contains RPCs to manage the lifecycle of pods and containers, as well as calls to interact with containers (exec/attach/port-forward). A monolithic container runtime that manages both images and containers (e.g., Docker and rkt) can provide both services simultaneously on a single socket.
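
As a concrete illustration, here is a hedged sketch of a tiny CRI client in Go using the published CRI definitions (k8s.io/cri-api), dialing one such Unix socket and calling both services. The containerd socket path is assumed below; CRI-O listens on /var/run/crio/crio.sock instead:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"google.golang.org/grpc"
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Dial the runtime's CRI Unix socket (containerd's default path here).
	conn, err := grpc.Dial("unix:///run/containerd/containerd.sock", grpc.WithInsecure())
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// RuntimeService: pod/container lifecycle, exec/attach/port-forward, ...
	rt := runtimeapi.NewRuntimeServiceClient(conn)
	version, err := rt.Version(ctx, &runtimeapi.VersionRequest{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("runtime: %s %s (CRI %s)\n",
		version.RuntimeName, version.RuntimeVersion, version.RuntimeApiVersion)

	// ImageService: pull/inspect/remove images, served here on the same socket.
	img := runtimeapi.NewImageServiceClient(conn)
	images, err := img.ListImages(ctx, &runtimeapi.ListImagesRequest{})
	if err != nil {
		panic(err)
	}
	fmt.Println("images known to the runtime:", len(images.Images))
}
```

crictl, mentioned at the start of this post, is essentially a full-featured CLI front end for these same two gRPC services.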


image from https://kubernetes.io