AM1 - 10. lecture

What are the main differences between virtual machines and containers?

described in the note "Rozdíl proti virtuálním strojům" (Difference from virtual machines)

Container engine

= Docker Engine
manages, builds and runs Docker containers
interacts with the user (CLI or API)
- also translates high-level commands into requests for containerd

Container runtime

= containerd [container-dee] and runc
2 parts:
- high-level container runtime (containerd)
  - long-running daemon process, handles the full lifecycle of the containers
- low-level container runtime (runc)
  - e.g. when a container is created, runc communicates with the OS kernel to create a separate process for the container
abstracts the layers above it from syscalls and OS details in general

How do these interact?

user enters docker run -d nginx in the Docker CLI
Docker CLI sends the run command to the Docker daemon (= container engine)
Docker daemon validates the request and prepares the environment (checks whether the nginx image is available locally, otherwise pulls it from the registry) and then instructs containerd (= container runtime) to create the container
- the containerd-shim component keeps the container process running (even when the container runtime restarts)
- runc uses the kernel's Linux namespaces to actually set up the container and then exits (containerd-shim stays on as the parent of the container's main process, monitors it and reports back to the higher levels)
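The chain of calls above can be traced with a small toy model. This is only a sketch of who calls whom: it does not talk to Docker at all, every function name (docker_cli, dockerd_run, containerd_create, containerd_shim, runc_create) is made up for illustration, and it assumes the image name is the last CLI argument.

```python
def runc_create(image, trace):
    # Low-level runtime: sets up the container via the kernel, then exits.
    trace.append(("runc", f"set up namespaces and start the {image} process"))
    trace.append(("runc", "exit"))


def containerd_shim(image, trace):
    # The shim outlives runc and stays with the container process.
    trace.append(("containerd-shim", "start and invoke runc"))
    runc_create(image, trace)
    trace.append(("containerd-shim", "monitor the container process"))


def containerd_create(image, trace):
    # High-level runtime: owns the container lifecycle.
    trace.append(("containerd", f"create container from {image}"))
    containerd_shim(image, trace)


def dockerd_run(image, local_images, trace):
    # Container engine: make sure the image is available, then delegate.
    if image in local_images:
        trace.append(("dockerd", f"{image} found locally"))
    else:
        trace.append(("dockerd", f"pull {image} from the registry"))
        local_images.add(image)
    containerd_create(image, trace)


def docker_cli(args, local_images):
    # Assumption for this sketch: the image name is the last argument.
    command = " ".join(args)
    trace = [("docker CLI", f"send '{command}' to dockerd")]
    dockerd_run(args[-1], local_images, trace)
    return trace


for component, action in docker_cli(["run", "-d", "nginx"], set()):
    print(f"{component:16} {action}")
```

Running it prints one line per step, in the order in which the components hand the request down and the shim takes over.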

Docker terminology

image
- contains a union of layered filesystems stacked on top of each other
- it’s immutable, can be distributed, and is used to create and run a container on some device
container
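- see the note "Kontejner" (Container)
registry
- see the note "Registry kontejnerů" (Container registry)
Swarm
- see the note "Orchestrace kontejnerů" (Container orchestration)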

Linux namespaces

every Linux process belongs to exactly one namespace of each type
- just as a namespace isolates packages in a program, a Linux namespace isolates processes in Linux
- the reasons are primarily security and isolation
connection between namespaces and containers
- containers are a form of virtualization (they need to be isolated) and the processes “inside” shouldn’t know about other processes
- when a new container is created, a new set of namespaces is created for it
  - it has its own mounted “filesystem”, its own hostname, PID sequence, users etc.
there are 7 namespace types covered here (Linux 5.6 and later also have a time namespace):
- Mount (mnt)
  - isolates filesystem mount points
  - a process in one MNT namespace sees a unique set of mount points (regardless of what is mounted on the host)
- UTS (uts)
  - isolates the value of the hostname
- IPC (ipc)
  - isolates communication between processes (message queues, semaphores, shared memory)
  - so different processes cannot communicate via shared memory or message queues (unless explicitly allowed)
- PID (pid)
  - isolates PID number space
  - two processes in different PID namespaces can have the same PID without any collision
- Network (net)
  - isolates network resources (like network interfaces, routing tables, IP addresses etc.)
  - each container typically gets its own virtual eth0 Ethernet interface and its own IP address
- User (user)
  - isolates user and group IDs (UID, GID) between processes
  - use case: a process runs as root inside a container but as an unprivileged user on the host
- Cgroup (cgroup)
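  - isolates the process's view of the cgroup hierarchy (its own cgroup appears as the root)
  - each process only sees its own cgroup and usage, not those of all processes
    - so it cannot interfere with resources that are not assigned to it
  - the limiting and measuring of resource usage (CPU, memory, I/O etc.) is done by the kernel's cgroups themselves, the namespace only hides the rest of the hierarchy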
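To see which namespaces a process actually belongs to, one can look at the symlinks in /proc/self/ns, where each link target has the form type:[inode]. Two processes share a namespace exactly when the inode numbers match. The following is a minimal sketch, assuming a Linux host with /proc mounted; the helper names parse_ns_link and current_namespaces are made up for this example.

```python
import os

NS_DIR = "/proc/self/ns"


def parse_ns_link(link):
    """Split a link target such as 'pid:[4026531836]' into ('pid', 4026531836)."""
    kind, _, rest = link.partition(":")
    return kind, int(rest.strip("[]"))


def current_namespaces(ns_dir=NS_DIR):
    """Return {entry name: namespace inode} for the calling process."""
    result = {}
    if not os.path.isdir(ns_dir):
        return result  # not Linux, or /proc is not mounted
    for name in sorted(os.listdir(ns_dir)):
        try:
            _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # skip entries that cannot be read or parsed
        result[name] = inode
    return result


for name, inode in current_namespaces().items():
    print(f"{name:20} {inode}")
```

Running the same script on the host and inside a container should show different inode numbers for the namespace types the container does not share with the host.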

OverlayFS and image layering

each filesystem-changing instruction in a Dockerfile (e.g. RUN, COPY, ADD) “adds” one read-only layer, so there can be a lot of layers in an image
the OverlayFS mechanism creates a unified “merged” view of all the layers plus one writable layer on top, so the processes in the container see a single writable filesystem and can work with it
- it uses the “copy-on-write” mechanism in the background
- reading files - OverlayFS first looks in the top (writable) layer and then descends through the lower layers, returning the first file found (so higher layers can “shadow” the lower ones)
- writing files - OverlayFS looks for the file; if it is in a lower image layer, it gets copied up into the writable layer and only then modified (the original file remains untouched)
- deleting files - OverlayFS looks for the file and then creates a special “whiteout” file in the upper writable layer, signaling that the file is “deleted”
  - the original file remains untouched
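The read, write and delete rules can be mimicked with plain dictionaries. This is a toy model only, not how the kernel implements OverlayFS: layers are dicts mapping a path to its content, the class OverlayView and its methods are invented for this sketch, and the whiteout is a Python marker object instead of a real whiteout file.

```python
WHITEOUT = object()  # stands in for an OverlayFS whiteout entry


class OverlayView:
    def __init__(self, lower_layers):
        self.lower = lower_layers  # read-only image layers, bottom first
        self.upper = {}            # the container's writable layer

    def read(self, path):
        # The upper layer wins; a whiteout there hides every lower copy.
        if path in self.upper:
            if self.upper[path] is WHITEOUT:
                raise FileNotFoundError(path)
            return self.upper[path]
        # Otherwise the highest lower layer containing the path wins.
        for layer in reversed(self.lower):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def append(self, path, data):
        # Copy-on-write: copy the visible content up, then modify the copy.
        try:
            current = self.read(path)
        except FileNotFoundError:
            current = ""
        self.upper[path] = current + data

    def delete(self, path):
        self.read(path)  # raises FileNotFoundError if nothing is visible
        if any(path in layer for layer in self.lower):
            self.upper[path] = WHITEOUT  # hide the lower copy
        else:
            del self.upper[path]         # file existed only in the upper layer


base = {"/etc/motd": "hello\n", "/bin/sh": "shell"}
app = {"/etc/motd": "hi\n"}
view = OverlayView([base, app])

view.append("/etc/motd", "more\n")
view.delete("/bin/sh")
print(view.read("/etc/motd"))
print(sorted(view.upper))
```

After these operations the dictionaries base and app are unchanged; all modifications, including the deletion, live only in view.upper.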
why?
- storage efficiency - multiple containers can share the same image layers (as they contain only read-only files)
  - and each container has its own “upper writable” FS layer
- images are immutable and we can rely on that
- speed - a newly created container only needs one empty upper writable layer (and creating it does not take much time)

Petrova digitální zahrada 🚀