Run a container

Keep building and running containers until we get a cathedral.

What is a container?

A container is a set of processes, just like other processes you launch from the shell, except that it is isolated by its own namespaces, cgroups, and union filesystem. It has everything it needs within its isolation: code, runtime, system tools, system libraries, settings and so on.

Docker is the dominant container option. However, there are various competitors, such as CoreOS rkt and Ubuntu LXD. The container format and runtime have been standardized in the OCI specs, which allows for various implementations.

Who needs a container?

Almost everyone.

Why do we need a container?

  • A container isolates physical resources such as CPU, memory, disk I/O and network from other containers.
  • A container isolates OS kernel resources such as process IDs, mount points, and user and group IDs from other containers.
  • Containers eliminate differences between development and staging environments and help reduce conflicts between teams running different software on the same infrastructure.
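
To make the kernel resources in the list above a bit more tangible, here is a minimal sketch that prints the namespaces and cgroups of the current shell. It assumes a Linux host with `/proc` mounted; the function names `list_namespaces` and `show_cgroups` are made up for this example.

```shell
#!/bin/sh
# List the namespaces a process belongs to (Linux only).
# Each entry resolves to something like "pid:[4026531836]"; two processes
# that show the same number share that namespace.
list_namespaces() {
    pid="${1:-$$}"
    for ns in /proc/"$pid"/ns/*; do
        printf '%s -> %s\n' "$(basename "$ns")" "$(readlink "$ns")"
    done
}

# Show which cgroups a process has been placed in.
show_cgroups() {
    cat /proc/"${1:-$$}"/cgroup
}

list_namespaces "$$"
show_cgroups "$$"
```

Running the same two functions inside a container and on the host should show different namespace numbers, which is exactly the isolation described above.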

How does the container work?

Want to see more posts? Follow @enqueuezero on Twitter.

The way we start running a container can be explained with the bash code below.

# Prepare a random ID. We need it to identify our container.
$ uuid="ps_$(shuf -i 42002-42254 -n 1)"

# Prepare a root dir for all the containers and the list of cgroup controllers to use.
$ btrfs_path='/var/bocker' && cgroups='cpu,cpuacct,memory';

# Prepare root filesystem based on the given `$image`.
$ btrfs subvolume snapshot "$btrfs_path/$image" "$btrfs_path/$uuid" > /dev/null

# Create a cgroup
$ cgcreate -g "$cgroups:/$uuid"

# Control cgroup resource
$ cgset -r cpu.shares=512 "$uuid"
$ cgset -r memory.limit_in_bytes=512000000 "$uuid"

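# Execute a given `$cmd` in the cgroup (cgexec).
# We run it inside the network namespace `netns_$uuid` (ip netns exec),
# which is assumed to have been created beforehand with `ip netns add`.
# We need to create unique namespaces for the command (unshare).
# We also need to change the root directory (chroot).
# We also need to mount the runtime (/proc).
# Logging is a bonus (tee).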
$ cgexec -g "$cgroups:$uuid" \
        ip netns exec netns_"$uuid" \
        unshare -fmuip --mount-proc \
        chroot "$btrfs_path/$uuid" \
        /bin/sh -c "/bin/mount -t proc proc /proc && $cmd" \
        2>&1 | tee "$btrfs_path/$uuid/$uuid.log"

You don't necessarily need to remember all the commands above; they matter mostly if you are a container engine developer.

Container engines such as runC, rkt and lxc provide a beautifully designed CLI that abstracts the above process for you.

If you prefer watching videos, Liz Rice implements a container from scratch in 40 minutes on YouTube.

Note: The bash code above is extracted from the awesome bocker.
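
For comparison, here is a rough sketch of what the same resource limits look like when a container engine CLI does the work. It uses Docker's `--cpu-shares` and `--memory` flags. The helper name `container_run`, the `DRY_RUN` switch and the `alpine` image are assumptions made for this example, and `512m` is only roughly the same limit as the 512000000 bytes used earlier.

```shell
#!/bin/sh
# Build a `docker run` command that mirrors the manual steps:
# CPU shares, a memory limit, and a command executed via /bin/sh.
# With DRY_RUN=1 (the default here) the command is only printed.
container_run() {
    image="$1"
    cmd="$2"
    set -- docker run --rm \
        --cpu-shares 512 \
        --memory 512m \
        "$image" /bin/sh -c "$cmd"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

container_run alpine "echo hello from a container"
```

Set `DRY_RUN=0` on a machine with Docker installed to actually start the container instead of printing the command.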

Frequently Asked Questions

More questions? Ask me by opening an issue.

A container is one or more non-trivial Linux processes running on top of the host kernel. We isolate them via cgroups and namespaces.

A virtual machine is a set of processes with their dependencies running on top of a guest OS kernel. The guest OS is pre-allocated a fixed amount of CPU and memory on top of the hypervisor and the host OS kernel.

So, with containers you get less isolation but a much lighter footprint. With VMs you get more isolation but a much heavier one. (It's pretty straightforward, right? A VM has to spend extra memory on the guest OS. Besides, programs in different VMs don't share anything at all and hence load more things into RAM.)

Some people even combine VMs and containers.

You shouldn't switch to containers if what you need is to manage virtual machines, as Vagrant is a virtual machine manager.

You should switch to containers if you merely want to run applications.

For an OS that doesn't support containers, you might want to create a virtual machine via Vagrant first, and then run applications as containers in that virtual machine.

See the awesome explanation on Stack Overflow from the author of Docker.

You can use docker cp foo.txt mycontainer:/foo.txt to copy the file foo.txt from the host into the container.

You can use docker cp mycontainer:/foo.txt foo.txt to copy the file foo.txt from the container to the host. [Note 1]: This is supported in Docker 1.8 and later.

You can use COPY ./foo.txt /app in the Dockerfile to package the file foo.txt into the container image. However, this happens at build time.

You can attach a volume to the container when you start it, for example docker run -v $(pwd):/app mycontainer. This allows file syncing via a shared directory.

You can publish a port of the container when you start it, for example docker run -p 8000:8000 mycontainer. This allows other programs to talk to mycontainer by simply establishing a TCP connection.

The solutions below are not recommended because they make things complicated.

  • Running sshd in the container.
  • Running a static web server in the container.

If you intend to have one container communicate with another, you need some kind of container orchestration tool. Check docker-compose if you are in development. Check Kubernetes if you are in production.

The short answer is that no actual OS is running; the image only provides the files of the base OS.

Each instruction in a Dockerfile creates a new layer of the image. Each layer contains merely static files.

The base OS declared in the Dockerfile ends up as the necessary files of that OS, packaged into a layer.

At runtime, your process thinks it is running on the designated OS. However, that's an illusion. Your container really runs as one or more processes with a set of files from the filesystem generated by UnionFS.
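
The layering idea can be sketched without any container engine at all. The script below is only a simulation: it treats three plain directories as layers and resolves a path by searching from the topmost layer down, which is roughly how a union filesystem presents one merged view. The directory names, file contents and the `union_cat` helper are made up for this example.

```shell
#!/bin/sh
# Simulate how a union filesystem resolves a path: search the layers
# from the topmost (most recent) to the lowest and return the first hit.
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/layer1/etc" "$workdir/layer2/etc" "$workdir/layer3/app"

echo "base-os"     > "$workdir/layer1/etc/os-release"  # files of the base OS
echo "old config"  > "$workdir/layer1/etc/app.conf"
echo "new config"  > "$workdir/layer2/etc/app.conf"    # shadows layer1
echo "print('hi')" > "$workdir/layer3/app/main.py"

# Layers ordered from top to bottom.
layers="layer3 layer2 layer1"

union_cat() {
    for layer in $layers; do
        if [ -f "$workdir/$layer/$1" ]; then
            cat "$workdir/$layer/$1"
            return 0
        fi
    done
    echo "union_cat: $1: not found" >&2
    return 1
}

union_cat etc/os-release   # found only in layer1
union_cat etc/app.conf     # layer2 wins over layer1
union_cat app/main.py      # found only in layer3
```

The process reading `etc/os-release` only sees files that look like a base OS; nothing from that OS is actually running.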

The image is a binary package of files, organized in layers.

The container is a runtime instance of the image. You can have many containers for one image.

You can list images via docker images, and list containers via docker ps. The command docker run turns an image into a container.

Image vs. container is pretty much like program vs. process.
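
The program vs. process analogy can be tried out in a plain shell. This is just an illustration of the analogy, not of Docker itself: one `sleep` binary on disk plays the role of the image, and each running `sleep` process plays the role of a container.

```shell
#!/bin/sh
# One program on disk (compare: one image)...
program="$(command -v sleep)"

# ...started twice gives two independent processes (compare: two containers).
"$program" 30 &
pid1=$!
"$program" 30 &
pid2=$!

echo "program:   $program"
echo "processes: $pid1 $pid2"

# Stopping one process leaves the other one running.
kill "$pid1"
wait "$pid1" 2>/dev/null || true
if kill -0 "$pid2" 2>/dev/null; then
    second_alive=yes
else
    second_alive=no
fi
echo "second process still running: $second_alive"

# Clean up the remaining process.
kill "$pid2"
wait "$pid2" 2>/dev/null || true
```

In the same way, running docker run twice on one image gives two containers that can be stopped independently.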

Don't worry about that.

It generally wouldn't.

Best Practices

Want to contribute? Simply make a pull request.

Don't start with a full OS as the base image if you don't need one. Instead, build the image from a small base OS such as alpine.

Declare unnecessary files in .dockerignore.
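
As a sketch of what such a file might contain, the script below writes a sample .dockerignore into a scratch directory. The entries are only typical examples for a Python project and are assumptions, not a list taken from any particular repository. In practice the file lives next to your Dockerfile.

```shell
#!/bin/sh
# Write a sample .dockerignore into a scratch directory.
project_dir="$(mktemp -d)"   # stand-in for your project root

cat > "$project_dir/.dockerignore" <<'EOF'
# Version control metadata
.git
# Python caches and virtual environments
__pycache__/
*.pyc
.venv/
# Local secrets and logs
.env
*.log
EOF

cat "$project_dir/.dockerignore"
```

Files matching these patterns are left out of the build context, so they can't end up in the image by accident.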

Use multistage builds.

# Start from a build stage. It uses the same Alpine base as the final stage,
# so that whatever gets installed here stays compatible with the final image.
FROM python:3.7-alpine AS base
COPY requirements.txt /requirements.txt
RUN pip install -r /requirements.txt

# Then, we start from a new base.
FROM python:3.7-alpine

# Finally, we copy things from the previous stage into the new base.
COPY --from=base /usr/local /usr/local

CMD ["/usr/local/bin/myapp"]

Chain commands to reduce the number of image layers. And don't forget to clean up.

# Not recommended.
RUN apk add packageA
RUN apk add packageB
RUN make
RUN make install

# Recommended.
RUN apk add --no-cache packageA packageB \
    && make \
    && make install

Check this topic in the awesome Container Best Practises.

It's okay to read and write files in the container for temporary data processing. However, you should be aware that any data in the container is lost once the container is killed and removed.

Attach volumes to the container whenever you want to persist data.

If transactions are required, connect the container to a SQL database container. In this case, the SQL database container should attach volumes for persisting data and expose a port.

The safest bet is to use your own Docker images or verified images whenever possible.

Otherwise, malicious bots might hack into your container cluster. Check such reports.

Cheatsheet

Want to see more similar crafts? Consider donating via Patreon.

Docker Build

# Build an image from the current directory, tagged as $username/$reponame:$version.
# Don't forget the trailing `.`!
$ docker build -t soasme/vanilla:1.0.0 .

# Check what we have built.
$ docker images

# In case the image is no longer needed, you might want to delete it.
$ docker rmi soasme/vanilla:1.0.0

Docker Registry

# Pull an image
$ docker pull soasme/tiddlywiki

# Login to remote registry
$ docker login

# Push an image
$ docker push soasme/vanilla:1.0.0

Docker Run

# Run a container in interactive mode, with port 8080 published and the current dir `pwd` mounted at /data.
$ docker run -it --rm -p 8080:8080 -v `pwd`:/data soasme/tiddlywiki

# List running containers.
$ docker ps

# Open a shell in the container. Get the container ID from the `docker ps` command.
$ docker exec -it d252fb6c5d7a /bin/sh

# See the latest logs. Get the container ID from the `docker ps` command.
$ docker logs --tail 100 d252fb6c5d7a

# Stop a container immediately. Get the container ID from the `docker ps` command.
$ docker kill d252fb6c5d7a

Fun facts

Want to discuss the container technology? Join Telegram group @enqueuezero.

Do you know these facts?

  • The Linux kernel currently knows nothing about containers. All it knows are cgroups, processes, namespaces and so on.
  • Docker is not a shiny new technology. All the fundamental tools have been developed and improved for years. Docker implements a high-level API on top of them and calls the abstraction the CONTAINER.
  • Docker used LXC as its container engine at first but later shifted to containerd, runC and libcontainer.
  • Docker Inc. had a tough year in 2017. Docker-swarm didn't win the battle to become THE container orchestration tool. Plus, not everyone understood the rollout of Moby.
  • With the wide adoption of containers, another battle for a better container orchestration tool was going on. It seems that Kubernetes has won the game. Will a better orchestration tool come along, the way Nginx did after Apache?