An overview of Linux container security

Dávid Osztertág (Software Engineer)

Containers are often treated as if they were virtual machines, but they are in fact much less isolated from the host system. There are, however, many ways to strengthen that isolation. This blog post provides an overview of Linux container security.

What are containers?

So what is the difference between containers and virtual machines? Containerization is operating-system-level virtualization: containers are isolated user-space instances that share the host system’s kernel. This makes the shared kernel the most critical attack surface in your system, because a kernel vulnerability exploited from inside one container can affect the host and every other container. Virtual machines, on the other hand, are fully virtualized (HVM), with their own kernel and emulated hardware. Containers have a huge advantage in lower overhead and better scalability, but that doesn’t mean you have to give up on security.

Get started with Linux container security

Instead of running processes as root in containers, use Linux capabilities and drop every capability you do not need. Some capabilities are dangerous and should be avoided, such as CAP_SYS_ADMIN, which essentially grants root access. Setuid executables also pose a threat because they are often exploited to elevate privileges. If you do need such capabilities, you can still set the no-new-privileges security option, which prevents the container’s processes and their children from gaining additional privileges. Better yet, build container images that do not ship SUID binaries.
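
To make the capability advice concrete, here is a minimal Python sketch that checks a capability bitmask, as printed on the CapEff line of /proc/<pid>/status, for a handful of high-risk capabilities. The bit positions follow the kernel's capability numbering. The choice of which capabilities count as "risky" and the function names are assumptions made for this illustration, not an official list.

```python
# Bit positions follow the kernel's capability numbering (linux/capability.h).
# Which capabilities count as "risky" is a judgment call made for this sketch.
RISKY_CAPS = {
    2: "CAP_DAC_READ_SEARCH",
    12: "CAP_NET_ADMIN",
    16: "CAP_SYS_MODULE",
    17: "CAP_SYS_RAWIO",
    19: "CAP_SYS_PTRACE",
    21: "CAP_SYS_ADMIN",
}


def risky_capabilities(cap_mask_hex):
    """Return the risky capabilities set in a hex mask such as '00000000a80425fb'."""
    mask = int(cap_mask_hex, 16)
    return sorted(name for bit, name in RISKY_CAPS.items() if mask & (1 << bit))


def read_effective_mask(status_path="/proc/self/status"):
    """Read the CapEff mask of the current process (Linux only)."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff line not found in " + status_path)
```

Running risky_capabilities(read_effective_mask()) inside a container gives a quick indication of whether something like CAP_SYS_ADMIN slipped through. With Docker, capabilities are dropped and re-added with the --cap-drop and --cap-add options of docker run, and --security-opt no-new-privileges sets the option mentioned above.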

User namespaces

Linux containers are built on cgroups and namespaces, such as the mount, PID, and network namespaces. A more recent addition is the user namespace, which remaps a range of user IDs inside the namespace to a different range on the host. This means root in a Linux container (UID 0) can be mapped to an unprivileged, high-numbered user ID on the host. User namespaces also make it possible to create containers as a regular user, without root privileges.
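
The remapping can be sketched in a few lines of Python. The kernel exposes a process's mapping in /proc/<pid>/uid_map, where each line holds three numbers: the first ID inside the namespace, the first ID outside it, and the length of the range. The function names and the sample range starting at 100000 are assumptions chosen for illustration.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        inside, outside, length = (int(field) for field in line.split())
        entries.append((inside, outside, length))
    return entries


def to_host_id(container_id, id_map):
    """Translate an ID inside the namespace to the host ID, or None if it is unmapped."""
    for inside, outside, length in id_map:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None


# Hypothetical mapping: container IDs 0..65535 map to host IDs 100000..165535.
sample_map = parse_id_map("         0     100000      65536\n")
print(to_host_id(0, sample_map))     # container root on the host
print(to_host_id(1000, sample_map))  # an ordinary container user on the host
```

With this sample mapping, container root becomes host UID 100000, which has no special privileges on the host, and any ID outside the mapped range has no host equivalent at all.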

SELinux / AppArmor

Mandatory Access Control (MAC) systems such as SELinux or AppArmor can help lock down containers even further. With SELinux, for example, every container gets its own Multi-Category Security (MCS) label, so it can only access its own resources, plus read /usr and /etc on the host when they are bind-mounted into the container. For ease of use, Docker has the z (content shared between containers) and Z (content private to a single container) volume mount flags, so you don’t have to relabel files manually.
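
The effect of per-container labels can be illustrated with a deliberately simplified Python model. It assumes labels of the form user:role:type:sensitivity[:categories] with a plain comma-separated category list (no ranges), and it models only the category check, not SELinux type enforcement. The label values and function names are hypothetical examples, not output from a real system.

```python
from typing import FrozenSet, NamedTuple


class Label(NamedTuple):
    user: str
    role: str
    type: str
    sensitivity: str
    categories: FrozenSet[str]


def parse_label(text):
    """Parse 'user:role:type:s0[:c1,c2]' into a Label (category ranges unsupported)."""
    parts = text.split(":")
    if len(parts) not in (4, 5):
        raise ValueError("unexpected label format: " + text)
    categories = frozenset(parts[4].split(",")) if len(parts) == 5 and parts[4] else frozenset()
    return Label(parts[0], parts[1], parts[2], parts[3], categories)


def mcs_allows(process, target):
    """Simplified check: the process must hold every category set on the target."""
    return process.sensitivity == target.sensitivity and target.categories <= process.categories


container_a = parse_label("system_u:system_r:container_t:s0:c1,c2")
container_b = parse_label("system_u:system_r:container_t:s0:c3,c4")
private_file = parse_label("system_u:object_r:container_file_t:s0:c1,c2")  # like a Z mount
shared_file = parse_label("system_u:object_r:container_file_t:s0")         # like a z mount
```

In this model, container_a may access private_file while container_b may not, and both may access shared_file because it carries no categories. That mirrors the difference between the Z and z flags described above.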

Restricting syscalls

The Linux kernel has a feature called seccomp that restricts which syscalls a process can make, greatly reducing the kernel’s attack surface. However, determining every syscall a complex application needs can be a lengthy process. Docker ships with a sensible default profile that works for most containerized applications.
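
If the default profile is too permissive for your workload, a custom allowlist can be generated. The following Python sketch builds a profile in the JSON layout Docker accepts for seccomp profiles: a default action plus a list of rules. The function name and the tiny syscall list are assumptions for illustration; a real application needs far more syscalls than shown here.

```python
import json


def allowlist_profile(allowed_syscalls, architectures=("SCMP_ARCH_X86_64",)):
    """Build a deny-by-default seccomp profile that allows only the given syscalls."""
    return {
        # Any syscall not listed below fails with an error instead of executing.
        "defaultAction": "SCMP_ACT_ERRNO",
        "architectures": list(architectures),
        "syscalls": [
            {"names": sorted(set(allowed_syscalls)), "action": "SCMP_ACT_ALLOW"}
        ],
    }


# Illustrative only: this list is far too short to run a real program.
profile = allowlist_profile(["read", "write", "exit_group", "read"])
profile_json = json.dumps(profile, indent=2)
```

The resulting JSON can be saved to a file and passed to Docker with --security-opt seccomp=<path to the file>. Starting from Docker's default profile and removing syscalls is usually less painful than building an allowlist from scratch.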

Resource constraints

Setting resource limits on containers helps protect the host and other containers from misbehaving containers and Denial-of-Service attacks. You can fine-tune CPU, memory, and IO limits, as well as ulimits (such as the number of open files or processes), for every container. Furthermore, running containers with a read-only root filesystem ensures that only volumes can be written to, which also helps when you have to inspect a compromised container.
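
As a rough sketch of what this looks like in practice, the Python helper below assembles a docker run command line with memory, CPU, process-count, and open-file limits plus a read-only root filesystem. The helper name, the default values, and the image name are assumptions chosen for illustration; IO limits are left out to keep the example short.

```python
import shlex


def limited_run_args(image, memory="256m", cpus="0.5", pids_limit=100, nofile=1024):
    """Return a docker run argument list with resource limits and a read-only root."""
    return [
        "docker", "run", "--rm",
        "--memory", memory,                        # hard memory limit
        "--cpus", cpus,                            # fraction of CPU time available
        "--pids-limit", str(pids_limit),           # cap on the number of processes
        "--ulimit", f"nofile={nofile}:{nofile}",   # soft:hard limit on open files
        "--read-only",                             # root filesystem mounted read-only
        "--tmpfs", "/tmp",                         # writable scratch space in memory
        image,
    ]


# "example/app:1.0" is a placeholder image name.
args = limited_run_args("example/app:1.0")
print(shlex.join(args))
```

Building the argument list in code, rather than typing it by hand, makes it easier to apply the same limits consistently, for instance by passing the list to subprocess.run.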

Development pipeline

Container images should be handled like regular software packages. Only use images from a trusted source: would you install an RPM/DEB package you found on a random forum? Images should be signed and verified, and they should be automatically scanned for known vulnerabilities with tools such as Clair.

Forgetting about leftover artifacts in layered Docker images used to be a common mistake: a file deleted in a later layer is still present in the earlier layer that added it. Check out our Docker Build Secrets challenge to see this in action. With multi-stage builds and squashing (merging every layer into one), this is less of an issue today, but it is still important to bear in mind.
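
Why leftover artifacts survive can be shown with a toy model of image layers in Python. Each layer is a dictionary from path to content, and None stands in for a deletion marker. Real images store layers as tar archives and mark deletions with special whiteout files, so this model, its paths, and its function names are simplifying assumptions.

```python
def merged_view(layers):
    """Apply layers in order and return the filesystem a running container would see."""
    view = {}
    for layer in layers:
        for path, content in layer.items():
            if content is None:       # deletion marker: hide the path from the view
                view.pop(path, None)
            else:
                view[path] = content
    return view


def layers_containing(layers, path):
    """Return the indexes of layers that still carry actual content for a path."""
    return [index for index, layer in enumerate(layers) if layer.get(path) is not None]


layers = [
    {"/app/main": b"binary"},
    {"/root/.ssh/id_rsa": b"SECRET KEY"},  # secret copied in for a build step
    {"/root/.ssh/id_rsa": None},           # "deleted" in a later layer
]
```

Here merged_view(layers) no longer contains the key, yet layers_containing(layers, "/root/.ssh/id_rsa") still points at the second layer, and anyone who pulls the image can extract it from there. Squashing corresponds to shipping only the merged view.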


As you know, containers share the host’s kernel, but what if they didn’t have to? Unikernels are specialized, single-address-space machine images constructed by using library operating systems. With unikernels, applications can run like virtual machines without the overhead of a full general-purpose operating system. There are of course obstacles, and unikernels still haven’t gained widespread popularity. Maybe Red Hat’s very recent UKL project, a unikernel based on Linux, will change that.
