Dockerfile guidelines and recommendations

This document proposes guidelines and recommendations for developing new Dockerfiles. It covers recommended best practices and methods for building efficient and secure images.

Problem Description

When a new Docker image is needed for a project, it is created without any guidelines to follow. As a result, some best practices are missed, which can lead to performance or security issues.

Background

When a development team creates a new Docker image, the whole DevOps team is included in the pull request. Its members give recommendations based on their experience, but they have no written guideline to refer to.

Solution

This document covers the recommended best practices and methods for building efficient and secure images.

Create ephemeral containers

The image defined by your Dockerfile should generate containers that are as ephemeral as possible. By “ephemeral”, we mean that the container can be stopped and destroyed, then rebuilt and replaced with an absolute minimum set up and configuration.

Understand build context

When you issue a docker build command, the current working directory is called the build context. By default, the Dockerfile is assumed to be located here, but you can specify a different location with the file flag (-f). Regardless of where the Dockerfile lives, all recursive contents of files and directories in the current directory are sent to the Docker daemon as the build context. Inadvertently including files that are not necessary for building an image results in a larger build context and larger image size. This can increase the time to build the image, time to pull and push it, and the container runtime size. To see how big your build context is, look for a message like this when building your Dockerfile:

Sending build context to Docker daemon 187.8MB

Exclude with .dockerignore

To exclude files not relevant to the build (without restructuring your source repository), use a .dockerignore file. This file supports exclusion patterns similar to .gitignore files. For information on creating one, see the Docker documentation on the .dockerignore file.
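
As a minimal sketch, the following shell snippet creates a .dockerignore file in the root of the build context. The patterns listed are illustrative assumptions, not a required set; adapt them to the contents of your repository.

```shell
# Create a .dockerignore in the root of the build context.
# The patterns below are examples only; adapt them to your project.
cat > .dockerignore <<'EOF'
# Version control metadata
.git
# Locally installed dependencies and build output
node_modules
dist
# Logs and local environment files
*.log
.env
# Exclude Markdown files, but keep the README
*.md
!README.md
EOF
```

Lines starting with # are comments, and a leading ! re-includes a path that an earlier pattern excluded.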

Use multi-stage builds

Multi-stage builds allow you to drastically reduce the size of your final image, without struggling to reduce the number of intermediate layers and files.

Because an image is built during the final stage of the build process, you can minimize image layers by leveraging build cache.
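
The following sketch writes a two-stage Dockerfile from the shell. The Go program, the image tags, and the file name Dockerfile.multistage are assumptions chosen for illustration; the same pattern applies to any toolchain that produces a standalone artefact.

```shell
# Write a two-stage Dockerfile. The application, tags and paths are illustrative.
cat > Dockerfile.multistage <<'EOF'
# Stage 1: build the binary with the full toolchain
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: copy only the compiled artefact into a small runtime image
FROM alpine:3.19
COPY --from=build /out/app /usr/local/bin/app
USER nobody
CMD ["app"]
EOF
# Build with: docker build -f Dockerfile.multistage -t example/app .
```

Only the second stage ends up in the final image, so the compiler and the source tree never ship with it.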

Don’t install unnecessary packages

To reduce complexity, dependencies, file sizes, and build times, avoid installing extra or unnecessary packages just because they might be “nice to have.” For example, you don’t need to include a text editor in a database image.

Base images in Ebury ECR

To avoid depending on the availability of third parties, and to allow periodic scanning of the base images behind our Docker images, base images must be pushed to our ECR rather than pulled from Docker Hub or any other registry. Allowed base images are Ubuntu and Alpine. They must use a fixed version that is still under maintenance.
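
A sketch of how a Dockerfile might reference a pinned base image in a private registry. The build argument BASE_REGISTRY and the repository path base/ubuntu are hypothetical, because this document does not define the actual ECR layout; only the general ECR host name format is standard.

```shell
# Write a Dockerfile whose base image comes from a private registry.
# BASE_REGISTRY and the "base/ubuntu" repository path are hypothetical names.
cat > Dockerfile.base <<'EOF'
# An ARG declared before FROM can be used in the FROM line
ARG BASE_REGISTRY
# Fixed version tag, never "latest"
FROM ${BASE_REGISTRY}/base/ubuntu:22.04
EOF
# Build with (the registry host is a placeholder):
#   docker build -f Dockerfile.base \
#     --build-arg BASE_REGISTRY=<account-id>.dkr.ecr.<region>.amazonaws.com \
#     -t example/app .
```

Passing the registry as a build argument keeps the account-specific host name out of the Dockerfile itself; hard-coding it in the FROM line is equally valid.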

Decouple applications

Each container should have only one concern. Decoupling applications into multiple containers makes it easier to scale horizontally and reuse containers. For instance, a web application stack might consist of three separate containers, each with its own unique image, to manage the web application, database, and an in-memory cache in a decoupled manner.

Minimize the number of layers

In older versions of Docker, it was important that you minimized the number of layers in your images to ensure they were performant. The following features were added to reduce this limitation:

  • Only the instructions RUN, COPY, ADD create layers. Other instructions create temporary intermediate images and do not increase the size of the build.

  • Where possible, use multi-stage builds, and only copy the artefacts you need into the final image. This allows you to include tools and debug information in your intermediate build stages without increasing the size of the final image.

Sort multi-line arguments

Whenever possible, ease later changes by sorting multi-line arguments alphanumerically. This helps to avoid duplicate packages and makes the list much easier to update. It also makes PRs a lot easier to read and review. Adding a space before a backslash (\) helps as well.

Here’s an example from the buildpack-deps image:

RUN apt-get update && apt-get install -y \
  bzr \
  cvs \
  git \
  mercurial \
  subversion

Leverage build cache

When building an image, Docker steps through the instructions in your Dockerfile, executing each in the order specified. As each instruction is examined, Docker looks for an existing image in its cache that it can reuse, rather than creating a new (duplicate) image.

If you do not want to use the cache at all, you can use the --no-cache=true option on the docker build command. However, if you do let Docker use its cache, it is important to understand when it can, and cannot, find a matching image. The basic rules that Docker follows are outlined below:

  • Starting with a parent image that is already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the same instruction. If not, the cache is invalidated.

  • In most cases, simply comparing the instruction in the Dockerfile with one of the child images is sufficient. However, certain instructions require more examination and explanation.

  • For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. The last-modified and last-accessed times of the file(s) are not considered in these checksums. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.

  • Aside from the ADD and COPY commands, cache checking does not look at the files in the container to determine a cache match. For example, when processing a RUN apt-get -y update command the files updated in the container are not examined to determine if a cache hit exists. In that case, just the command string itself is used to find a match.

Once the cache is invalidated, all subsequent Dockerfile commands generate new images and the cache is not used.

Allow permissions only on executables which need them

Remove setuid and setgid permissions from the executables in your images, and keep them only on executables that need them. You can remove these permissions at build time by adding the following command to your Dockerfile, preferably towards the end. Current versions of GNU find reject the older -perm +6000 syntax, and the trailing || true would hide that error, so use -perm /6000:

RUN find / -perm /6000 -type f -exec chmod a-s {} \; || true
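
The following shell sketch exercises the same clean-up on a scratch directory rather than on /, so its effect can be checked outside a build. It uses -perm /6000, the form that current GNU find accepts; the directory and file names are arbitrary.

```shell
# Demonstrate the clean-up on a scratch directory instead of /.
workdir=$(mktemp -d)
touch "$workdir/tool"
chmod 6755 "$workdir/tool"    # set the setuid and setgid bits

# Same command as in the Dockerfile, scoped to the scratch directory.
find "$workdir" -perm /6000 -type f -exec chmod a-s {} \; || true

# Count the files that still carry a setuid or setgid bit.
remaining=$(find "$workdir" -perm /6000 -type f | wc -l | tr -d ' ')
echo "files with setuid/setgid left: $remaining"

rm -rf "$workdir"
```

Running the second find without -exec is also a quick way to list which executables in an image carry these bits before deciding which ones need them.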

Drift Prevention

Prevent containers from running executables that were not part of the original image.

Ensure secrets are not stored in Dockerfiles

Secrets must be passed to containers as environment variables, never written into the Dockerfile. Store the secrets used by Docker containers in AWS Secrets Manager or, preferably, HashiCorp Vault. If a secret is needed at build time, fetch it in an earlier stage of a multi-stage build so that it does not end up in the final image.
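
A minimal sketch of passing a secret at run time. The helper name run_with_secret, the variable DB_PASSWORD, and the image name are hypothetical; the only Docker behaviour relied on is that --env with a bare variable name forwards that variable from the calling environment.

```shell
# Hypothetical helper: start a container with a secret taken from the caller's environment.
run_with_secret() {
  # "--env DB_PASSWORD" (no value) forwards the variable from the calling shell,
  # so the secret value is not written in the Dockerfile, the image, or this script.
  docker run --rm --env DB_PASSWORD "$@"
}

# Usage, after exporting DB_PASSWORD from your secret store:
#   export DB_PASSWORD
#   run_with_secret example/app
```

How DB_PASSWORD gets populated (Vault, Secrets Manager, or a CI secret) is left to the caller.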

Ensure only verified packages are installed

Use GPG keys for downloading and verifying packages or any other secure package distribution mechanism of your choice.
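
Where a publisher provides checksums rather than GPG signatures, pinning the expected SHA-256 digest is one possible alternative mechanism. The sketch below shows that check, not GPG verification; the helper name verify_sha256 is hypothetical.

```shell
# Hypothetical helper: succeed only if the file matches the expected SHA-256 digest.
verify_sha256() {
  file=$1
  expected=$2
  # sha256sum check format: "<digest>  <file>" (two spaces between the fields)
  printf '%s  %s\n' "$expected" "$file" | sha256sum -c - >/dev/null 2>&1
}

# Usage in a build script; the artefact name and digest are placeholders:
#   verify_sha256 package.tar.gz "<expected digest>" || exit 1
```

The same printf and sha256sum -c pipeline can be placed in a RUN instruction directly after the download step, so that a mismatch fails the build.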

Ensure privileged containers are not used

Do not run a container with the --privileged flag. For example, do not start a container as below:

docker run --interactive --tty --privileged ubuntu:18.04 /bin/bash

Ensure sensitive host system directories are not mounted on containers

Do not mount sensitive host directories in containers, especially in read-write mode. The following directories are sensitive:

/
/boot
/dev
/etc
/lib
/proc
/sys
/usr
/var/lib/docker

Sensitive data

Do not include sensitive data in Docker images, such as private RSA keys.

Ensure ssh is not run within containers

Remove the SSH server from the container image and use docker exec, docker attach, or nsenter to interact with the running container instead.

docker exec --interactive --tty $INSTANCE_ID sh
OR
docker attach $INSTANCE_ID

Ensure privileged ports are not mapped within containers

Privileged ports are the TCP/IP port numbers below 1024; normal users are not allowed to run servers on them. Do not map container ports to privileged host ports when starting a container, and make sure the Dockerfile does not declare any such container-to-host privileged port mapping.

Ensure only needed ports are open on the container

Write the Dockerfile of the container image so that it exposes only the ports your containerized application needs.
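
A sketch of a Dockerfile that declares a single, unprivileged port. The application binary app and port 8080 are assumptions for illustration. Note that EXPOSE only documents the port (and is used by docker run -P); the port is actually published at run time.

```shell
# Write a Dockerfile that declares only the one port the service needs.
cat > Dockerfile.ports <<'EOF'
FROM alpine:3.19
# Hypothetical application binary that listens on an unprivileged port
COPY app /usr/local/bin/app
# Declare only the single port the application needs
EXPOSE 8080
USER nobody
CMD ["app"]
EOF
# Publish it on an unprivileged host port as well:
#   docker run --detach --publish 8080:8080 example/app
```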

Ensure memory usage for the container is limited

Run the container with only as much memory as required.

Ensure CPU priority is set appropriately on the container

Manage the CPU shares between your containers.
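
As a sketch, a small wrapper can apply both the memory limit from the previous section and a CPU weight to every container it starts. The function name run_limited and the values 256m and 512 are illustrative assumptions; choose limits from measurements of the real workload.

```shell
# Hypothetical wrapper that applies resource limits to every container it starts.
run_limited() {
  # --memory: hard memory limit for the container
  # --cpu-shares: relative CPU weight (Docker's default is 1024)
  docker run --detach --memory 256m --cpu-shares 512 "$@"
}

# Usage: run_limited nginx
```

CPU shares are relative: a container with 512 shares gets roughly half the CPU time of one with 1024 when both compete for the same cores, and is not throttled when the host is idle.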

Ensure 'on-failure' container restart policy is set to a limit

If a container should restart on its own, you could start it as shown below, for example:

docker run --detach --restart=on-failure:5 nginx

The on-failure policy tells Docker to restart a container if the exit code indicates an error, but not if the exit code indicates success. You can specify the maximum number of times Docker automatically restarts the container; in this example, 5 times.

Ensure the container is restricted from acquiring additional privileges

You should start your container as below:

docker run --rm -it --security-opt=no-new-privileges ubuntu bash

Ensure the Docker socket is not mounted inside any containers

Ensure that no containers mount docker.sock as a volume.

-v /var/run/docker.sock:/var/run/docker.sock

Ensure SELinux security options are set, if applicable

If SELinux is applicable for your Linux OS, use it.

Ensure AppArmor Profile is enabled, if applicable

If AppArmor is applicable for your Linux OS, use it.

Prevent Override Default Configurations

Do not override these default configurations when containers are running:

  • Running without default seccomp profile (seccomp=unconfined)
  • Running without apparmor security profile (apparmor=unconfined)
  • Disabling SELinux separation (label=disable)

Use --init flag

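The --init flag inserts a tiny init process into the container as the main process (PID 1). It forwards signals to the application and reaps zombie processes, so orphaned child processes do not accumulate in the process table. It does not limit the number of processes, so it is not a defence against fork bombs on its own.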

For example:

docker run --rm -ti --init --user 501:20 ubuntu bash 
I have no name!@faac6df0dee0:/$ ps -aef --forest
UID        PID  PPID  C STIME TTY          TIME CMD
501          1     0  0 10:41 pts/0    00:00:00 /sbin/docker-init -- bash
501          6     1  0 10:41 pts/0    00:00:00 bash
501          9     6  0 10:41 pts/0    00:00:00  \_ ps -aef --forest

Dockerfile instructions

These recommendations are designed to help you create an efficient and maintainable Dockerfile.

FROM

Dockerfile reference for the FROM instruction

Whenever possible, use current official images as the basis for your images.

ARG

Dockerfile reference for the ARG instruction

Impact on build caching

ARG variables are not persisted into the built image as ENV variables are. However, ARG variables do impact the build cache in similar ways. If a Dockerfile defines an ARG variable whose value is different from a previous build, then a “cache miss” occurs upon its first usage, not its definition. In particular, all RUN instructions following an ARG instruction use the ARG variable implicitly (as an environment variable), thus can cause a cache miss. All predefined ARG variables are exempt from caching unless there is a matching ARG statement in the Dockerfile.

For example, consider these two Dockerfiles:

FROM ubuntu
ARG CONT_IMG_VER
RUN echo $CONT_IMG_VER

FROM ubuntu
ARG CONT_IMG_VER
RUN echo hello

If you specify --build-arg CONT_IMG_VER=<value> on the command line, in both cases, the specification on line 2 does not cause a cache miss; line 3 does cause a cache miss.

ARG CONT_IMG_VER causes the RUN line to be identified as the same as running CONT_IMG_VER=<value> echo hello, so if the <value> changes, we get a cache miss.

Consider another example under the same command line:

FROM ubuntu
ARG CONT_IMG_VER
ENV CONT_IMG_VER $CONT_IMG_VER
RUN echo $CONT_IMG_VER

In this example, the cache miss occurs on line 3. The miss happens because the variable’s value in the ENV references the ARG variable and that variable is changed through the command line. In this example, the ENV command causes the image to include the value.

If an ENV instruction overrides an ARG instruction of the same name, like this Dockerfile:

FROM ubuntu
ARG CONT_IMG_VER
ENV CONT_IMG_VER hello
RUN echo $CONT_IMG_VER

Line 3 does not cause a cache miss because the value of CONT_IMG_VER is a constant (hello). As a result, the environment variables and values used by the RUN instruction (line 4) don’t change between builds.

RUN

Dockerfile reference for the RUN instruction

  • Split long or complex RUN statements on multiple lines separated with backslashes to make your Dockerfile more readable, understandable, and maintainable.

  • Avoid RUN apt-get upgrade and dist-upgrade. If a package contained in the parent image is out-of-date, contact its maintainers. If you know there is a particular package, foo, that needs to be updated, use apt-get install -y foo to update automatically.

  • Always combine RUN apt-get update with apt-get install in the same RUN statement. For example:

RUN apt-get update && apt-get install -y \
    package-bar \
    package-baz \
    package-foo

Using apt-get update alone in a RUN statement causes caching issues, and subsequent apt-get install instructions fail. For example, say you have a Dockerfile:

FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y curl

After building the image, all layers are in the Docker cache. Suppose you later modify apt-get install by adding an extra package:

FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y curl nginx

Docker sees the initial and modified instructions as identical and reuses the cache from previous steps. As a result the apt-get update is not executed because the build uses the cached version. Because the apt-get update is not run, your build can potentially get an outdated version of the curl and nginx packages.

Using RUN apt-get update && apt-get install -y ensures your Dockerfile installs the latest package versions with no further coding or manual intervention. This technique is known as "cache busting". You can also achieve cache-busting by specifying a package version. This is known as version pinning, for example:

RUN apt-get update && apt-get install -y \
    package-bar \
    package-baz \
    package-foo=1.3.*

Version pinning forces the build to retrieve a particular version regardless of what’s in the cache. This technique can also reduce failures due to unanticipated changes in required packages.

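Below is a well-formed RUN instruction that demonstrates all the apt-get recommendations. The final rm -rf /var/lib/apt/lists/* removes the apt package lists in the same layer, so they are not stored in the image and the image size is reduced.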

RUN apt-get update && apt-get install -y \
    aufs-tools \
    automake \
    build-essential \
    curl \
    dpkg-sig \
    libcap-dev \
    libsqlite3-dev \
    mercurial \
    reprepro \
    ruby1.9.1 \
    ruby1.9.1-dev \
    s3cmd=1.1.* \
 && rm -rf /var/lib/apt/lists/*

CMD

Dockerfile reference for the CMD instruction

The CMD instruction should be used to run the software contained by your image, along with any arguments. CMD should almost always be used in the form of CMD ["executable", "param1", "param2"…]. Thus, if the image is for a service, such as Apache or Rails, you would run something like CMD ["apache2","-DFOREGROUND"]. Indeed, this form of the instruction is recommended for any service-based image.

In most other cases, CMD should be given an interactive shell, such as bash, python and perl. For example, CMD ["perl", "-de0"], CMD ["python"], or CMD ["php", "-a"]. Using this form means that when you execute something like docker run -it python, you’ll get dropped into a usable shell, ready to go. CMD should rarely be used in the manner of CMD ["param", "param"] in conjunction with ENTRYPOINT, unless you and your expected users are already quite familiar with how ENTRYPOINT works.

ENV

Dockerfile reference for the ENV instruction

To make new software easier to run, you can use ENV to update the PATH environment variable for the software your container installs. For example, ENV PATH /usr/local/nginx/bin:$PATH ensures that CMD ["nginx"] just works.

The ENV instruction is also useful for providing required environment variables specific to services you wish to containerize, such as Postgres’s PGDATA.

Lastly, ENV can also be used to set commonly used version numbers so that version bumps are easier to maintain, as seen in the following example:

ENV PG_MAJOR 9.3
ENV PG_VERSION 9.3.4
RUN curl -SL http://example.com/postgres-$PG_VERSION.tar.xz | tar -xJC /usr/src/postgres && …
ENV PATH /usr/local/postgres-$PG_MAJOR/bin:$PATH

Similar to having constant variables in a program (as opposed to hard-coding values), this approach lets you change a single ENV instruction to automatically bump the version of the software in your container.

Each ENV line creates a new intermediate layer, just like RUN commands. This means that even if you unset the environment variable in a future layer, it still persists in this layer and its value can still be dumped.

ADD or COPY

Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. That’s because it’s more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Consequently, the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.

If you have multiple Dockerfile steps that use different files from your context, COPY them individually, rather than all at once. This ensures that each step's build cache is only invalidated (forcing the step to be re-run) if the specifically required files change.
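
A sketch of this ordering for a hypothetical Python application. The base image tag, requirements.txt, and main.py are illustrative assumptions; the point is that the dependency manifest is copied and installed before the rest of the source.

```shell
# Write a Dockerfile that copies files in order of how often they change.
cat > Dockerfile.copy <<'EOF'
FROM python:3.12-alpine
WORKDIR /srv/app
# Copy only the dependency manifest first: this layer and the pip install
# below stay cached until requirements.txt itself changes
COPY requirements.txt ./
RUN pip install --no-cache-dir --requirement requirements.txt
# Source changes invalidate the cache only from this point on
COPY . .
CMD ["python", "main.py"]
EOF
# Build with: docker build -f Dockerfile.copy -t example/app .
```

With a single COPY . . at the top, any source edit would force the dependency installation to run again.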

USER

Dockerfile reference for the USER instruction

If a service can run without privileges, use USER to change to a non-root user.

This is an example of how to create a non-root user in your Dockerfile:

ENV APP_USER user
ENV APP_GROUP user

WORKDIR /srv/app

RUN addgroup --system ${APP_GROUP} --gid 1000 && \
    adduser --system ${APP_USER} \
    --ingroup ${APP_GROUP} \
    --shell /usr/sbin/nologin \
    --uid 1000 \
    --home /srv/app

USER ${APP_USER}

WORKDIR

Dockerfile reference for the WORKDIR instruction

For clarity and reliability, you should always use absolute paths for your WORKDIR. Also, you should use WORKDIR instead of proliferating instructions like RUN cd … && do-something, which are hard to read, troubleshoot, and maintain.
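
A short sketch of WORKDIR with an absolute path. The path /srv/app matches the USER example above; the base image tag and the ls command are placeholders for real build steps.

```shell
# Write a Dockerfile that sets the working directory once, with an absolute path.
cat > Dockerfile.workdir <<'EOF'
FROM alpine:3.19
# Absolute path; WORKDIR creates the directory if it does not exist
WORKDIR /srv/app
# Relative destinations now resolve against /srv/app
COPY . .
# Runs in /srv/app, so no "cd /srv/app && ..." is needed
RUN ls -la
EOF
# Build with: docker build -f Dockerfile.workdir -t example/app .
```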

Operation

The Dockerfile developer must follow this guideline. Tools that check compliance can be included in the continuous integration workflow:

  • Aquasec Scanner images will be included in the CI workflow to check Docker images for security issues.

Security Impact

This guide aims to reduce security risks when Docker images are developed.

Performance Impact

This guide aims to improve the performance of Docker images when they are built.

Developer Impact

This guideline has to be followed by all development teams.

References