Dockerfile guidelines and recommendations
This document proposes guidelines and recommendations to follow when a new Dockerfile is developed. It covers the recommended best practices and methods for building efficient and secure images.
Problem Description
When a new Docker image is needed for a project, it is created without any guidelines to follow. As a result, some best practices are not followed, which can lead to performance or security issues.
Background
When a development team creates a new Docker image, the whole DevOps team is included in the pull request. Its members give recommendations based on their experience, but there is no written guideline.
Solution
This document covers the recommended best practices and methods for building efficient and secure images.
Create ephemeral containers
The image defined by your Dockerfile should generate containers that are as ephemeral as possible. By “ephemeral”, we mean that the container can be stopped and destroyed, then rebuilt and replaced with an absolute minimum set up and configuration.
Understand build context
When you issue a docker build command, the current working directory is called the build context.
By default, the Dockerfile is assumed to be located here, but you can specify a different location with the file flag (-f).
Regardless of where the Dockerfile lives, all recursive contents of files and directories in the current directory are sent to the Docker daemon as the build context.
Inadvertently including files that are not necessary for building an image results in a larger build context and larger image size.
This can increase the time to build the image, time to pull and push it, and the container runtime size.
To see how big your build context is, look for a message like this when building your Dockerfile:
Sending build context to Docker daemon 187.8MB
Exclude with .dockerignore
To exclude files not relevant to the build (without restructuring your source repository) use a .dockerignore file.
This file supports exclusion patterns similar to .gitignore files. For information on creating one, see the .dockerignore file.
Use multi-stage builds
Multi-stage builds allow you to drastically reduce the size of your final image, without struggling to reduce the number of intermediate layers and files.
Because an image is built during the final stage of the build process, you can minimize image layers by leveraging build cache.
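A minimal sketch of a multi-stage build, assuming a hypothetical C source file hello.c in the build context. The alpine:3.19 tag is for illustration only; under the base image rule in this guideline, the base would be pulled from the internal registry.

```dockerfile
# Build stage: contains the compiler toolchain and the sources.
FROM alpine:3.19 AS build
RUN apk add --no-cache build-base
WORKDIR /src
# hello.c is a hypothetical source file in the build context.
COPY hello.c .
# Link statically so the binary does not depend on build-stage libraries.
RUN gcc -static -O2 -o hello hello.c

# Final stage: only the compiled binary is copied in.
FROM alpine:3.19
COPY --from=build /src/hello /usr/local/bin/hello
CMD ["hello"]
```

The compiler and sources stay in the build stage, so the final image contains only the base image and the binary.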
Don’t install unnecessary packages
To reduce complexity, dependencies, file sizes, and build times, avoid installing extra or unnecessary packages just because they might be “nice to have.” For example, you don’t need to include a text editor in a database image.
Base images in Ebury ECR
To avoid depending on the availability of third parties, and to allow base images to be scanned periodically, base images must be pushed to our ECR instead of being pulled from Docker Hub or any other repository. Allowed base images are Ubuntu and Alpine. They must be pinned to a fixed version that is still under maintenance.
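One possible way to keep the registry location out of the Dockerfile is to pass it as a build argument. In this sketch, BASE_REGISTRY is a hypothetical argument name, and the repository name and tag are assumptions that depend on how the images are published in ECR.

```dockerfile
# BASE_REGISTRY is a hypothetical build argument holding the internal
# registry host; it has no default, so the build fails if it is not set.
ARG BASE_REGISTRY

# Pin the base image to a fixed, maintained version.
FROM ${BASE_REGISTRY}/ubuntu:22.04
```

The value would be supplied at build time with docker build --build-arg BASE_REGISTRY=<registry host>.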
Decouple applications
Each container should have only one concern. Decoupling applications into multiple containers makes it easier to scale horizontally and reuse containers. For instance, a web application stack might consist of three separate containers, each with its own unique image, to manage the web application, database, and an in-memory cache in a decoupled manner.
Minimize the number of layers
In older versions of Docker, it was important that you minimized the number of layers in your images to ensure they were performant. The following features were added to reduce this limitation:
- Only the instructions RUN, COPY, and ADD create layers. Other instructions create temporary intermediate images and do not increase the size of the build.
- Where possible, use multi-stage builds, and only copy the artefacts you need into the final image. This allows you to include tools and debug information in your intermediate build stages without increasing the size of the final image.
Sort multi-line arguments
Whenever possible, ease later changes by sorting multi-line arguments alphanumerically. This helps to avoid duplication of packages and makes the list much easier to update. It also makes PRs a lot easier to read and review. Adding a space before a backslash (\) helps as well.
Here’s an example from the buildpack-deps image:
RUN apt-get update && apt-get install -y \
    bzr \
    cvs \
    git \
    mercurial \
    subversion
Leverage build cache
When building an image, Docker steps through the instructions in your Dockerfile, executing each in the order specified. As each instruction is examined, Docker looks for an existing image in its cache that it can reuse, rather than creating a new (duplicate) image.
If you do not want to use the cache at all, you can use the --no-cache=true option on the docker build command.
However, if you do let Docker use its cache, it is important to understand when it can, and cannot, find a matching image.
The basic rules that Docker follows are outlined below:
- Starting with a parent image that is already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the same instruction. If not, the cache is invalidated.
- In most cases, simply comparing the instruction in the Dockerfile with one of the child images is sufficient. However, certain instructions require more examination and explanation.
- For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. The last-modified and last-accessed times of the file(s) are not considered in these checksums. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.
- Aside from the ADD and COPY commands, cache checking does not look at the files in the container to determine a cache match. For example, when processing a RUN apt-get -y update command the files updated in the container are not examined to determine if a cache hit exists. In that case, just the command string itself is used to find a match.
Once the cache is invalidated, all subsequent Dockerfile commands generate new images and the cache is not used.
Allow permissions only on executables which need them
Remove setuid and setgid permissions from the files in your images, and allow them only on executables that need them.
You can remove these permissions at build time by adding the following command to your Dockerfile, preferably towards the end. The /6000 form matches files with either bit set; current versions of GNU find no longer accept the older +6000 syntax:
RUN find / -perm /6000 -type f -exec chmod a-s {} \; || true
Drift Prevention
Prevent containers from running executables that were not in the original image.
Ensure secrets are not stored in Dockerfiles
Pass secrets to containers as environment variables. Use AWS Secrets Manager or, preferably, HashiCorp Vault to store secrets that will be used in Docker containers. If a secret has to be preloaded at build time, fetch it in an intermediate stage of a multi-stage build so that it is not stored in the final image.
Ensure only verified packages are installed
Use GPG keys for downloading and verifying packages or any other secure package distribution mechanism of your choice.
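As one example of such a mechanism, a download can be checked against a published SHA-256 checksum before it is unpacked. This sketch swaps a checksum in for GPG verification; TOOL_URL, TOOL_SHA256, and the /opt/tool path are hypothetical names used only for illustration.

```dockerfile
FROM ubuntu:22.04

# Hypothetical values: replace with the real download URL and the
# checksum published by the vendor.
ARG TOOL_URL=https://example.com/tool.tar.gz
ARG TOOL_SHA256

RUN apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
    && rm -rf /var/lib/apt/lists/* \
    && curl -fsSL -o /tmp/tool.tar.gz "${TOOL_URL}" \
    # sha256sum exits non-zero on a mismatch, which aborts the build.
    && echo "${TOOL_SHA256}  /tmp/tool.tar.gz" | sha256sum -c - \
    && mkdir -p /opt/tool \
    && tar -xzf /tmp/tool.tar.gz -C /opt/tool \
    && rm /tmp/tool.tar.gz
```

If TOOL_SHA256 is not supplied, the checksum step fails and the build stops, so an unverified archive is never unpacked.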
Ensure privileged containers are not used
Do not run a container with the --privileged flag. For example, do not start a container as below:
docker run --interactive --tty --privileged ubuntu:18.04 /bin/bash
Ensure sensitive host system directories are not mounted on containers
Do not mount sensitive host directories on containers, especially in read-write mode. The following directories are sensitive:
/
/boot
/dev
/etc
/lib
/proc
/sys
/usr
/var/lib/docker
Sensitive data
Do not include sensitive data in Docker images, such as private RSA keys.
Ensure SSH is not run within containers
Uninstall the SSH server from the container and use nsenter, docker exec, or docker attach to interact with the container instance.
docker exec --interactive --tty $INSTANCE_ID sh
OR
docker attach $INSTANCE_ID
Ensure privileged ports are not mapped within containers
Privileged ports: TCP/IP port numbers below 1024 are special in that normal users are not allowed to run servers on them. Do not map container ports to privileged host ports when starting a container. Also ensure that the Dockerfile does not declare any mapping from container ports to privileged host ports.
Ensure only needed ports are open on the container
Write the Dockerfile of the container image so that it exposes only the ports your containerized application needs.
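A minimal sketch of the two port recommendations, assuming a hypothetical application that listens on a single non-privileged port (8080 is an arbitrary choice):

```dockerfile
FROM alpine:3.19

# Declare only the port the application actually listens on, and keep it
# above 1023 so that no privileged port is involved.
EXPOSE 8080
```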
Ensure memory usage for the container is limited
Run the container with only as much memory as required, for example by setting a limit with the --memory flag of docker run.
Ensure CPU priority is set appropriately on the container
Manage the CPU shares between your containers, for example with the --cpu-shares flag of docker run.
Ensure 'on-failure' container restart policy is set to a limit
If a container should restart on its own, you could start it as shown below:
docker run --detach --restart=on-failure:5 nginx
The on-failure policy tells Docker to restart a container if its exit code indicates an error, but not if it indicates success.
You can also specify the maximum number of times Docker will automatically restart the container; in this example, 5 times.
Ensure the container is restricted from acquiring additional privileges
You should start your container as below:
docker run --rm -it --security-opt=no-new-privileges ubuntu bash
Ensure the Docker socket is not mounted inside any containers
Ensure that no container mounts docker.sock as a volume. For example, do not pass the following option to docker run:
-v /var/run/docker.sock:/var/run/docker.sock
Ensure SELinux security options are set, if applicable
If SELinux is applicable for your Linux OS, use it.
Ensure AppArmor Profile is enabled, if applicable
If AppArmor is applicable for your Linux OS, use it.
Prevent overriding default configurations
Do not override these default configurations when containers are running:
- Running without default seccomp profile (seccomp=unconfined)
- Running without apparmor security profile (apparmor=unconfined)
- Disabling SELinux separation (label=disable)
Use --init flag
The --init flag inserts a tiny init process into the container as the main process (PID 1),
which reaps orphaned processes and cleans up when the container exits.
In addition, it prevents zombie processes from accumulating inside the container.
For example:
docker run --rm -ti --init --user 501:20 ubuntu bash
I have no name!@faac6df0dee0:/$ ps -aef --forest
UID PID PPID C STIME TTY TIME CMD
501 1 0 0 10:41 pts/0 00:00:00 /sbin/docker-init -- bash
501 6 1 0 10:41 pts/0 00:00:00 bash
501 9 6 0 10:41 pts/0 00:00:00 \_ ps -aef --forest
Dockerfile instructions
These recommendations are designed to help you create an efficient and maintainable Dockerfile.
FROM
Dockerfile reference for the FROM instruction
Whenever possible, use current official images as the basis for your images.
ARG
Dockerfile reference for the ARG instruction
Impact on build caching
ARG variables are not persisted into the built image as ENV variables are.
However, ARG variables do impact the build cache in similar ways.
If a Dockerfile defines an ARG variable whose value is different from a previous build, then a “cache miss” occurs upon its first usage,
not its definition.
In particular, all RUN instructions following an ARG instruction use the ARG variable implicitly (as an environment variable),
thus can cause a cache miss.
All predefined ARG variables are exempt from caching unless there is a matching ARG statement in the Dockerfile.
For example, consider these two Dockerfiles:
FROM ubuntu
ARG CONT_IMG_VER
RUN echo $CONT_IMG_VER

FROM ubuntu
ARG CONT_IMG_VER
RUN echo hello
If you specify --build-arg CONT_IMG_VER=<value> on the command line, in both cases,
the specification on line 2 does not cause a cache miss; line 3 does cause a cache miss.
ARG CONT_IMG_VER causes the RUN line to be identified as the same as running CONT_IMG_VER=<value> echo hello, so if the <value> changes, we get a cache miss.
Consider another example under the same command line:
FROM ubuntu
ARG CONT_IMG_VER
ENV CONT_IMG_VER $CONT_IMG_VER
RUN echo $CONT_IMG_VER
In this example, the cache miss occurs on line 3.
The miss happens because the variable’s value in the ENV references the ARG variable and that variable is changed through the command line.
In this example, the ENV command causes the image to include the value.
If an ENV instruction overrides an ARG instruction of the same name, like this Dockerfile:
FROM ubuntu
ARG CONT_IMG_VER
ENV CONT_IMG_VER hello
RUN echo $CONT_IMG_VER
Line 3 does not cause a cache miss because the value of CONT_IMG_VER is a constant (hello).
As a result, the environment variables and values used on the RUN (line 4) don’t change between builds.
RUN
Dockerfile reference for the RUN instruction
- Split long or complex RUN statements on multiple lines separated with backslashes to make your Dockerfile more readable, understandable, and maintainable.
- Avoid RUN apt-get upgrade and dist-upgrade. If a package contained in the parent image is out-of-date, contact its maintainers. If you know there is a particular package, foo, that needs to be updated, use apt-get install -y foo to update automatically.
- Always combine RUN apt-get update with apt-get install in the same RUN statement. For example:
RUN apt-get update && apt-get install -y \
    package-bar \
    package-baz \
    package-foo
Using apt-get update alone in a RUN statement causes caching issues and subsequent apt-get install instructions fail. For example, say you have a Dockerfile:
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y curl
After building the image, all layers are in the Docker cache. Suppose you later
modify apt-get install by adding an extra package:
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y curl nginx
Docker sees the initial and modified instructions as identical and reuses the
cache from previous steps. As a result the apt-get update is not executed
because the build uses the cached version. Because the apt-get update is not
run, your build can potentially get an outdated version of the curl and
nginx packages.
Using RUN apt-get update && apt-get install -y ensures your Dockerfile
installs the latest package versions with no further coding or manual
intervention. This technique is known as "cache busting". You can also achieve
cache-busting by specifying a package version. This is known as version pinning,
for example:
RUN apt-get update && apt-get install -y \
    package-bar \
    package-baz \
    package-foo=1.3.*
Version pinning forces the build to retrieve a particular version regardless of what’s in the cache. This technique can also reduce failures due to unanticipated changes in required packages.
Below is a well-formed RUN instruction that demonstrates all the apt-get
recommendations.
RUN apt-get update && apt-get install -y \
    aufs-tools \
    automake \
    build-essential \
    curl \
    dpkg-sig \
    libcap-dev \
    libsqlite3-dev \
    mercurial \
    reprepro \
    ruby1.9.1 \
    ruby1.9.1-dev \
    s3cmd=1.1.* \
 && rm -rf /var/lib/apt/lists/*
CMD
Dockerfile reference for the CMD instruction
The CMD instruction should be used to run the software contained by your
image, along with any arguments. CMD should almost always be used in the form
of CMD ["executable", "param1", "param2"…]. Thus, if the image is for a
service, such as Apache and Rails, you would run something like CMD
["apache2","-DFOREGROUND"]. Indeed, this form of the instruction is recommended
for any service-based image.
In most other cases, CMD should be given an interactive shell, such as bash,
python and perl. For example, CMD ["perl", "-de0"], CMD ["python"], or CMD
["php", "-a"]. Using this form means that when you execute something like
docker run -it python, you’ll get dropped into a usable shell, ready to go.
CMD should rarely be used in the manner of CMD ["param", "param"] in
conjunction with ENTRYPOINT, unless
you and your expected users are already quite familiar with how ENTRYPOINT
works.
ENV
Dockerfile reference for the ENV instruction
To make new software easier to run, you can use ENV to update the
PATH environment variable for the software your container installs. For
example, ENV PATH /usr/local/nginx/bin:$PATH ensures that CMD ["nginx"]
just works.
The ENV instruction is also useful for providing required environment
variables specific to services you wish to containerize, such as Postgres’s
PGDATA.
Lastly, ENV can also be used to set commonly used version numbers so that
version bumps are easier to maintain, as seen in the following example:
ENV PG_MAJOR 9.3
ENV PG_VERSION 9.3.4
RUN curl -SL http://example.com/postgres-$PG_VERSION.tar.xz | tar -xJC /usr/src/postgres && …
ENV PATH /usr/local/postgres-$PG_MAJOR/bin:$PATH
Similar to having constant variables in a program (as opposed to hard-coding
values), this approach lets you change a single ENV instruction to
auto-magically bump the version of the software in your container.
Each ENV line creates a new intermediate layer, just like RUN commands. This
means that even if you unset the environment variable in a future layer, it
still persists in this layer and its value can still be dumped.
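To keep a variable from persisting in any layer, one option is to set, use, and unset it within a single RUN instruction. The sketch below assumes a hypothetical variable ADMIN_USER and an arbitrary alpine:3.19 base.

```dockerfile
FROM alpine:3.19

# ADMIN_USER exists only inside the shell that runs this instruction.
# It is never declared with ENV, so it is not stored as image metadata.
RUN export ADMIN_USER="mark" \
    && echo "$ADMIN_USER" > ./mark \
    && unset ADMIN_USER
```

A container started from this image has no ADMIN_USER variable in its environment, although the file written during the build still contains the value.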
ADD or COPY
Although ADD and COPY are functionally similar, generally speaking, COPY
is preferred. That’s because it’s more transparent than ADD. COPY only
supports the basic copying of local files into the container, while ADD has
some features (like local-only tar extraction and remote URL support) that are
not immediately obvious. Consequently, the best use for ADD is local tar file
auto-extraction into the image, as in ADD rootfs.tar.xz /.
If you have multiple Dockerfile steps that use different files from your
context, COPY them individually, rather than all at once. This ensures that
each step's build cache is only invalidated (forcing the step to be re-run) if
the specifically required files change.
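A sketch of this pattern for a hypothetical Python application whose dependencies are listed in requirements.txt (the base image tag and the /srv/app path are assumptions):

```dockerfile
FROM ubuntu:22.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 \
        python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /srv/app

# Copy only the dependency manifest first, so the install step below is
# re-run only when requirements.txt itself changes.
COPY requirements.txt .
RUN pip3 install --requirement requirements.txt

# Copy the rest of the sources last; editing them invalidates only the
# cache for this step and any steps after it.
COPY . .
```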
USER
Dockerfile reference for the USER instruction
If a service can run without privileges, use USER to change to a non-root
user.
This is an example of how to create a non-root user in your Dockerfile:
ENV APP_USER user
ENV APP_GROUP user
WORKDIR /srv/app
RUN addgroup --system ${APP_GROUP} --gid 1000 && \
    adduser --system ${APP_USER} \
        --ingroup ${APP_GROUP} \
        --shell /usr/sbin/nologin \
        --uid 1000 \
        --home /srv/app
USER ${APP_USER}
WORKDIR
Dockerfile reference for the WORKDIR instruction
For clarity and reliability, you should always use absolute paths for your
WORKDIR. Also, you should use WORKDIR instead of proliferating instructions
like RUN cd … && do-something, which are hard to read, troubleshoot, and
maintain.
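A short illustration of the difference, using a placeholder command (touch ready) and an arbitrary alpine:3.19 base:

```dockerfile
FROM alpine:3.19

# Instead of: RUN cd /srv/app && touch ready
# WORKDIR creates the directory if it does not exist and applies to all
# following RUN, CMD, ENTRYPOINT, COPY, and ADD instructions.
WORKDIR /srv/app
RUN touch ready
```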
Operation
Dockerfile developers must follow this guideline. Tools that check compliance can be included in the continuous integration workflow:
- Aquasec Scanner images will be included in the CI workflow to check for security issues related to Docker images.
Security Impact
This guide aims to reduce security risks when Docker images are developed.
Performance Impact
This guide aims to improve the performance of Docker images when they are built.
Developer Impact
This guideline has to be followed by all development teams.