Managing Upgrades
To better manage our technical debt and mitigate security concerns, we need to consistently use currently maintained versions of the software dependencies we rely on to build our solutions.
Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
- "GuestOS" refers to the operating system within a Docker image e.g., Ubuntu.
- "GuestOS Version" refers to the version of GuestOS e.g., 20.04.2 LTS.
- "Runtime" refers to the primary language runtime used within the container e.g., Python.
- "Runtime Version" refers to the version of the Runtime e.g., 3.8.
- "Framework" refers to the primary framework used by the Runtime e.g., Django, Flask, etc.
- "Framework Version" refers to the version of Framework e.g., 3.2 LTS.
Problem Description
Although we frequently choose the most current version of the available software dependencies - GuestOS, Runtime, and Framework - when creating a new service, those choices are inconsistent and we rarely revisit those choices to upgrade or replace them.
This creates three problems for our future selves:
- We just don't know all the different software and versions in use in our estate.
- These versions are not being kept up-to-date, exposing us to potential bugs and security vulnerabilities.
- Deferring these upgrades creates a combined technical debt that can be greater than the sum of its parts, which cannot easily - if at all - be addressed.
Background
A manual survey of services under Code Ownership showed a wide range of dependencies and versions in use with a number of them being out of support.
In addition there has been little guidance or constraint on the dependency choices, resulting in an eclectic mix.
Solution
The result we want to achieve is that only approved versions are deployed into any given environment. This will be performed through:
- Creating "blessed" Docker images consisting of approved GuestOS and Runtime [1]
- Implementing automated checks on shipping projects
Assumptions
In order to implement the checks described in this document effectively, they will need to be integrated into our CI/CD pipelines. There is an assumption that these checks can be created using AquaSec, but the integration of that tool into our builds will be covered in a separate RFC.
This combination of approaches will help Ebury to resolve this situation and avoid it in the future.
Constraints
We need to define - and enforce - specific constraints regarding dependency choice. Anything that deviates from these constraints should be considered technical debt and addressed as such.
Versions
In all cases, versions should be pinned to the loosest acceptable version. For
example pinning Ubuntu to == 20.04.2 is too strict, but >= 20.04.1 is
acceptable.
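To illustrate the difference between a strict and a loose pin, here is a minimal sketch of a version check. The `parse_version` and `satisfies` helpers and the pin syntax are assumptions made for this example; they are not part of AquaSec or any other tool named in this document.

```python
import operator

# Comparison operators supported by this sketch (hypothetical pin syntax).
_OPS = {"==": operator.eq, ">=": operator.ge, "<": operator.lt}


def parse_version(text):
    """Turn '20.04.2' into (20, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, pin):
    """Return True if `version` matches a pin such as '>= 20.04.1'."""
    op_text, _, wanted = pin.strip().partition(" ")
    return _OPS[op_text](parse_version(version), parse_version(wanted.strip()))


# A strict pin rejects the next point release; a loose pin accepts it.
print(satisfies("20.04.3", "== 20.04.2"))  # False
print(satisfies("20.04.3", ">= 20.04.1"))  # True
```

With the strict pin, every upstream point release would require a change to the pin before it could be adopted; the loose pin lets point releases flow through while still setting a floor.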
GuestOS
The standard GuestOS shall be Ubuntu Core which is readily available through the official Docker images at: https://hub.docker.com/_/ubuntu
Acceptable GuestOS versions shall be the current and current-1 Long Term Support (LTS) releases, and must be upgraded to the latest point release. At the time of writing, this includes the following approved releases:
- Ubuntu 20.04.2 (Focal Fossa)
- Ubuntu 18.04.5 (Bionic Beaver)
New LTS releases can be adopted after they reach their first point release, at which point the oldest LTS is no longer approved.
New LTS point releases shall supersede the previous approved version.
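The rolling rule above can be sketched as a small function. This is only an illustration: the `approved_guest_os` name, the shape of the input data, and the 22.04 entry are hypothetical and exist solely to show a series that has not yet reached its first point release.

```python
def approved_guest_os(releases):
    """Return the approved version for each approved LTS series.

    `releases` maps an LTS series (e.g. "20.04") to the latest point
    release published for it (0 means no point release yet).
    """
    # An LTS series only becomes eligible once it has a first point release.
    eligible = [series for series, point in releases.items() if point >= 1]
    # Keep the current and current-1 series, newest first.
    newest_two = sorted(
        eligible,
        key=lambda s: tuple(int(p) for p in s.split(".")),
        reverse=True,
    )[:2]
    # Each approved series is pinned to its latest point release.
    return [f"{series}.{releases[series]}" for series in newest_two]


# Hypothetical data: 22.04 has no point release yet, so it is not adopted.
lts = {"16.04": 7, "18.04": 5, "20.04": 2, "22.04": 0}
print(approved_guest_os(lts))  # ['20.04.2', '18.04.5']
```

Once the hypothetical 22.04 series reaches a first point release, the same function would return 22.04 and 20.04, and 18.04 would drop off the approved list.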
Exceptions to Standard GuestOS
In certain cases, exceptions to the standard GuestOS apply, and the following images are permitted as "FROM" in the Dockerfile:
- ubi8/ubi-minimal: exception applies in the case of the Ebury Keycloak image
- node:${NODE_LTS_VERSION}-${DEBIAN_LTS_VERSION}: exception applies in the case of Ebury Node.js images
Rationale
Ubuntu is based upon Debian and the available packages are broadly equivalent in terms of availability and versions, so compatibility shouldn’t be a factor for services that need to migrate. More importantly it provides both Long Term Support releases and commercial support if we require it. In addition, the use of Ubuntu Core removes most concerns re: image size.
Runtime & Framework
Where the desired Runtime and/or Framework has an official LTS, the current and current-1 releases shall be approved on a rolling basis, as with GuestOS.
Where there are no LTS releases, the most recent version with greater than twelve months of support left shall be approved. If there are no appropriate releases, then an alternative must be sought.
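As a rough sketch of the twelve-month rule for runtimes without LTS releases, the function below filters versions by their remaining support window. The function names are hypothetical, and the end-of-support dates are illustrative placeholders rather than authoritative values; a real check would need to source these dates from the upstream project.

```python
from datetime import date


def add_twelve_months(day):
    """Return the same calendar day one year later (Feb 29 becomes Feb 28)."""
    try:
        return day.replace(year=day.year + 1)
    except ValueError:
        return day.replace(year=day.year + 1, day=28)


def approved_versions(end_of_support, today):
    """Return versions whose support ends more than twelve months after `today`."""
    cutoff = add_twelve_months(today)
    approved = [v for v, eol in end_of_support.items() if eol > cutoff]
    # Sort numerically so that, for example, 3.10 would follow 3.9.
    return sorted(approved, key=lambda v: tuple(int(p) for p in v.split(".")))


# Illustrative end-of-support dates, not authoritative values.
runtime_eol = {
    "3.6": date(2021, 12, 31),
    "3.7": date(2023, 6, 30),
    "3.8": date(2024, 10, 31),
    "3.9": date(2025, 10, 31),
}
print(approved_versions(runtime_eol, today=date(2021, 7, 1)))
# ['3.7', '3.8', '3.9']
```

Passing `today` explicitly keeps the check deterministic, which makes it easier to test and to re-run for a planned future date when scheduling migrations.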
Example - Python
At the time of writing, Python 3.7.11, 3.8.11, and 3.9.6 are approved,
although only 3.9 is encouraged for new projects. Python 3.6 has less
than twelve months of support left, so cannot be used for new projects and
existing projects must plan to migrate.
Example - Django
At the time of writing the current LTS version is 3.2.4, and current-1 is
2.2.24. Both are approved, but 3.2 is encouraged for new projects.
Example - Node
At the time of writing, Node v14.17.1 and v12.21.1 are the current and
current-1 LTS releases respectively.
Implementation
The end goal is that all existing and new projects will comply with the above constraints. In order to assist our engineers with this effort, we will need to work through the following phases:
- Adoption
- Informing/Warning
- Enforcement
- Reconciliation
Adoption
In order for AquaSec to make decisions on whether approved versions are being
used we need to ensure "blessed" base images are being used by our projects.
This will require the production of these images for a number of combinations of
the GuestOS and Runtime e.g., Ubuntu 20.04.2 and Python 3.7.11. These
images will need to be published to an accessible repository for use by both
developers and our CI/CD pipelines.
At this point developers will be encouraged to adopt the appropriate image for their projects.
Informing
Once image(s) are available, a policy will be added to AquaSec to warn if the image being scanned is not using a "blessed" base image.
At this point developers will be encouraged to request additional images if the existing images do not meet their requirements. However, in the interests of efficiency, not all requests will be granted, so developers may need to spend some effort migrating their projects to an existing image.
Enforcement
Following an agreed window for the Informing phase, the policy will be updated to fail/block CI/CD pipelines if they are not using a "blessed" base image.
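The Informing and Enforcement phases differ only in how a violation is handled. The sketch below illustrates that idea with a plain Dockerfile check; it is not an AquaSec policy and does not use AquaSec's API. The function names, the registry name, and the image tags are all hypothetical.

```python
def base_images(dockerfile_text):
    """Return the external image named by each FROM instruction."""
    images, stages = [], set()
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Skip options such as --platform=... that precede the image name.
        names = [p for p in parts[1:] if not p.startswith("--")]
        if not names:
            continue
        # A FROM that refers to an earlier build stage is not a base image.
        if names[0] not in stages:
            images.append(names[0])
        if len(names) >= 3 and names[1].upper() == "AS":
            stages.add(names[2])
    return images


def check(dockerfile_text, blessed, enforce):
    """Return (passed, messages); violations only fail when enforcing."""
    bad = [img for img in base_images(dockerfile_text) if img not in blessed]
    level = "ERROR" if enforce else "WARNING"
    messages = [f"{level}: base image {img} is not blessed" for img in bad]
    return (not bad or not enforce), messages


# Hypothetical blessed image and a Dockerfile that does not use it.
blessed = {"registry.example.com/blessed/ubuntu-python:20.04-3.9"}
dockerfile = "FROM python:3.9-slim\nRUN pip install flask\n"

passed, messages = check(dockerfile, blessed, enforce=False)
print(passed, messages)  # True ['WARNING: base image python:3.9-slim is not blessed']
passed, messages = check(dockerfile, blessed, enforce=True)
print(passed)  # False
```

Keeping a single check with an `enforce` switch means the move from Informing to Enforcement is a configuration change rather than a new implementation.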
Reconciliation
Integrating these checks into the build pipeline will provide a good amount of coverage, but to improve that coverage we need to add three additional processes:
- Compare running containers to their latest images to ensure the correct image is in use.
- Configure AquaSec to scan running containers with the same policies.
- Periodic rebuilds and deploys of base images and all services. This will ensure that any security releases from upstream providers are deployed in a timely fashion for stable/mature projects.
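The first of these processes, comparing running containers to their latest images, could be sketched as a digest comparison. This is a minimal illustration that assumes the two mappings have already been collected from the container platform and the image registry; the function name, service names, and digests are hypothetical.

```python
def stale_containers(running, latest):
    """Return services whose running image differs from the latest build.

    `running` maps service name -> digest of the image the container runs.
    `latest` maps service name -> digest of the most recently built image.
    A service with no known latest image is also reported.
    """
    return sorted(
        service
        for service, digest in running.items()
        if latest.get(service) != digest
    )


# Hypothetical digests, shortened for readability.
running = {"payments": "sha256:aaa", "quotes": "sha256:bbb", "legacy": "sha256:ccc"}
latest = {"payments": "sha256:aaa", "quotes": "sha256:ddd"}
print(stale_containers(running, latest))  # ['legacy', 'quotes']
```

Comparing digests rather than tags avoids false negatives when a tag has been rebuilt and now points to a newer image than the one a container was started from.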
Alternatives
The primary alternative to this approach is to require that all Runtimes and Frameworks are installed through the GuestOS's package management, making it easy to inspect the versions in use.
Caveats
While this document presents an ideal situation that we should be able to achieve for the majority of our services, we also need to accept that for various reasons we will not be able to apply these changes to all systems in a timely manner if at all. However these services need to be the exception and not the rule.
To that end, this document should be considered exception based; the code owners for a given service must request and receive an exception before that service may deviate from this document. Exceptions shall be granted by the Security Working Group, and requests shall be submitted directly to that group.
In addition, this should only be considered the first phase. Follow-up improvements include expanding these checks to modules and libraries used by the Runtime/Framework.
Finally this approach works for the majority of our projects that use a single Runtime/Framework. Any projects that don't meet this (e.g., EBO) will need to determine and identify their primary Runtime and Framework.
Operation
SVP Engineering will sign off the lists of approved versions, with input from Engineering Leadership and Security.
Platform will provide the ability to create/maintain/monitor the required policies, and create/maintain the "blessed" base images.
Security Exceptions Board will review and approve all exception requests.
Security Impact
This will improve our overall security without relying on individual teams and engineers to be aware of required updates or monitor security announcements.
Performance Impact
There may be a slight increase in build times for the additional checks.
The largest impact to performance will be through potential blocking of builds/deployments, but the phased implementation described should mitigate this.
Developer Impact
Developers will need to adopt the "blessed" base images into their projects, and plan to action required engineering work to do so.
Data Consumer Impact
None.
Deployment
This is covered in the Implementation section, but a more concrete timeline will need to be determined.
Dependencies
TBC (AquaSec RFC)
References
[1] Frameworks may be added at a later date. For now they should be managed through either the GuestOS or Runtime package management tools.