Multi-Stage Build for CI/CD Pipeline using Dockerfile

Mon, 10/06/2024 | 5 mins

Satish Annavar

Angita Shah

Platform Engineers

Traditional single-stage Dockerfile builds create monolithic images containing the entire application environment: application code, build tools, and runtime dependencies. While simple to understand, this approach produces bulky images that slow down CI/CD pipelines.

Multi-stage builds suit Continuous Integration and Continuous Deployment (CI/CD) pipelines because they make deployment more efficient and more secure. Breaking the build into stages lets each stage perform a specific task, such as building the application, running tests, or creating a production-ready image. Because each stage can be tested and validated before the next one runs, fewer errors and vulnerabilities reach the final image. Multi-stage builds also shrink the final image by leaving out unnecessary files and dependencies, which improves both performance and security.

Core Concepts:

Separation of Concerns: Build vs. Runtime Stages

Multi-stage builds establish a clear distinction between the build environment and the runtime environment:

Build Stage: Focuses on building the application. It typically utilizes a base image containing the necessary build tools (compilers, linkers) and libraries for the specific programming language (e.g., golang:1.19-alpine for Go). The application code is copied into this stage, and build commands are executed to compile, link, and generate application binaries or artifacts. These artifacts are then staged for inclusion in the final image.
Runtime Stage: Focuses on running the application. It utilizes a minimal base image, often a lightweight image like alpine:3.16. The required application artifacts are carefully selected and copied from the build stage into the runtime image. Any additional runtime dependencies (e.g., system libraries) are installed in this stage to ensure the application functions properly within the container. This final image serves as the basis for deploying your application.
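
The split between the two stages can be sketched as follows. This is a minimal illustration rather than a definitive layout: it assumes the build context holds a Go module, and the binary name `app` and the `ca-certificates` package are placeholders chosen for the example.

```dockerfile
# syntax=docker/dockerfile:1

# Build stage: full Go toolchain, used only to produce the binary.
FROM golang:1.19-alpine AS build
WORKDIR /src
# Assumption: the build context contains go.mod and the Go sources.
COPY . .
RUN go build -o /bin/app .

# Runtime stage: minimal base image plus runtime dependencies only.
FROM alpine:3.16
# Illustrative runtime dependency; replace with what the application needs.
RUN apk add --no-cache ca-certificates
# Only the compiled artifact crosses over from the build stage.
COPY --from=build /bin/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

The compiler, source code, and build cache stay behind in the `build` stage and never reach the deployed image.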

Use Multi-Stage Builds

Multi-stage builds involve employing multiple `FROM` statements in a Dockerfile, with each statement initiating a new stage of the build. These stages can utilize different bases, and artifacts can be selectively copied from one stage to another, excluding unwanted elements in the final image.

Here's an example demonstrating the separation of build and runtime environments:

# syntax=docker/dockerfile:1
FROM golang:1.21 AS build
WORKDIR /src
COPY <

In this example, the first stage utilizes Golang to compile the source code, while the second stage employs the `scratch` image, which contains only the compiled binary. This approach ensures a small, secure production image without superfluous build tools.
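
A minimal sketch of such a two-stage Dockerfile is shown below. The one-file program, the binary path `/bin/hello`, and the `CGO_ENABLED=0` setting are assumptions made for illustration.

```dockerfile
# syntax=docker/dockerfile:1

FROM golang:1.21 AS build
WORKDIR /src
# Hypothetical one-file program, written inline with a heredoc.
COPY <<EOF /src/main.go
package main

import "fmt"

func main() {
	fmt.Println("hello, world")
}
EOF
# CGO_ENABLED=0 produces a statically linked binary; scratch has no C library.
RUN CGO_ENABLED=0 go build -o /bin/hello ./main.go

# Final stage: an empty base image that receives only the binary.
FROM scratch
COPY --from=build /bin/hello /bin/hello
CMD ["/bin/hello"]
```

Because `scratch` contains no shell, the final stage uses the exec (JSON array) form of `CMD`.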

Name Your Build Stages

By default, stages are referred to by their integer number, starting with zero for the initial `FROM` instruction. However, stages can be named explicitly using the `AS` keyword, improving readability and resilience against reordering of instructions:


# syntax=docker/dockerfile:1
FROM golang:1.21 AS build
WORKDIR /src
COPY <
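
The difference between the two addressing forms can be sketched like this, assuming the build context holds a Go module; the index-based line is left commented out for comparison.

```dockerfile
# syntax=docker/dockerfile:1

FROM golang:1.21 AS build
WORKDIR /src
# Assumption: the build context contains go.mod and the Go sources.
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/hello .

FROM scratch
# By index: stage 0 is the first FROM. This breaks silently if another
# stage is later inserted above it.
# COPY --from=0 /bin/hello /bin/hello
# By name: keeps working when stages are added or reordered.
COPY --from=build /bin/hello /bin/hello
CMD ["/bin/hello"]
```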

Stop at a Specific Build Stage

When constructing an image, it is possible to stop at a particular stage using the `--target` flag during the build process. This capability is helpful for various scenarios, such as debugging a specific stage or managing distinct development and production environments:

$ docker build --target build -t hello .
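
One way to arrange a Dockerfile for this is sketched below. The stage names `debug` and `production` are hypothetical, and the build context is assumed to hold a Go module; the comments show the command that selects each stage.

```dockerfile
# syntax=docker/dockerfile:1

FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/hello .

# Development image: keeps a shell and basic tools for inspection.
#   docker build --target debug -t hello:debug .
FROM alpine:3.16 AS debug
COPY --from=build /bin/hello /bin/hello
CMD ["/bin/hello"]

# Production image: the last stage, built when no --target is given.
#   docker build -t hello .
FROM scratch AS production
COPY --from=build /bin/hello /bin/hello
CMD ["/bin/hello"]
```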

Use External Image as a Stage

Stages in multi-stage builds are not confined to previous stages within the same Dockerfile; they can also reference external images. This feature allows for greater flexibility and modularity in the build process:

COPY --from=nginx:latest /etc/nginx/nginx.conf /nginx.conf

A stage can also pick up where an earlier stage left off by naming that stage in a `FROM` instruction. In the following example, `build1` and `build2` both start from the shared `builder` stage, and the final stage copies the binary from each:

# syntax=docker/dockerfile:1


FROM alpine:latest AS builder
RUN apk --no-cache add build-base


FROM builder AS build1
COPY <<EOF /source1.cpp
#include <iostream>

int main() {
    std::cout << "Hello from source1!" << std::endl;
    return 0;
}
EOF
RUN g++ -o /binary source1.cpp


FROM builder AS build2
COPY <<EOF /source2.cpp
#include <iostream>

int main() {
    std::cout << "Hello from source2!" << std::endl;
    return 0;
}
EOF
RUN g++ -o /binary source2.cpp


FROM nginx:latest
COPY --from=build1 /binary /usr/share/nginx/html/index.html
COPY --from=build2 /binary /usr/share/nginx/html/login.html

Advanced Multi-Stage Build Techniques

Building Multiple Artifacts

Multi-stage builds can be used to build multiple artifacts in a single Dockerfile. This technique is particularly useful when dealing with microservices or applications composed of multiple components.

Consider the following example, demonstrating a multi-stage build for two Golang applications:

# syntax=docker/dockerfile:1

FROM golang:1.21 AS build1
WORKDIR /src1
COPY <

Here, two separate stages are used to build two distinct Golang applications. The final stage copies both binaries from their respective build stages and executes them together.
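
A hedged sketch of this pattern follows. It assumes each application is its own Go module in the hypothetical directories `app1/` and `app2/` of the build context, and it interprets "executes them together" as running one after the other from a shell.

```dockerfile
# syntax=docker/dockerfile:1

FROM golang:1.21 AS build1
WORKDIR /src1
# Assumption: the first application lives in ./app1 of the build context.
COPY app1/ .
RUN CGO_ENABLED=0 go build -o /bin/app1 .

FROM golang:1.21 AS build2
WORKDIR /src2
# Assumption: the second application lives in ./app2 of the build context.
COPY app2/ .
RUN CGO_ENABLED=0 go build -o /bin/app2 .

# alpine rather than scratch, because the CMD below needs a shell.
FROM alpine:3.16
COPY --from=build1 /bin/app1 /bin/app1
COPY --from=build2 /bin/app2 /bin/app2
# Runs app1, then app2 only if app1 exits successfully.
CMD ["/bin/sh", "-c", "/bin/app1 && /bin/app2"]
```

With BuildKit, `build1` and `build2` do not depend on each other, so they can be built in parallel.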

Building for Multiple Architectures

Multi-stage builds can also be used to build for multiple architectures. This technique is particularly useful when targeting different platforms or environments.

Consider the following example, demonstrating a multi-stage build for two different architectures:

# syntax=docker/dockerfile:1


FROM golang:1.21 AS build-amd64
WORKDIR /src
COPY <

Here, two separate stages build the same application for two different architectures, and the final stage copies both binaries from their respective build stages. Note that a container can natively execute only the binary that matches its host architecture.
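
As a rough sketch of this approach, the stages below rely on Go's built-in cross-compilation. The second stage name `build-arm64`, the `arm64` target, and the output file names are assumptions made for illustration.

```dockerfile
# syntax=docker/dockerfile:1

FROM golang:1.21 AS build-amd64
WORKDIR /src
# Assumption: the build context contains go.mod and the Go sources.
COPY . .
# GOOS and GOARCH select the target platform for the Go compiler.
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o /bin/hello-amd64 .

FROM golang:1.21 AS build-arm64
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -o /bin/hello-arm64 .

# Final stage collects one binary per architecture.
FROM alpine:3.16
COPY --from=build-amd64 /bin/hello-amd64 /bin/hello-amd64
COPY --from=build-arm64 /bin/hello-arm64 /bin/hello-arm64
```

Both build stages run on the build host's own architecture; only the compiler output differs.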

Building image using BuildKit

To build the image using BuildKit, you would run the following command:

DOCKER_BUILDKIT=1 docker build -t hello .

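By setting `DOCKER_BUILDKIT=1`, you instruct Docker to use BuildKit for the build process. (On Docker Engine 23.0 and later, BuildKit is already the default builder, so the variable matters only on older versions.) Without BuildKit, the Docker engine builds every stage leading up to the specified target, even stages the target does not depend on, potentially wasting resources.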

With BuildKit, only the stages that the target stage depends on are processed. In this case, since the `runtime` stage depends on the `build` stage, only those stages will be built. This leads to faster build times and reduced resource usage.

You can specify a target build stage using the `--target` option:

DOCKER_BUILDKIT=1 docker build --target build -t hello .

This will stop at the `build` stage and skip the `runtime` stage, which may be useful for debugging purposes or when creating separate build and runtime images.
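
The dependency-based skipping can be illustrated with a sketch that adds a hypothetical `test` stage. It assumes the build context holds a Go module, and the stage names follow the ones used in the text.

```dockerfile
# syntax=docker/dockerfile:1

FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/hello .

# Nothing in runtime refers to this stage, so BuildKit skips it unless
# it is requested explicitly:
#   docker build --target test .
FROM build AS test
RUN go test ./...

# Depends only on build; with BuildKit, building this stage never runs test.
FROM scratch AS runtime
COPY --from=build /bin/hello /bin/hello
CMD ["/bin/hello"]
```

A pipeline can therefore run `--target test` as one job and build the `runtime` image as another, both from the same Dockerfile.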

Optimizing CI/CD Workflows:

The final image is significantly smaller than one produced by a traditional single-stage build, because unnecessary build tools and libraries are excluded. This translates to faster image transfers between the build server and the container registry during deployment. Caching of build stages, combined with the smaller final image, shortens overall build times within CI/CD pipelines. Smaller images also reduce the attack surface, which can lower the number of security vulnerabilities in the containerized application.
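
Stage caching pays off most when instructions are ordered from least to most frequently changing. The sketch below assumes a Go module with both `go.mod` and `go.sum` present in the build context.

```dockerfile
# syntax=docker/dockerfile:1

FROM golang:1.21 AS build
WORKDIR /src
# Copy only the module files first: this layer and the download below
# stay cached until go.mod or go.sum changes.
COPY go.mod go.sum ./
RUN go mod download
# Source edits invalidate the cache only from this point onward.
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/hello .

FROM scratch
COPY --from=build /bin/hello /bin/hello
CMD ["/bin/hello"]
```

On a typical commit that touches only source files, the dependency download is served from cache and only the last two build-stage steps rerun.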

Considerations:

Multi-stage builds introduce additional complexity compared to single-stage builds. Understanding the separation of concerns and managing dependencies across stages becomes crucial.
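Ensure the runtime base image contains whatever the final stage actually uses. `COPY` and `ENV` work on any base, including `scratch`, but `RUN` instructions and shell-form `CMD` or `ENTRYPOINT` need a shell inside the image.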
Using minimal base images in the runtime stage might require additional security hardening steps to mitigate potential risks.
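
As one example of a hardening step, the sketch below runs the application as an unprivileged user in a `scratch` runtime stage. The numeric ID 65534 (conventionally `nobody`) is an assumption; any non-zero UID and GID would serve the same purpose.

```dockerfile
# syntax=docker/dockerfile:1

FROM golang:1.21 AS build
WORKDIR /src
# Assumption: the build context contains go.mod and the Go sources.
COPY . .
RUN CGO_ENABLED=0 go build -o /bin/hello .

FROM scratch
COPY --from=build /bin/hello /bin/hello
# Numeric UID:GID works without an /etc/passwd file in the image.
USER 65534:65534
# Exec form is required here: scratch has no shell to interpret shell form.
ENTRYPOINT ["/bin/hello"]
```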