Containerize This! How to build Golang Dockerfiles

Mike Zazon 2nd May 2018

Welcome to Containerize This! – a series of blog posts which will showcase building and running example applications written in a specific programming language or framework inside of a Docker container.

We’ll go over some best practices in authoring a Dockerfile, special considerations you need to make for that language/framework, how to manage that container in a CI/CD pipeline, how to configure your host machine or orchestrator to run that container, monitor app health in production, and other topics as they relate!

In this post, we’re going to put a simple Golang application into a Docker container, while looking at some Dockerfile best practices along the way. Docker provides some great build time features & base images that we can use to achieve lightweight, secure and efficient application builds. We’ll see why Golang is a great language to demonstrate these features because of the way it can compile to a single binary (or set of binaries). The key theme we’re going to focus on with this example is minimalism! While the examples are very basic, they are very important, and you’ll be able to build on these concepts while adding additional best practices to your larger Golang projects for security and efficiency.

We’ll use this simple main.go to demonstrate these concepts:

package main

import "fmt"

func main() {
	fmt.Println("Hello Cloudreach!")
}

 

Minimize layers to maximize efficiency: Best practices

Minimizing layers is a best practice that Docker calls out early in its Dockerfile documentation, and it’s an important concept to get right from the start.

You could easily go down the path of writing a Dockerfile with many layers (the syntax lends itself to that) and build in a lot of inefficiencies without realizing it. It’s a best practice to group related build steps, such as downloading dependencies, integrating a vendor folder, or setting up a build environment, and chain them together with && inside a single RUN instruction.

You’ll also have to consider which of these groupings change often. Place the frequently changing ones as low in the Dockerfile as possible, and place the more static build dependencies, build environment configuration, and application assets as far up the Dockerfile as possible.

Each layer (more specifically, each line in a Dockerfile that starts with an instruction) is hashed and built on top of the one before it, and the final image is constructed from the “stacked” layers. Since each layer builds on the previous one, the build cache provides a wonderful mechanism to skip over what is already built or static and move on to the parts that actually need to be rebuilt and re-hashed! The flip side is that once one layer changes, every layer after it has to be rebuilt too.
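To make the stacking idea concrete, here is a toy model in Go. It is only a sketch: the function names chainKeys and firstMiss are made up for this post, and Docker's real cache logic is more involved (for example, it also looks at the contents of copied files). The point it illustrates is that each cache key depends on the key before it, so a change invalidates everything below it in the Dockerfile.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// chainKeys returns one cache key per instruction. Each key hashes the
// previous key together with the instruction text, so changing any
// instruction changes its own key and every key after it.
// NOTE: simplified illustration, not Docker's actual algorithm.
func chainKeys(instructions []string) []string {
	keys := make([]string, 0, len(instructions))
	parent := ""
	for _, ins := range instructions {
		sum := sha256.Sum256([]byte(parent + "\n" + ins))
		parent = hex.EncodeToString(sum[:])
		keys = append(keys, parent)
	}
	return keys
}

// firstMiss returns the index of the first key that differs between two
// builds, i.e. where a rebuild would have to start. It returns -1 when
// every compared key matches.
func firstMiss(oldKeys, newKeys []string) int {
	n := len(oldKeys)
	if len(newKeys) < n {
		n = len(newKeys)
	}
	for i := 0; i < n; i++ {
		if oldKeys[i] != newKeys[i] {
			return i
		}
	}
	return -1
}

func main() {
	before := []string{"FROM golang:alpine", "WORKDIR /app", "COPY . /app/", "RUN go build -o main ."}
	after := []string{"FROM golang:alpine", "WORKDIR /app", "COPY . /app/", "RUN go build -o hello ."}
	fmt.Println("rebuild starts at instruction index:", firstMiss(chainKeys(before), chainKeys(after)))
}
```

In this model only the last instruction changed, so the first three keys still match and would be served from cache. Had the change been in the second instruction, everything from there down would miss, which is why frequently changing steps belong near the bottom.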

Minimizing build times is crucial, since a well-oiled CI/CD system will be running these builds many times per day. When this scales to large teams, it can mean a lot of building, perhaps a lot of Jenkins workers, and sometimes a long wait for a developer’s code to be integrated! Docker’s build caching features can save a lot of build time here. Less time waiting for builds means faster integration, faster automated testing, and more velocity through the CI/CD pipeline, so your process can support your team’s scale.

It’s also important to leverage a separate non-root user for the application whenever possible. This can be as simple as using a RUN instruction for the Linux adduser command, and then a USER instruction in the Dockerfile to run the binary as that specific user.
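The application can also check this for itself. The following is a small sketch, not part of the original example: the helper name isRoot is hypothetical, and it assumes we only want a warning rather than a hard failure when the process runs as the superuser.

```go
package main

import (
	"fmt"
	"os"
)

// isRoot reports whether the given effective user ID is the superuser.
func isRoot(euid int) bool {
	return euid == 0
}

func main() {
	// os.Geteuid returns -1 on Windows, which is treated as non-root here.
	if isRoot(os.Geteuid()) {
		fmt.Fprintln(os.Stderr, "warning: running as root; consider a USER instruction in the Dockerfile")
	}
	fmt.Println("Hello Cloudreach!")
}
```

An image without a USER instruction runs its process as root by default, so a check like this is a cheap way to notice when the non-root user was forgotten.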

A minimal Dockerfile for our example main.go (a basic main function with no external dependencies) might look like this when constructed using these best practices:

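FROM golang:alpine
# Copy the source into the image and build from there
RUN mkdir /app
COPY . /app/
WORKDIR /app
# Go 1.16+ builds in module mode by default, so this expects a go.mod next to main.go
RUN go build -o main .
# Create an unprivileged system user and run the binary as that user
RUN adduser -S -D -H -h /app appuser
USER appuser
CMD ["./main"]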

When built, this produces an image size of 378MB:

$  docker build -t hellocloudreachmain:1.0 . -f Dockerfile.single
... (build output omitted)
$  docker images | grep hellocloudreachmain
hellocloudreachmain                1.0 d1c5090585bc  Less than a second ago 378MB

Alpine-based official images are great to use whenever possible! Alpine is one of the lightest-weight Linux distributions, built on busybox and musl. Container image size is minimal compared to a heavier distribution, and it’s a good way to get efficiency right off the bat. But we can do more to minimize! Official images like golang:alpine contain a lot of layers full of components that are great for building application assets, but if we don’t need them to run the application, we shouldn’t put them there! We’ll need to leverage some additional Docker build features to take a step further into minimalism.

 

Minimize Footprint: Multistage

When we talk about running any application in a production environment, we need to design containers to be performant and secure. We’ll also want as much portability as possible so we can move these around easily and schedule them en masse using orchestrators like Docker Swarm and Kubernetes. Pushing and pulling images to and from registries should take the least amount of time possible. One of the most important things to keep in mind when authoring a Dockerfile for production is minimalism in the final runtime image.

If it doesn’t need to be there to run it, don’t put it there!

In developer environments, a “heavier” container image is sometimes needed, perhaps built from a developer-specific Dockerfile, since it’s common to throw in tools that one might need alongside the application for debugging and other development activities, and likely to attach or persist a volume or two for interacting with the live container. This is common, of course! As the container image moves up the pipeline, however, it’s important to remove these extras right out of the gate. A proper secure software supply chain would enforce building the final image as early as possible in the pipeline, signing it, and promoting the officially signed image through each stage into production. For this reason, you want to have this minimal image built, validated, integrated, and signed as early as possible in your supply chain; this is the Dockerfile that developers, QA teams, and security engineers must be familiar with! In other words, build once, and let your process take that built image up to production.

Multi-stage builds are a great way to achieve this! The basic principle is to invoke a temporary container that performs the application build, then copy the built assets out of that space into a container image that has the fewest components required to run the app. A Dockerfile extending the previous example might look like:

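# Stage 1: build the binary with the full Go toolchain
FROM golang:alpine AS builder
RUN mkdir /build
COPY . /build/
WORKDIR /build
RUN go build -o main .

# Stage 2: copy only the binary into a minimal runtime image
FROM alpine
RUN adduser -S -D -H -h /app appuser
USER appuser
COPY --from=builder /build/main /app/
WORKDIR /app
CMD ["./main"]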

Notice the two FROM instructions in this Dockerfile. The first one we label as “builder”, using it to build the application. We then use a second FROM, this time pulling from the base image “alpine” (very lightweight!), and copy our built executable from the builder environment into this new one. This results in an image size MUCH smaller than the previous one! In addition, the layers of the “builder” stage are kept in the Docker build cache, so the caching speedups can be leveraged just as in the previous example!

$  docker build -t hellocloudreachmain:1.1 . -f Dockerfile.multi
... (build output omitted)
$  docker images | grep hellocloudreachmain
hellocloudreachmain                1.0 d1c5090585bc  8 minutes ago 378MB
hellocloudreachmain                1.1 ea737df5cc64 Less than a second ago    6.16MB

The size of this image is 6.16 MB. Not a bad reduction from 378MB!

 

Minimize the entire runtime environment… build FROM scratch!

There’s actually more we can do to minimize further. One of the many interesting things about Golang is that you can compile to a single binary, and in most cases include all of the relevant libraries statically compiled right into that binary using some special build-time arguments. This enables us to construct a very minimal Docker container with less excess runtime overhead for the great performance, portability, and security that we are looking for!

If we can compile a Golang app to a single binary that is statically linked to its dependencies, we can leverage a 0KB base image to run this application. This is a special base image provided by Docker called “scratch”: FROM scratch

A question always comes up during our Docker training class sessions: Do all containers have an operating system inside? The quick answer is no, because of this special base image! It does not have an operating system environment inside. There are special requirements, most importantly that the architecture of your host must support the architecture of the compiled binary (x86, x64, etc.). When those are met, you can actually use a container which provides no features or support to the application other than the isolation and other great features that Docker containers provide! Where possible, using scratch as your base image will provide extreme levels of minimalism and security for your application containers.
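Since the binary is all there is in a scratch image, it helps to know exactly which platform it was compiled for. Here is a small sketch using Go's standard runtime package; the helper name platform is just for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// platform returns the operating system and CPU architecture this binary
// was compiled for, in the "os/arch" form that Docker uses for platforms.
func platform() string {
	return runtime.GOOS + "/" + runtime.GOARCH
}

func main() {
	fmt.Println("built for:", platform())
}
```

The GOOS and GOARCH environment variables control these values at build time, so a binary built with GOOS=linux for the host's architecture is what a scratch-based container on that host needs.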

FROM golang:alpine AS builder
RUN mkdir /build
COPY . /build/
WORKDIR /build
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags '-extldflags "-static"' -o main .
FROM scratch
COPY --from=builder /build/main /app/
WORKDIR /app
CMD ["./main"]

On line 6, FROM scratch tells Docker to start over like we saw in the previous multi-stage example, but this time using the 0KB scratch image. The first stage looks similar to the previous one, but this time we use some compile-time parameters in the build stage to instruct the Go compiler to statically link the runtime libraries into the binary itself. Setting CGO_ENABLED=0 disables cgo, so the binary does not depend on a C library, which a scratch image does not provide:

RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags '-extldflags "-static"' -o main .

The final Docker image will only contain this one single executable in this example, without the baggage of a container operating system.
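If you want to double-check that a binary really is statically linked before dropping it into scratch, one option is to inspect it with Go's standard debug/elf package. This is a rough sketch rather than a complete tool: the helper name isStatic is hypothetical, and it assumes a Linux ELF executable, treating the absence of a PT_INTERP program header (the request for a dynamic loader) as "static".

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// isStatic reports whether the ELF binary at path has no PT_INTERP program
// header, i.e. it does not ask for a dynamic loader at startup.
func isStatic(path string) (bool, error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()
	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	// Inspect the path given on the command line, or this program itself.
	path := os.Args[0]
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	static, err := isStatic(path)
	if err != nil {
		fmt.Println("could not inspect", path+":", err)
		return
	}
	fmt.Println(path, "statically linked:", static)
}
```

Pointing this at the main binary produced by the builder stage gives a quick sanity check; a dynamically linked binary would fail to start in a scratch image because the loader it asks for is not there.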

$ docker build -t hellocloudreachmain:1.2 . -f Dockerfile.scratch
... (build output omitted)
$ docker images | grep hellocloudreachmain
hellocloudreachmain 1.0 d1c5090585bc 8 minutes ago 378MB
hellocloudreachmain 1.1 ea737df5cc64 4 minutes ago 6.16MB
hellocloudreachmain 1.2 bda5c99404ae 33 seconds ago 2.01MB

Wow! A 2.01 MB resulting container size! This is a huge reduction from the initial 378MB image!
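To put numbers on that, here is a short sketch of the arithmetic, using the sizes reported by docker images above. The function name reduction is made up for this example.

```go
package main

import "fmt"

// reduction returns how many times smaller afterMB is than beforeMB, and
// the percentage of the original size that was removed.
func reduction(beforeMB, afterMB float64) (factor, percent float64) {
	factor = beforeMB / afterMB
	percent = (1 - afterMB/beforeMB) * 100
	return factor, percent
}

func main() {
	const single = 378.0 // size of the single-stage golang:alpine image in MB

	images := []struct {
		tag    string
		sizeMB float64
	}{
		{"1.1 (alpine)", 6.16},
		{"1.2 (scratch)", 2.01},
	}
	for _, img := range images {
		f, p := reduction(single, img.sizeMB)
		fmt.Printf("%s: %.0fx smaller, %.1f%% reduction\n", img.tag, f, p)
	}
}
```

That works out to roughly 61 times smaller for the alpine-based image and roughly 188 times smaller for the scratch-based one.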

So do these actually work?

$  docker run -it hellocloudreachmain:1.0
Hello Cloudreach!
$  docker run -it hellocloudreachmain:1.1
Hello Cloudreach!
$  docker run -it hellocloudreachmain:1.2
Hello Cloudreach!

Yes!

 

Conclusion

We took a basic Dockerfile and incrementally improved it a couple of times, minimizing the final image size each time. From this simple exercise, it’s easy to see there are many options to achieve minimalism with your Docker builds when building Golang applications.

There are ways to get really minimal containers using the features of the language and what it provides to a developer at compile time. Since Golang can compile down to a statically linked executable, we can leverage Docker features to strip out all components that are unnecessary at runtime.

More complex applications & builds may not be able to follow the exact same design pattern, but these principles can be applied to most Golang Dockerfiles! Taking some time to be sure that Dockerfiles are constructed properly using industry best-practices will set you up for success in your journey to building fast, secure, and scalable applications in containers!