Placing ARGs in Dockerfiles

In general, Docker’s build caching significantly speeds up image builds, especially for repeated builds of the same image with small changes. It is pretty well known that each instruction in a Dockerfile is a build step whose result can be cached and reused in a following build, as long as that instruction and everything before it did not change.

I was recently building an image where every build took forever, even though I only added new layers at the end of the Dockerfile. The Dockerfile looked something like this:

FROM debian:bookworm-slim
ARG SOME_INTERNAL_ID

RUN apt update -y && apt install -y ...

...
<Here I added application code and was trying different things>

The weird thing was that the apt install command was executed every time and I couldn’t explain why. While trying to solve a completely unrelated issue, I read the documentation for ARG and found a section called "Impact on build caching". As it turns out, every RUN instruction implicitly depends on all preceding ARGs, because they get passed to it as environment variables. This causes a cache miss on every RUN instruction that follows an ARG whenever the value passed to the build for that argument changes.

In hindsight this behavior makes perfect sense, as any script used in a RUN instruction could depend on the argument being available. Still, I found it worth sharing: to get the most out of build caching, define each ARG as close as possible to the point where it’s first used. I’m kinda torn about this. Waiting forever for each build to finish isn’t my favorite thing, but I also dig having all the arguments in one place to keep my Dockerfiles maintainable. So there actually isn’t any clear advice, just the hint to think about this when writing Dockerfiles.
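For illustration, here is a minimal sketch of what moving the ARG closer to its first use could look like. The package name (ca-certificates) and the /build-info.txt file are placeholders I made up for the example, not part of my actual Dockerfile:

```dockerfile
FROM debian:bookworm-slim

# No ARG has been declared yet, so this step does not depend on any
# build argument and can be served from the cache.
# ca-certificates is only a placeholder package for this sketch.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Declared right before its first use: only the RUN instructions from
# here on should be affected when the value of SOME_INTERNAL_ID changes.
ARG SOME_INTERNAL_ID

# Hypothetical use of the argument, just to have a RUN that depends on it.
RUN echo "internal id: ${SOME_INTERNAL_ID}" > /build-info.txt
```

Building this twice with different values, for example `docker build --build-arg SOME_INTERNAL_ID=1 .` followed by `docker build --build-arg SOME_INTERNAL_ID=2 .`, should reuse the cached apt step and only rerun the last RUN instruction. The price is exactly the trade-off mentioned above: the ARG now sits in the middle of the file instead of at the top.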

Published by lukaspanni on 12. April 2024
