GitlabCI: expose Docker socket for faster builds

When building Docker images with GitlabCI, a very common issue is setting up an efficient Docker caching strategy for image builds. That's when hell starts and your CI jobs get longer, more complex, or both.


Introduction

We'll talk a lot about Docker and Gitlab CI

Various kinds of setup exist, mostly based on Docker-in-Docker as a GitlabCI service:

  • Trying to pull the previously built image to use as cache for building the new one (as described in the GitlabCI doc)
    • Not very efficient: it takes time to run DinD and pull the previous image, plus it's difficult to manage cache re-use across Git branches
  • Using Gitlab's internal cache mechanism to keep some data between builds and inject it somehow into the Docker context to copy it into the image (like using a BuildKit mount)
    • Please don't. It will overcomplicate your CI config, and you'll probably lose your build reproducibility.
  • Using the GitlabCI shell executor
    • You lose the advantage of running jobs in Docker and have to ensure all binaries and such are available on the Runner's host. Not practical.
  • Mounting the underlying machine's Docker socket into Gitlab jobs
    • In my experience the best solution for most cases, and you'll avoid some DinD issues

Typical situation

Let's consider this typical CI config using Docker-in-Docker, trying to re-use cache by pulling the branch's existing image before building:

docker-build:
  image: docker:latest
  stage: build
  # Run DinD service
  services:
    - docker:dind
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    # Attempt to pull the image so its layers can be re-used as Docker cache
    - docker pull "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG" || true
    - docker build --cache-from "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG" -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG"
    # And this is a simple example
    # Some builds become over-complex, adding logic to re-use cache from other branches, mix Docker cache with Gitlab cache, and so on
    # (see below for a sketch of how that logic tends to look)
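
To give an idea of where this leads, here is a sketch of how that cache re-use logic tends to grow once you also want to fall back on the default branch image. This is purely illustrative: the $CI_DEFAULT_BRANCH fallback assumes images are also tagged with the default branch name, which the example above does not do.

docker-build:
  image: docker:latest
  stage: build
  services:
    - docker:dind
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  script:
    # Pull the branch image if it exists, otherwise fall back on the default branch image
    - docker pull "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG" || docker pull "$CI_REGISTRY_IMAGE:$CI_DEFAULT_BRANCH" || true
    # Declare every pulled tag as a potential cache source
    - docker build --cache-from "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG" --cache-from "$CI_REGISTRY_IMAGE:$CI_DEFAULT_BRANCH" -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG"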

When running the job, the runner will:

  • Start a DinD service (and wait for it to be available)
  • Attempt to pull any existing image (this may be long for big images)
  • Start the Docker build after losing 1 or 2 minutes on the first two steps - and even more depending on how long runner setup takes (cloning the Git repo, etc.)

This becomes extremely frustrating when you just want to update a small test or fix a typo completely unrelated to the Docker build. Simply re-running the EXACT same build takes a few minutes - ideally, it should take a few seconds.

As a CI best practice, we want our build to be as short as possible. Let's see a way to improve this.

Expose Docker socket in CI container

Configure your Gitlab Runner config.toml as follows:

[[runners]]
  name = "Docker Daemon Runner tagged as 'dockerd'"
  url = "https://gitlab.com/"
  executor = "docker"
  # Runner Authentication token (see https://docs.gitlab.com/ee/api/runners.html) 
  token = "xxx"

  # Directories under which the runner will save build data and cache
  # Mounted in job containers (see below)
  builds_dir = "/builds"
  cache_dir = "/cache"

  [runners.docker]
    # Volumes to mount in containers running CI jobs
    # Same syntax as docker -v flag
    volumes = [
        # Mount the Docker socket so we can use it in our build
        # This will allow our jobs access to Docker Daemon
        "/var/run/docker.sock:/var/run/docker.sock",
        # Specify builds and cache dir both as global config and as runners.docker volumes
        # to ensure builds and cache directories are shared across jobs
        "/cache",
        "/builds"
    ]
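
For reference, a config like the one above can also be generated by registering the runner non-interactively. This is only a sketch, assuming the registration-token flow; the URL and token are placeholders and flags may differ between gitlab-runner versions:

gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --registration-token "xxx" \
  --executor "docker" \
  --docker-image "docker:20.10.7" \
  --tag-list "dockerd" \
  --builds-dir "/builds" \
  --cache-dir "/cache" \
  --docker-volumes "/var/run/docker.sock:/var/run/docker.sock" \
  --docker-volumes "/cache" \
  --docker-volumes "/builds"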

Our runner is tagged dockerd. You'll be able to run Docker builds on CI by specifying this tag with tags:, such as:

build:
  stage: build

  # Make sure to use a tag matching your Docker Daemon runner
  tags: [ dockerd ]

  # Specify an image with docker binaries installed
  # Using the official docker client image is the easiest way
  # But you can also use your own image with docker installed
  image: docker:20.10.7
  before_script:
  - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY

  # Run our Docker build and tag image with current branch name
  # As all jobs are running with the same Docker daemon, cache will be re-used without any further config
  # No need for DinD services or complex logic to retrieve or re-use cache!
  script:
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG"
    # Yup, that's all
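
Since every job tagged dockerd talks to the same Docker daemon, a later job can even use the image that was just built without pulling it. A hypothetical follow-up job could look like this (the test stage and the smoke-test command are just for illustration, assuming a single runner host):

test:
  stage: test
  tags: [ dockerd ]
  image: docker:20.10.7
  script:
    # The image built in the previous job is already in the daemon's local image store
    - docker run --rm "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG" echo "smoke test OK"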

You can see the result on the Gitlab.com gitlabci-dockerd-runner example project:

  • First job took 86s to run as no cache was present
  • Second job took only 15s to run, as Docker cache was fully re-used without any further config
    $ docker build . -t $CI_REGISTRY_IMAGE:latest
    Step 1/3 : FROM alpine
    ---> 6dbb9cc54074
    Step 2/3 : RUN echo "A long step which takes a long time without Docker cache" > /msg && sleep 60
    ---> Using cache
    ---> b2c3e23f698d
    Step 3/3 : RUN echo "Typically when building your app or downloading dependencies..." > /msg2
    ---> Using cache
    ---> 69575e65f0e6
    Successfully built 69575e65f0e6

It's a step towards faster CI builds, but there's more we can do. We'll talk about it in other posts.

Security: limit the scope of this runner!

Be aware that anyone running jobs on this runner will be able to access any container running on the machine, and even the machine itself, for example:

  script:
  # Access another container on machine
  - docker exec another-job-container-with-secrets sh -c 'echo $SECRET_PASSWORD'
  # Access files on the machine itself
  - docker run -v /var/company-secret:/tmp alpine do-nasty-stuff

To limit these risks, you can:

  • Limit the runner to a single project or group, and ensure only required people can run pipelines (see the sketch after this list)
  • Ensure only Docker build jobs are run on this machine (and other jobs with secrets are run on different runners)
  • Avoid using secrets in Docker builds if possible; if you do, use secrets that won't be any more exposed by being used in a build
    • For example, using a read-only NPM token already shared with the whole dev team accessing the runner may be OK, as they already have access to it
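
For the first point, part of the scoping can be done at registration time. A minimal sketch, reusing the registration-token flow shown earlier; who can actually run pipelines is then handled in the project or group settings:

# Lock the runner to the project it is registered on
# and only let it pick up jobs on protected branches and tags
gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --registration-token "xxx" \
  --executor "docker" \
  --tag-list "dockerd" \
  --docker-volumes "/var/run/docker.sock:/var/run/docker.sock" \
  --locked="true" \
  --access-level="ref_protected"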

We'll talk a bit more about Docker security patterns in another post. (Will update with the link)

Full example on Github

Check out Crafteo Devops Examples for the Gitlab runner setup and the Gitlab.com gitlabci-dockerd-runner example project for an example usage of such a runner.
