Channel: Andrew Lock | .NET Escapades

Combining multiple docker images into a multi-arch image


In this post I show how you can create multi-arch docker images by combining separate x64 and arm64 images into a single docker image. I described an easy way to create multi-arch images in a previous post. This post takes a slightly different approach, in that it supports the single-arch images being built on different machines, and optionally from completely different dockerfiles.

What is a multi-arch docker image?

Docker has the concept of multi-architecture images, which means that a single Docker image tag can support multiple architectures. Typically different OS/processor architectures require different Docker images. With multi-arch images you specify a single image, and Docker will pull the appropriate architecture for your processor and platform.

For example, if you specify the .NET SDK docker image in your Dockerfile:

FROM mcr.microsoft.com/dotnet/sdk:5.0
#...

This docker image is a multi-arch image, so it actually contains a list of different docker images, one for each supported processor architecture. When you pull the image, docker pulls the appropriate x64, arm64, or arm32 image, based on your host architecture.

With the rise of arm64 machines, the need for creating multi-arch images is only increasing. Luckily docker includes an easy way to build multi-arch images.

Creating multi-arch docker images the easy way

In a previous post, I described the "easy" way to build multi-arch images using docker's buildx command. This requires a small amount of setup, but once it's working, compiling a dockerfile for multiple platforms is trivial by using the --platform option.

For example, by running the following command, I was able to compile my dockerfile for 7 different platforms and push the resulting multi-arch image to docker hub:

docker buildx build \
  -t andrewlock/wait-for-dependencies:latest \
  --push \
  --platform linux/amd64,linux/arm64,linux/ppc64le,linux/s390x,linux/386,linux/arm/v7,linux/arm/v6 \
  .

As you can see, the published image has support for all those different architectures:

Multi-arch support for a docker image on Docker Hub

This works really well, but it isn't a good fit for every scenario.

Sometimes the easy way isn't so easy

When you use the --platform option with buildx, docker uses QEMU to emulate the architectures that don't match your host architecture. So if I'm on an x64 machine, and I'm trying to build for arm64 using --platform linux/arm64, then docker uses QEMU's software emulation to run any necessary binaries during the image build. This is frankly kind of magical, and is why it's so easy to build for all the other architectures.

Unfortunately, emulation is inherently much slower than running on the host architecture. If you're just copying files around or doing relatively light tasks, then that's probably not a big problem. But if you're trying to do something compute-heavy like compilation, then you may find that the emulation is just too slow. If that's the case, you'll likely want to build only for the host architecture on each machine, and use a different machine (with the appropriate architecture) for each target platform.
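If you script native-only builds, each build machine needs to know which --platform value matches its own hardware. The following is a minimal sketch of that mapping, assuming a Linux build host; the function name docker_platform_for is made up for illustration, and only a handful of common uname -m values are covered.

```shell
#!/bin/sh
# Map a machine name (as printed by `uname -m`) to a Docker platform string.
# Only a few common values are handled; extend the list as needed.
docker_platform_for() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    armv7l)        echo "linux/arm/v7" ;;
    armv6l)        echo "linux/arm/v6" ;;
    *) echo "unsupported machine: $1" >&2; return 1 ;;
  esac
}

# Platform of the current host, usable as: --platform "$HOST_PLATFORM"
HOST_PLATFORM=$(docker_platform_for "$(uname -m)") || HOST_PLATFORM=""
echo "host platform: ${HOST_PLATFORM:-unknown}"
```

Passing the result to --platform keeps each machine building only the image it can build without emulation.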

Another scenario where the simple buildx approach may not work is if you want to use fundamentally different dockerfiles for certain architectures. If you can't use a single dockerfile for your build, then you won't be able to use the multi-arch buildx approach described in the previous section.

If the difference between your dockerfiles is small, you may be able to combine them into a single dockerfile by adding ARG TARGETPLATFORM to your file, relying on buildx to fill this variable, and adding checks in your dockerfile, as described in this post.
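As a rough illustration of what such a check could look like, the per-platform logic can live in a small shell helper that a RUN step calls. This sketch assumes the dockerfile declares ARG TARGETPLATFORM before that step, and that buildx fills it with values such as linux/amd64; the function name and the suffixes it returns are hypothetical.

```shell
#!/bin/sh
# Map TARGETPLATFORM (filled in by buildx when the dockerfile declares
# `ARG TARGETPLATFORM`) to a short architecture suffix.
# The function name and the suffixes are illustrative assumptions.
arch_suffix_for() {
  case "$1" in
    linux/amd64) echo "x64" ;;
    linux/arm64) echo "arm64" ;;
    *) echo "unhandled TARGETPLATFORM: '$1'" >&2; return 1 ;;
  esac
}

# In a RUN step you might call: arch_suffix_for "$TARGETPLATFORM"
arch_suffix_for "linux/arm64"   # prints: arm64
```

Failing loudly on an unhandled platform is deliberate here: a silent default would produce an image with the wrong binaries for that architecture.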

If you find yourself in either of these situations then you'll probably need to build each platform image separately and then combine them into a single multi-arch image. In the next section I'll show how to do that.

Creating multi-arch images using docker manifest

In this section I'll show how we can create a multi-arch image from two independently built images. We're going to take the following steps:

  1. Build and push the single-architecture images using docker buildx
  2. Confirm the images were pushed successfully
  3. Use docker manifest to create and push a multi-arch image
  4. Confirm the multi-arch image was pushed successfully

Finally, we'll discuss some of the limitations of this approach.

Building the single-architecture images

The first step is to build the single-platform images. For the images which have the same architecture as your host architecture you could use the "normal" build and push commands, for example:

docker build -t myimage:latest-x64 -f myimage.dockerfile .
docker push myimage:latest-x64

This approach is fine if you're building on the same underlying architecture, but it's generally recommended to use the buildx commands instead these days, so that's what I show below.

In the example below, I'm building the andrewlockdd/alpine-clang image. I'm using a different dockerfile for each architecture: alpine.build.dockerfile on x64 (amd64) and alpine.build.arm64.dockerfile on arm64. In each case I run buildx build with a single platform passed to --platform.

First we have the command to build and push the x64 image:

docker buildx build \
  -t andrewlockdd/alpine-clang \
  -f alpine.build.dockerfile \
  --platform linux/amd64 \
  --provenance false \
  --output push-by-digest=true,type=image,push=true \
  .

This builds the x64 image and pushes it directly to docker hub.

Don't worry about the --provenance and --output parameters for now; we'll come back to them shortly.

The command to build the arm64 image is essentially identical:

docker buildx build \
  -t andrewlockdd/alpine-clang \
  -f alpine.build.arm64.dockerfile \
  --platform linux/arm64 \
  --provenance false \
  --output push-by-digest=true,type=image,push=true \
  .

The build output from these commands looks something like the following:

[+] Building 3613.7s (14/15)                                                              docker-container:mybuilder
 => [internal] load build definition from alpine.build.dockerfile                                               0.0s
 => => transferring dockerfile: 1.46kB                                                                          0.0s
 => resolve image config for docker-image://docker.io/docker/dockerfile:1.6                                     1.0s
 => [auth] docker/dockerfile:pull token for registry-1.docker.io                                                0.0s
 => CACHED docker-image://docker.io/docker/dockerfile:1.6@sha256:ac85f380a63b13dfcefa89046420e1781752bab202122  0.0s
 => => resolve docker.io/docker/dockerfile:1.6@sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf  0.0s
 => [internal] load metadata for docker.io/library/alpine:3.14                                                  0.5s
 => [auth] library/alpine:pull token for registry-1.docker.io                                                   0.0s
 => [internal] load .dockerignore                                                                               0.0s
 => => transferring context: 55B                                                                                0.0s
 => [internal] load build context                                                                               0.0s
 => => transferring context: 54B                                                                                0.0s
 => [base 1/2] FROM docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f733e688e3a6733cb5ab1ac6e3cb4603a5dd56  0.0s
 => => resolve docker.io/library/alpine:3.14@sha256:0f2d5c38dd7a4f4f733e688e3a6733cb5ab1ac6e3cb4603a5dd564e5bf  0.0s
 => CACHED [base 2/2] RUN apk update         && apk upgrade         && apk add --no-cache         cmake         0.0s
 => CACHED [build-llvm-clang 1/2] COPY alpine_build_patch_llvm_clang.sh alpine_build_patch_llvm_clang.sh        0.0s
 => [build-llvm-clang 2/2] RUN git clone --depth 1 --branch llvmorg-16.0.6 https://github.com/llvm/llvm-pro  3151.6s
 => [final 1/1] RUN --mount=target=/llvm-project,from=build-llvm-clang,source=llvm-project,rw cd /llvm-projec  25.3s
 => exporting to image                                                                                        395.6s
 => => exporting layers                                                                                       145.4s
 => => exporting manifest sha256:9df972530f876295787deea7424db90cbd14d5a8fa602b2a3bce82977aa1025e               0.0s
 => => exporting config sha256:957c30684ee6d3d5defb646d5b2c19c7a57ebce4a485b1a47743afbc503e2d98                 0.0s
 => => pushing layers                                                                                         250.2s
 => [auth] andrewlockdd/alpine-clang:pull,push token for registry-1.docker.io                                   0.0s

Most of this output isn't particularly interesting. The one thing we're looking for is the sha256 digest of the image manifest (the "exporting manifest" line), which in the block above is sha256:9df972530f876295787deea7424db90cbd14d5a8fa602b2a3bce82977aa1025e. The important section of the arm64 build output looks like this:

[+] Building 3.0s (11/11) FINISHED                                                        docker-container:mybuilder
 ...
 => exporting to image                                                                                          0.9s
 => => exporting layers                                                                                         0.0s
 => => exporting manifest sha256:038adbc4d6dc2e28f0818d5ae0fc1cae6cc42b854bd809f236435bed33f6ea63               0.0s
 => => exporting config sha256:36d2ae374bbe2d18a4ea8e275224be28d8d834b74fe24aec60a61be11f143e74                 0.0s
 => => pushing layers                                                                                           0.8s
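If you'd rather not copy the digest out of the log by hand, a small filter can pull it out of saved build output. This is a sketch that assumes the log contains an "exporting manifest sha256:..." line like the ones shown above; the function name is made up, and you would likely need to capture the build's stderr as well as stdout, because the progress output may be written there.

```shell
#!/bin/sh
# Print the image manifest digest found in buildx progress output on stdin.
# Matches "exporting manifest sha256:<64 hex chars>" only, so lines such as
# "exporting manifest list ..." or "exporting attestation manifest ..." are
# ignored. Prints nothing if no such line is present.
extract_manifest_digest() {
  sed -n 's/.*exporting manifest \(sha256:[0-9a-f]\{64\}\).*/\1/p' | head -n 1
}

# Possible usage (not run here):
#   docker buildx build ... 2>&1 | tee build.log
#   DIGEST=$(extract_manifest_digest < build.log)
```

Scraping logs is fragile, so treat this as a convenience rather than a contract. buildx also has a --metadata-file option that writes the build result as JSON, which may be a more robust source for the digest.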

If you go to hub.docker.com at this point and look at the tags, you might be surprised to see that it's empty 😬

Image of the docker hub UI andrewlockdd/alpine-clang showing there are no tags

So what's going on here, did the build work or not? 🤔

Where are the images?

There's no need to worry, the images were built and pushed to the image repository, just as the logs suggest. However, instead of being tagged (e.g. with :latest or :v1.0), they were pushed using their digest directly. We achieved that by adding the push-by-digest=true option to the --output parameter in the buildx command.

You might be wondering why we took that approach, instead of pushing with a tag like "normal". The answer is specifically to give the behaviour you've just seen: we don't want the "single architecture" images to be accessed as "normal".

If you don't want this approach, you could certainly give each architecture image a different tag, 1.0-x64 and 1.0-arm64 for example, and push these single-architecture images to the repository. These images would then show up as separate tags in your docker repository:

A single architecture tagged image

and we could later combine these into a single image (as we'll do in the next section). Personally I prefer the "digest" approach I showed originally, as it avoids cluttering up the tag list.

If you want some evidence that the image was pushed successfully, you can always inspect the manifest of the docker image, using the digest to reference the image. For example, the following pulls the manifest for the arm64 image we pushed previously:

docker manifest inspect andrewlockdd/alpine-clang@sha256:038adbc4d6dc2e28f0818d5ae0fc1cae6cc42b854bd809f236435bed33f6ea63

Note that to refer to a specific digest you use the @ separator, i.e. IMAGE@sha256:DIGEST whereas for tags you use the : separator: IMAGE:TAG
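When scripting, the separator rule is easy to get wrong, so it can help to wrap it in a helper. The following sketch uses a hypothetical function name, image_ref, and treats any value starting with sha256: as a digest.

```shell
#!/bin/sh
# Build an image reference from a repository and either a tag or a digest.
# Values starting with "sha256:" are treated as digests and joined with "@";
# anything else is treated as a tag and joined with ":".
image_ref() {
  case "$2" in
    sha256:*) echo "$1@$2" ;;
    *)        echo "$1:$2" ;;
  esac
}

image_ref example/app 1.0          # prints: example/app:1.0
image_ref example/app sha256:abc   # prints: example/app@sha256:abc
```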

This returns the JSON manifest for the image, which describes the layers in the image and the configuration of the image.

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "digest": "sha256:36d2ae374bbe2d18a4ea8e275224be28d8d834b74fe24aec60a61be11f143e74",
    "size": 1103
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "digest": "sha256:4983c3fe2029a430985943e6d87b35248366efd28cee655acc3ebff5daf703fa",
      "size": 3339494
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "digest": "sha256:ac6a4ede3e9444fba78ba2e47affc65870c27dd687f8b8ac494cc09bae09746c",
      "size": 253664375
    }
  ]
}

So our single-arch images have been built, now we just need to combine them into a single multi-arch image.

Combining single-arch images into a single multi-arch image

Before I show you the commands, it's worth understanding what a multi-arch image actually is. The docker manifest inspect command in the previous section shows how the docker image is a collection of layers of gzipped tar files, along with some configuration, all described and linked to by the manifest file.

A multi-arch image is described by a manifest list file. At its heart, this is essentially a collection of manifests. So to create a multi-arch image, we need to create a manifest list that contains a pointer to each of the single-arch image manifests.

Luckily there's a built-in docker command for doing that, docker manifest create. The following command shows how we can create a multi-arch image (called andrewlockdd/alpine-clang:1.0) by pointing to each of our single-arch images using their digests:

docker manifest create andrewlockdd/alpine-clang:1.0 \
  --amend andrewlockdd/alpine-clang@sha256:038adbc4d6dc2e28f0818d5ae0fc1cae6cc42b854bd809f236435bed33f6ea63 \
  --amend andrewlockdd/alpine-clang@sha256:9df972530f876295787deea7424db90cbd14d5a8fa602b2a3bce82977aa1025e

If you opted for the "tags" approach using 1.0-arm64 etc, you can replace the digest references with the tags, i.e. instead of andrewlockdd/alpine-clang@sha256:038adbc4... use andrewlockdd/alpine-clang:1.0-arm64.

If all went well, this prints Created manifest list docker.io/andrewlockdd/alpine-clang:1.0 and we can push the manifest list to docker hub:

docker manifest push andrewlockdd/alpine-clang:1.0
sha256:b27a1e358b1158c3e750dbbffc8ef93c5f17bfb4db6754d284953bab00e8f54a
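If you have more than a couple of architectures, typing out the create and push commands gets tedious. The sketch below only prints the two commands for a given repository, tag, and list of digests, so you can review them before running anything; the function name is an assumption made for illustration.

```shell
#!/bin/sh
# Print the `docker manifest` commands that would combine the given digests
# into REPO:TAG. Nothing is executed here; review the output, then run it
# yourself (for example by piping it to `sh`).
print_manifest_commands() {
  repo=$1
  tag=$2
  shift 2
  cmd="docker manifest create $repo:$tag"
  for digest in "$@"; do
    cmd="$cmd --amend $repo@$digest"
  done
  echo "$cmd"
  echo "docker manifest push $repo:$tag"
}

# Possible usage, with the digests from the two builds:
#   print_manifest_commands andrewlockdd/alpine-clang 1.0 "$ARM64_DIGEST" "$X64_DIGEST"
```

Printing rather than executing keeps the sketch safe to experiment with, since docker manifest push changes what the tag points to in the registry.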

And that's it, we have just created our multi-arch image! If we look on docker hub, we can see that the image has been created, and it supports both linux/amd64 and linux/arm64:

Image of the multi-arch image on docker hub

For completeness, let's check what the manifest looks like for our new multi-arch image by running:

docker manifest inspect andrewlockdd/alpine-clang:1.0

The resulting JSON looks similar to the previous JSON, but this has a media type ending in manifest.list.v2+json (instead of manifest.v2+json), and it contains an array of manifests, because this is a manifest list document, not a manifest document:

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 703,
         "digest": "sha256:038adbc4d6dc2e28f0818d5ae0fc1cae6cc42b854bd809f236435bed33f6ea63",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 904,
         "digest": "sha256:9df972530f876295787deea7424db90cbd14d5a8fa602b2a3bce82977aa1025e",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      }
   ]
}

So there we have it, a multi-arch image, built up from two individually built single-arch images.

There's one thing I haven't explained yet though: why did we set --provenance false when building the original images?

What about my --provenance?

Provenance attestations are a relatively new feature for docker. They include details such as:

  • Build timestamps
  • Build parameters and environment
  • Version control metadata
  • Source code details
  • Materials (files, scripts) consumed during the build

and can be used as part of documenting/securing your software supply chain. In the latest versions of docker (and hence buildx), provenance attestation is enabled by default.

In general, this seems like a good idea, right? Absolutely, but unfortunately it messes with the process I described above 😅

The problem is that if you run docker buildx build and don't set --provenance false:

docker buildx build \
  -t andrewlockdd/alpine-clang:1.0-arm64 \
  -f alpine.build.arm64.dockerfile \
  --platform linux/arm64 \
  --push \
  .

and we check the manifest produced using

docker manifest inspect andrewlockdd/alpine-clang:1.0-arm64

Then we can see that the manifest is already a manifest list, or rather, it's an OCI image index:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 675,
      "digest": "sha256:680c4e5a622100dd2b79d3f9da0a6434a150bd250714efcc1f109bce1bdd54e6",
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 566,
      "digest": "sha256:b512226b9d375d0616afa5cbf55405182746af3f8ed4aed6f22899ebc8c02417",
      "platform": {
        "architecture": "unknown",
        "os": "unknown"
      }
    }
  ]
}

This is effectively the same as Docker's manifest list, but if you try to combine the resulting docker image using docker manifest create, you'll get an error:

$ docker manifest create andrewlockdd/alpine-clang:1.0 \
  --amend andrewlockdd/alpine-clang:1.0-arm64 \
  --amend andrewlockdd/alpine-clang:1.0-x64

docker.io/andrewlockdd/alpine-clang:1.0-arm64 is a manifest list

Given what we've seen, that makes sense, but it's not a very useful error, as it doesn't explain how to resolve it.
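One way to catch this earlier is to check the top-level mediaType of each image before trying to combine them. The following sketch classifies the JSON printed by docker manifest inspect; it assumes the pretty-printed layout shown above (one key per line, top-level mediaType first), and the function name is hypothetical.

```shell
#!/bin/sh
# Classify manifest JSON read from stdin by its first "mediaType" value.
# Prints "list" for a Docker manifest list or OCI image index, "image" for a
# single-image manifest, and "unknown" for anything else.
# Assumes pretty-printed JSON where the top-level mediaType appears first.
manifest_kind() {
  media_type=$(sed -n 's/.*"mediaType": *"\([^"]*\)".*/\1/p' | head -n 1)
  case "$media_type" in
    application/vnd.docker.distribution.manifest.list.v2+json|application/vnd.oci.image.index.v1+json)
      echo "list" ;;
    application/vnd.docker.distribution.manifest.v2+json|application/vnd.oci.image.manifest.v1+json)
      echo "image" ;;
    *)
      echo "unknown" ;;
  esac
}

# Possible usage (not run here):
#   docker manifest inspect andrewlockdd/alpine-clang:1.0-arm64 | manifest_kind
```

If this prints "list" for a single-arch image, that image was most likely pushed with attestations, and docker manifest create would reject it as shown above.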

You've already seen that one "fix" is to just disable provenance, and fall back to the older docker manifest format. The obvious downside is that you lose the provenance attestations!

Building multi-arch images while preserving provenance

Luckily, there's a way to both have your cake and eat it. You can leave provenance enabled and you can combine the images by using the docker buildx imagetools command.

For completeness, the following shows the whole process:

  • Build and push the single-arch images using digests (and including provenance)
  • Merge the images using docker buildx imagetools
  • View the final manifest (and provenance)

First we create the single-arch images. These are the same commands we used previously, but without the --provenance false argument:

# Build and push the x64 image
docker buildx build \
  -t andrewlockdd/alpine-clang \
  -f alpine.build.dockerfile \
  --platform linux/amd64 \
  --output push-by-digest=true,type=image,push=true \
  .

# Build and push the arm64 image
docker buildx build \
  -t andrewlockdd/alpine-clang \
  -f alpine.build.arm64.dockerfile \
  --platform linux/arm64 \
  --output push-by-digest=true,type=image,push=true \
  .

The important final output shows the digests we need:

=> exporting to image                                                                                                     2.5s
 => => exporting layers                                                                                                    0.0s
 => => exporting manifest sha256:baf6ac2adae703a67dd950db4ea643c43adebe0170bf7ce6757cd9738b970b29                          0.0s
 => => exporting config sha256:957c30684ee6d3d5defb646d5b2c19c7a57ebce4a485b1a47743afbc503e2d98                            0.0s
 => => exporting attestation manifest sha256:c1320b886d705acfb178ed5fd913c88e0b03e455b5cd7d7d46f9859cdeb9dd6c              0.1s
 => => exporting manifest list sha256:43e7b94e3c40edb95a4d0519c6e58f592f43a7ccf604b0524f2df5ac888c11bc                     0.0s
 => => pushing layers                                                                                                      1.5s
 => => pushing manifest for docker.io/andrewlockdd/alpine-clang                                                            0.8s

and

=> exporting to image                                                                                                     2.6s
 => => exporting layers                                                                                                    0.0s
 => => exporting manifest sha256:680c4e5a622100dd2b79d3f9da0a6434a150bd250714efcc1f109bce1bdd54e6                          0.0s
 => => exporting config sha256:36d2ae374bbe2d18a4ea8e275224be28d8d834b74fe24aec60a61be11f143e74                            0.0s
 => => exporting attestation manifest sha256:3778285b90c1da66e6489deff990d3eb36a1fd142ab03fcd29441d212371478a              0.1s
 => => exporting manifest list sha256:bf6ebf7b395a92a6c71c052dc443bf0ce90981687b475111327b5195ff3945cb                     0.1s
 => => pushing layers                                                                                                      1.4s
 => => pushing manifest for docker.io/andrewlockdd/alpine-clang                                                            1.0s

There are a lot of sha256 digests listed here 😅 The ones we need are for the manifest list, which includes both the image manifest and the attestation manifest. Finally, we use imagetools to create the combined multi-arch manifest:

docker buildx imagetools create \
  -t andrewlockdd/alpine-clang:1.0 \
  andrewlockdd/alpine-clang@sha256:bf6ebf7b395a92a6c71c052dc443bf0ce90981687b475111327b5195ff3945cb \
  andrewlockdd/alpine-clang@sha256:43e7b94e3c40edb95a4d0519c6e58f592f43a7ccf604b0524f2df5ac888c11bc
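With provenance enabled, the digest to pick is the one on the "exporting manifest list" line, not the plain "exporting manifest" line. The sketch below shows one way to script that, assuming build logs in the format shown above; both function names are made up, and the second function only prints the imagetools command rather than running it.

```shell
#!/bin/sh
# Print the manifest list digest found in buildx progress output on stdin.
# Only the "exporting manifest list sha256:..." line is matched.
extract_manifest_list_digest() {
  sed -n 's/.*exporting manifest list \(sha256:[0-9a-f]\{64\}\).*/\1/p' | head -n 1
}

# Print the imagetools command that would combine the given digests into
# REPO:TAG. Nothing is executed here.
print_imagetools_command() {
  repo=$1
  tag=$2
  shift 2
  cmd="docker buildx imagetools create -t $repo:$tag"
  for digest in "$@"; do
    cmd="$cmd $repo@$digest"
  done
  echo "$cmd"
}

# Possible usage (not run here), with one saved log per build:
#   ARM64=$(extract_manifest_list_digest < arm64.log)
#   X64=$(extract_manifest_list_digest < x64.log)
#   print_imagetools_command andrewlockdd/alpine-clang 1.0 "$ARM64" "$X64"
```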

If we inspect the new generated manifest using manifest inspect, we can see that it's a manifest list using the OCI Image Index format, and it includes four manifests: 2 for each image (an image manifest and an attestation manifest):

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 675,
      "digest": "sha256:680c4e5a622100dd2b79d3f9da0a6434a150bd250714efcc1f109bce1bdd54e6",
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 566,
      "digest": "sha256:3778285b90c1da66e6489deff990d3eb36a1fd142ab03fcd29441d212371478a",
      "platform": {
        "architecture": "unknown",
        "os": "unknown"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 870,
      "digest": "sha256:baf6ac2adae703a67dd950db4ea643c43adebe0170bf7ce6757cd9738b970b29",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "size": 566,
      "digest": "sha256:c1320b886d705acfb178ed5fd913c88e0b03e455b5cd7d7d46f9859cdeb9dd6c",
      "platform": {
        "architecture": "unknown",
        "os": "unknown"
      }
    }
  ]
}

It's also worth noting that buildx imagetools has its own inspect command, which provides a more compact and readable description of the manifest. For example, if you run

docker buildx imagetools inspect andrewlockdd/alpine-clang:1.0

The output is generally easier to read, and explicitly calls out the attestation manifests:

Name:      docker.io/andrewlockdd/alpine-clang:1.0
MediaType: application/vnd.oci.image.index.v1+json
Digest:    sha256:cb350dbbe9b9faac570357461e2db806c54f8535451829cf6827b5ab0c2d2d53

Manifests:
  Name:        docker.io/andrewlockdd/alpine-clang:1.0@sha256:680c4e5a622100dd2b79d3f9da0a6434a150bd250714efcc1f109bce1bdd54e6
  MediaType:   application/vnd.oci.image.manifest.v1+json
  Platform:    linux/arm64

  Name:        docker.io/andrewlockdd/alpine-clang:1.0@sha256:3778285b90c1da66e6489deff990d3eb36a1fd142ab03fcd29441d212371478a
  MediaType:   application/vnd.oci.image.manifest.v1+json
  Platform:    unknown/unknown
  Annotations:
    vnd.docker.reference.digest: sha256:680c4e5a622100dd2b79d3f9da0a6434a150bd250714efcc1f109bce1bdd54e6
    vnd.docker.reference.type:   attestation-manifest

  Name:        docker.io/andrewlockdd/alpine-clang:1.0@sha256:baf6ac2adae703a67dd950db4ea643c43adebe0170bf7ce6757cd9738b970b29
  MediaType:   application/vnd.oci.image.manifest.v1+json
  Platform:    linux/amd64

  Name:        docker.io/andrewlockdd/alpine-clang:1.0@sha256:c1320b886d705acfb178ed5fd913c88e0b03e455b5cd7d7d46f9859cdeb9dd6c
  MediaType:   application/vnd.oci.image.manifest.v1+json
  Platform:    unknown/unknown
  Annotations:
    vnd.docker.reference.digest: sha256:baf6ac2adae703a67dd950db4ea643c43adebe0170bf7ce6757cd9738b970b29
    vnd.docker.reference.type:   attestation-manifest
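If you want just the runnable images out of that listing, a small awk filter can pair each platform with its digest and skip the attestation entries. This is a sketch that assumes the text layout shown above, where attestation manifests report their platform as unknown/unknown; the function name is hypothetical.

```shell
#!/bin/sh
# Read `docker buildx imagetools inspect` text output on stdin and print one
# "platform digest" pair per image manifest. Entries whose platform is
# unknown/unknown (the attestation manifests above) are skipped.
list_platform_digests() {
  awk '
    $1 == "Name:" && $2 ~ /@sha256:/ { split($2, parts, "@"); digest = parts[2] }
    $1 == "Platform:" && $2 != "unknown/unknown" { print $2, digest }
  '
}

# Possible usage (not run here):
#   docker buildx imagetools inspect andrewlockdd/alpine-clang:1.0 | list_platform_digests
```

Because this depends on the human-readable layout, it may break if that layout changes between buildx versions.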

Things still look essentially the same on docker hub, so as long as whatever is using your docker image supports the newer OCI image format, it probably makes sense to include the provenance attestations.

Image of the multi-arch image on docker hub

This post showed one way to build your multi-arch images, but buildx itself also supports farming out building to various native nodes if that's something you are interested in. If you're building your images in GitHub Actions, then it's worth looking at this documentation which shows how to easily automate the above steps.

Summary

In this post I briefly described the easy approach to building multi-arch images using buildx and --platform, which I described in a previous post. Unfortunately, it's not always possible to use this approach, so you may need to build and push your single-arch images, and then combine the resulting images into a single multi-arch image.

In this post I showed how to build single-arch images using buildx with --provenance disabled so that you generate a single manifest per image. I then showed how to combine these images using docker manifest create to create a manifest list, which is a multi-arch image.

Finally, I showed how you can preserve the provenance attestation (which creates an OCI image index for each single-arch image) and then combine these using buildx imagetools create to create the final multi-arch image.

