
In this post I provide the solution to a very specific situation in GitLab:
- You have a job that calculates various variables for the current pipeline.
- You want to use those variables to decide whether to run a subsequent job or not.
It turns out this is a lot harder to do than I expected when I started trying to do it…
The setup: conditionally running jobs when doing a release
I was recently working on the CI and release process for the Datadog .NET Client library. As part of the release process we take some pre-built Docker images and push them to our public Docker repositories.
Previously we were tagging those public images with one or two tags, depending on the release:
- vMajor.Minor.Patch (for example v1.2.3 or v2.3.5-prerelease).
- latest (if the tag was not a prerelease).
The code and logic for doing the retagging was encapsulated in a GitLab pipeline in a separate repository, so our original GitLab pipeline was pretty simple. We only run this when doing a release, in which we set a git tag on the repository (which appears in the CI_COMMIT_TAG variable). The whole pipeline looked a bit like this:
stages:
  - deploy

deploy_major_minor_patch:
  stage: deploy
  rules:
    - if: '$CI_COMMIT_TAG =~ /^v[0-9]+\.[0-9]+\.[0-9]+(-prerelease)?$/'
      when: always
  trigger:
    project: DataDog/public-images
    branch: main
    strategy: depend
  variables:
    IMG_SOURCES: ghcr.io/datadog/dd-trace-dotnet/dd-lib-dotnet-init:$CI_COMMIT_SHA
    IMG_DESTINATIONS: dd-lib-dotnet-init:$CI_COMMIT_TAG # Use the git tag as the docker tag
    IMG_SIGNING: "false"

deploy_latest:
  stage: deploy
  rules:
    # Don't deploy latest tag if prerelease
    - if: '$CI_COMMIT_TAG =~ /^v[0-9]+\.[0-9]+\.[0-9]+$/'
      when: always
  trigger:
    project: DataDog/public-images
    branch: main
    strategy: depend
  variables:
    IMG_SOURCES: ghcr.io/datadog/dd-trace-dotnet/dd-lib-dotnet-init:$CI_COMMIT_SHA
    IMG_DESTINATIONS: dd-lib-dotnet-init:latest # Fixed docker tag
    IMG_SIGNING: "false"
These two jobs are virtually identical. The only differences are:
- deploy_major_minor_patch
  - Runs on any tag of the format vMajor.Minor.Patch(-prerelease)
  - Uses the commit tag as the docker tag
- deploy_latest
  - Only runs on tags of the format vMajor.Minor.Patch. Does not run on -prerelease tags.
  - Uses the fixed value latest as the docker tag
We recently decided we wanted to also push vMajor and vMajor.Minor tags, so that it's possible to pull the latest version of a specific major version. This is common practice for most docker repositories. For example, the .NET Runtime docker containers add all the following tags to the same "latest" Debian 12 image:
8.0.3-bookworm-slim-amd64
8.0-bookworm-slim-amd64
8.0.3-bookworm-slim
8.0-bookworm-slim
8.0.3
8.0
latest
On the face of it this seemed like an easy requirement. It was only when it came to implementing it that I realized it's harder to codify than I originally thought.
Calculating the required tags for a release
The difficulty in the requirements is that you can't calculate the required tags based solely on the current release tag. The tags that you add to a release now depend on which other releases have already happened.
For example, if I release version 2.3.1 then we're always going to need the vMajor.Minor.Patch tag. Similarly, if we assume our patch numbers are monotonically increasing, then we know that each new release will need the vMajor.Minor tag (v2.3) too. But what about the latest and v2 tags? We can't tell if we need those based solely on this release version. Has there already been a 2.5.0 release? What about a 3.0.0 release?
Thinking this through made me realise our current GitLab pipeline was also fundamentally broken, as it assumed that every non-prerelease version was the latest, which is not the case! Luckily, I don't think we've actually been caught out by this in practice yet.
Working through the cases, we came up with the following rules:
- Every release gets the vMajor.Minor.Patch version tag (with optional -prerelease suffix)
  - 2.5.0 gets v2.5.0
  - 2.2.3 gets v2.2.3
  - 1.2.1 gets v1.2.1
- Every release gets the vMajor.Minor version tag initially (which assumes we never "go back" in release values)
  - 2.5.0 gets v2.5
  - 2.2.3 gets v2.2
  - 1.2.1 gets v1.2
- Some releases get the vMajor version tag. Only releases for which this is the highest version in the major get the tag.
  - 2.5.0 gets v2 (if there's no higher 2.x.x release)
  - 1.2.1 gets v1 (if there's no higher 1.x.x release)
- Some releases get the latest tag. Only releases for which this is the highest version ever get the tag.
  - 2.5.0 gets latest if it's the highest release so far
  - 1.2.1 will not get latest if there are already 2.x.x releases
So we're now in a situation where we need to check what releases we already have, and we need to do that in a script, because we're going to have to fetch the releases, sort them, compare our tags etc. I'm not going to show that logic here, as this post is long enough as it is, but ultimately we end up with some bash variables in a script, and the question is how we can use them to solve our problem.
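To give a rough idea of the shape of that logic (this is not the script we actually use), here is a minimal sketch. It assumes the existing non-prerelease tags are already available as a list of vMajor.Minor.Patch strings, and that GNU `sort -V` is available for version ordering. The function names `highest_version` and `compute_tag_flags` are hypothetical, chosen only for illustration.

```shell
# Sketch only: assumes tags look like vMajor.Minor.Patch (no prerelease tags
# in the list of existing tags) and that GNU `sort -V` is available.

# Print the highest version among the arguments.
highest_version() {
  printf '%s\n' "$@" | sort -V | tail -n 1
}

# compute_tag_flags NEW_TAG [EXISTING_TAG...]
# Sets MAJOR_MINOR_VERSION, MAJOR_VERSION, IS_LATEST_TAG, IS_LATEST_MAJOR_TAG.
compute_tag_flags() {
  new_tag="$1"
  shift

  # Extract vMajor.Minor and vMajor from the new tag
  MAJOR_MINOR_VERSION="$(printf '%s\n' "$new_tag" | sed -nE 's/^(v[0-9]+\.[0-9]+)\.[0-9]+$/\1/p')"
  MAJOR_VERSION="$(printf '%s\n' "$new_tag" | sed -nE 's/^(v[0-9]+)\.[0-9]+\.[0-9]+$/\1/p')"

  # "latest" only applies if nothing higher exists anywhere
  IS_LATEST_TAG=0
  if [ "$(highest_version "$new_tag" "$@")" = "$new_tag" ]; then
    IS_LATEST_TAG=1
  fi

  # vMajor only applies if nothing higher exists within the same major
  highest_in_major="$(printf '%s\n' "$new_tag" "$@" | grep "^${MAJOR_VERSION}\." | sort -V | tail -n 1)"
  IS_LATEST_MAJOR_TAG=0
  if [ "$highest_in_major" = "$new_tag" ]; then
    IS_LATEST_MAJOR_TAG=1
  fi
}

# Example: releasing v1.32.0 when v1.31.0 and v2.0.0 already exist
compute_tag_flags v1.32.0 v1.31.0 v2.0.0
echo "IS_LATEST_TAG=${IS_LATEST_TAG} IS_LATEST_MAJOR_TAG=${IS_LATEST_MAJOR_TAG}"
# prints: IS_LATEST_TAG=0 IS_LATEST_MAJOR_TAG=1
```

How you fetch the list of existing tags (from git, a registry, or a releases API) is left out here, as it depends on your setup.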
Approaches that don't work
Before we get to the final solution, I'm going to describe some of the approaches I tried, and why they didn't work. That might seem unnecessary, but I think it's instructive to understand the approaches, as they may work for other situations.
Using variables from other jobs in rules patterns is not possible
In my first attempt, I was ignoring the "conditional" part, and wanted to see if I could pass variables from one job to another. My approach was:
- Add a new job generate-tag-values that generates the required tag values
- Store the tag values in artifacts
- Load the artifacts in the dependent trigger jobs, one for each image
I found the solution to this in a StackOverflow answer (and subsequently in the documentation). The answer is to write to the build.env file, export that as an artifact, and then import that artifact in the subsequent job.
So the pipeline looks something like this:
stages:
  - deploy

generate-tag-values:
  stage: deploy
  rules:
    # Don't deploy major/major.minor/latest on prerelease
    - if: '$CI_COMMIT_TAG =~ /^v[0-9]+\.[0-9]+\.[0-9]+$/'
      when: always
  script:
    # Extract the vMajor.Minor and vMajor from the tag
    - MAJOR_MINOR_VERSION="$(sed -nE 's/^(v[0-9]+\.[0-9]+)\.[0-9]+$/\1/p' <<< "${CI_COMMIT_TAG}")" # vMajor.Minor
    - MAJOR_VERSION="$(sed -nE 's/^(v[0-9]+)\.[0-9]+\.[0-9]+$/\1/p' <<< "${CI_COMMIT_TAG}")" # vMajor
    # Save the values in build.env
    - echo "MAJOR_MINOR_VERSION=${MAJOR_MINOR_VERSION}" >> build.env
    - echo "MAJOR_VERSION=${MAJOR_VERSION}" >> build.env
    # TODO: calculate these and set as appropriate
    - echo "IS_LATEST_MAJOR_TAG=1" >> build.env
    - echo "IS_LATEST_TAG=1" >> build.env
  artifacts:
    reports:
      dotenv: build.env # export build.env as an artifact

deploy_major:
  stage: deploy
  rules:
    - if: '$CI_COMMIT_TAG =~ /^v[0-9]+\.[0-9]+\.[0-9]+$/'
      when: always
  trigger:
    project: DataDog/public-images
    branch: main
    strategy: depend
  variables:
    IMG_SOURCES: ghcr.io/datadog/dd-trace-dotnet/dd-lib-dotnet-init:$CI_COMMIT_SHA
    IMG_DESTINATIONS: dd-lib-dotnet-init:$MAJOR_VERSION # Use the version from generate-tag-values
    IMG_SIGNING: "false"
  # needs the version from the generate-tag-values job
  needs:
    - job: generate-tag-values
      artifacts: true

deploy_major_minor:
  # ... very similar, omitted to save space
This works, but we're still always deploying the vMajor tag, regardless of whether we should be. We're not using the IS_LATEST_MAJOR_TAG or IS_LATEST_TAG variables anywhere in the downstream job. The downstream project doesn't have a variable we can pass to say "don't run me" (and it wouldn't really make sense if it did).
So ultimately we want to control whether a trigger job executes based on a variable that's defined in a previous job.
tl;dr; You can't 😢
Ideally I would have been able to do something like this, in which we use the IS_LATEST_MAJOR_TAG variable from the generate-tag-values job:
deploy_major:
  stage: deploy
  rules:
    # only run if IS_LATEST_MAJOR_TAG = 1
    - if: '$IS_LATEST_MAJOR_TAG == "1" && $CI_COMMIT_TAG =~ /^v[0-9]+\.[0-9]+\.[0-9]+$/'
Unfortunately, the rules are evaluated before the environment is loaded from the previous job, so this will always fail. It turns out that this is a dead-end.
Using the trigger API
One of the difficulties is that the downstream job is a trigger job that automatically runs another pipeline. If the job was a script, we could add a step to the start of the script that checks the IS_LATEST_MAJOR_TAG variable, and exits early if it isn't set to 1.
Well, we kind of can do this. There's an HTTP API you can use to trigger pipelines which is mentioned in the documentation. By turning the trigger job into a "normal" job, we can give ourselves another option:
deploy_major:
  stage: deploy
  rules:
    - if: '$CI_COMMIT_TAG =~ /^v[0-9]+\.[0-9]+\.[0-9]+$/'
      when: always
  script:
    # if we're not supposed to be tagging, then exit early (the job still reports success)
    - if [ "$IS_LATEST_MAJOR_TAG" -ne 1 ]; then exit 0; fi
    # trigger the pipeline using the API
    - |
      curl -sS --fail-with-body \
        --request POST \
        --form "token=$CI_JOB_TOKEN" \
        --form "ref=main" \
        --form "variables[IMG_SOURCES]=$IMG_SOURCES" \
        --form "variables[IMG_DESTINATIONS]=$IMG_DESTINATIONS" \
        --form "variables[IMG_SIGNING]=$IMG_SIGNING" \
        "https://gitlab.ddbuild.io/api/v4/projects/DataDog%2Fpublic-images/trigger/pipeline"
  variables:
    IMG_SOURCES: ghcr.io/datadog/dd-trace-dotnet/dd-lib-dotnet-init:$CI_COMMIT_SHA
    IMG_DESTINATIONS: dd-lib-dotnet-init:$MAJOR_VERSION # Use the version from generate-tag-values
    IMG_SIGNING: "false"
  # needs the version from the generate-tag-values job
  needs:
    - job: generate-tag-values
      artifacts: true
This almost works. But there are a couple of issues:
- We can't set strategy: depend when calling the API, which means we lose visibility if the downstream job fails.
- The job always runs, and is marked as success even if we bail out early.
  - There's a feature request to mark a job as skipped by exit code, but it's not possible currently.
  - Moving to before_script doesn't help, as we would need to use an error exit code, marking the job as failed, which again isn't really correct.
These limitations led me to look further, and I finally discovered the solution to my problem: dynamic child pipelines.
Using dynamic child pipelines to control trigger jobs
Dynamic child pipelines involve having a job that emits a GitLab YAML file as an artifact, and having a downstream job invoke this pipeline. As you can dynamically generate the pipeline, you can add or remove jobs as you see fit. The example linked in the documentation uses Jsonnet to generate the pipeline, but you can use anything you like to create the file.
As I was already pretty deep into the bash scripting, I decided to give it a try using HEREDOC and the end result wasn't too bad.
With this approach, our .gitlab-ci.yml file becomes a lot simpler, with just two jobs. The first job runs a bash script to calculate the tag names and which tags to add, and then generates a file called generated-config.yml. The second job simply executes this file:
stages:
  - deploy

# First job generates the generated-config.yml file
generate-tag-values:
  stage: deploy
  rules:
    - if: '$CI_COMMIT_TAG =~ /^v[0-9]+\.[0-9]+\.[0-9]+(-prerelease)?$/'
      when: always
  script:
    - ./generate-tags.sh
  artifacts:
    paths:
      - generated-config.yml # save the generated config as an output

# Second job executes the dynamic pipeline
deploy-lib-init-trigger:
  stage: deploy
  needs:
    - generate-tag-values
  trigger:
    include:
      - artifact: generated-config.yml
        job: generate-tag-values
    strategy: depend
All the work now happens in generate-tags.sh, which is shown below. Note that I've omitted all the error checking, along with the code that shows how to calculate the values, as I'm focused on the pipeline behaviour in this post.
#!/bin/bash
set -e

# TODO: Calculate all these values
echo "Calculated values:"
echo "This tag: $CI_COMMIT_TAG"                     # v1.32.0
echo "MAJOR_MINOR_VERSION=${MAJOR_MINOR_VERSION}"   # v1.32
echo "MAJOR_VERSION=${MAJOR_VERSION}"               # v1
echo "IS_LATEST_TAG=${IS_LATEST_TAG}"               # 0 (e.g. we already have v2.0.0)
echo "IS_LATEST_MAJOR_TAG=${IS_LATEST_MAJOR_TAG}"   # 1
echo "IS_PRERELEASE=${IS_PRERELEASE}"               # 0
echo "---------"

# Helper function for building the script.
# The heredoc body stays at column 0 so the emitted YAML is indented correctly.
add_stage() {
  STAGE_NAME="$1"
  DEST_TAG="$2"
  cat << EOF >> generated-config.yml
deploy_${STAGE_NAME}_docker:
  stage: trigger-public-images
  trigger:
    project: DataDog/public-images
    branch: main
    strategy: depend
  variables:
    IMG_SOURCES: ghcr.io/datadog/dd-trace-dotnet/dd-lib-dotnet-init:$CI_COMMIT_SHA
    IMG_DESTINATIONS: dd-lib-dotnet-init:$DEST_TAG
    IMG_SIGNING: "false"
EOF
}

# Generate the pipeline for triggering child jobs
cat << EOF > generated-config.yml
stages:
  - trigger-public-images
EOF

# We always add this tag, regardless of the version
add_stage "major_minor_patch" "$CI_COMMIT_TAG"

# If this is a pre-release version, we never add any other stages
if [ "$IS_PRERELEASE" -ne 1 ]; then
  # All non-prerelease stages get the major_minor tag
  add_stage "major_minor" "$MAJOR_MINOR_VERSION"

  # Only latest-major releases get the major tag
  if [ "$IS_LATEST_MAJOR_TAG" -eq 1 ]; then
    add_stage "major" "$MAJOR_VERSION"
  fi

  # Only latest releases get the latest tag
  if [ "$IS_LATEST_TAG" -eq 1 ]; then
    add_stage "latest" "latest"
  fi
fi

# All finished - print the generated yaml for debugging purposes
echo "Generated pipeline:"
cat generated-config.yml
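One detail the script relies on is that the heredoc delimiter is unquoted, so shell variables such as $DEST_TAG are expanded at the moment the file is generated. The following sketch contrasts the two forms. The function names `render_expanded` and `render_literal` and the `example-image` name are made up for illustration and are not part of the real script.

```shell
# Unquoted delimiter: variables are expanded while the file is being written,
# so the generated YAML contains the concrete value.
render_expanded() {
  DEST_TAG="$1"
  cat << EOF
IMG_DESTINATIONS: dd-lib-dotnet-init:$DEST_TAG
EOF
}

# Quoted delimiter: the body is emitted literally, so the generated YAML
# still contains the variable reference. That is only useful for variables
# that will exist when the child pipeline runs.
render_literal() {
  cat << 'EOF'
IMG_SOURCES: example-image:$CI_COMMIT_SHA
EOF
}

render_expanded "v1.32"   # prints: IMG_DESTINATIONS: dd-lib-dotnet-init:v1.32
render_literal            # prints: IMG_SOURCES: example-image:$CI_COMMIT_SHA
```

Values calculated in the script, like the destination tag, have to use the unquoted form, because they don't exist as variables in the child pipeline.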
When this script executes, the generated-config.yml file looks something like the following (example shown below for 1.32.0, which is the latest 1.x.x release, but we already have 2.x.x releases):
stages:
  - trigger-public-images
deploy_major_minor_patch_docker:
  stage: trigger-public-images
  trigger:
    project: DataDog/public-images
    branch: main
    strategy: depend
  variables:
    IMG_SOURCES: ghcr.io/datadog/dd-trace-dotnet/dd-lib-dotnet-init:3188ca6f05513f40e18b66c43cac5849ecd8c904
    IMG_DESTINATIONS: dd-lib-dotnet-init:v1.32.0
    IMG_SIGNING: "false"
deploy_major_minor_docker:
  stage: trigger-public-images
  trigger:
    project: DataDog/public-images
    branch: main
    strategy: depend
  variables:
    IMG_SOURCES: ghcr.io/datadog/dd-trace-dotnet/dd-lib-dotnet-init:3188ca6f05513f40e18b66c43cac5849ecd8c904
    IMG_DESTINATIONS: dd-lib-dotnet-init:v1.32
    IMG_SIGNING: "false"
deploy_major_docker:
  stage: trigger-public-images
  trigger:
    project: DataDog/public-images
    branch: main
    strategy: depend
  variables:
    IMG_SOURCES: ghcr.io/datadog/dd-trace-dotnet/dd-lib-dotnet-init:3188ca6f05513f40e18b66c43cac5849ecd8c904
    IMG_DESTINATIONS: dd-lib-dotnet-init:v1
    IMG_SIGNING: "false"
Note that we don't have a job for tagging :latest in here, because IS_LATEST_TAG=0 in this example, so we just didn't emit the job. When we execute the pipeline, it works exactly as we hoped, with the downstream jobs being executed.
The big downside to this approach is that it makes it harder to understand what's going on just from looking at the parent project .gitlab-ci.yml file, as the logic is partially encoded in the bash script. But on the other hand, it was the only way I could originally find to get the job done.
As it turns out, I subsequently realised we could take a simpler approach again thanks to the downstream "tagging" project allowing passing a comma-separated list of destinations. Given that we always generate at least one tag, that meant I could go back to the simpler build.env approach. But you might not be so lucky!
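As a rough sketch of that simpler approach, the same flags can be folded into a single comma-separated value. I'm assuming here that the downstream project accepts entries of the form image:tag separated by commas. The function name `build_destinations` and its argument order are hypothetical.

```shell
# build_destinations IMAGE TAG MAJOR_MINOR MAJOR IS_PRERELEASE IS_LATEST_MAJOR_TAG IS_LATEST_TAG
# Prints a comma-separated list of image:tag destinations (format is an assumption).
build_destinations() {
  image="$1"; tag="$2"; major_minor="$3"; major="$4"
  is_prerelease="$5"; is_latest_major="$6"; is_latest="$7"

  # The full version tag is always pushed
  destinations="${image}:${tag}"

  if [ "$is_prerelease" -ne 1 ]; then
    destinations="${destinations},${image}:${major_minor}"
    if [ "$is_latest_major" -eq 1 ]; then
      destinations="${destinations},${image}:${major}"
    fi
    if [ "$is_latest" -eq 1 ]; then
      destinations="${destinations},${image}:latest"
    fi
  fi

  printf '%s\n' "$destinations"
}

# In the job you would append this line to build.env instead of printing it
echo "IMG_DESTINATIONS=$(build_destinations dd-lib-dotnet-init v1.32.0 v1.32 v1 0 1 0)"
# prints: IMG_DESTINATIONS=dd-lib-dotnet-init:v1.32.0,dd-lib-dotnet-init:v1.32,dd-lib-dotnet-init:v1
```

Because the list always contains at least the full version tag, a single trigger job can always run, and no conditional job is needed.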
Summary
In this post I described a situation in which I wanted to use the result of a previous job in GitLab to control whether a subsequent job is skipped. Unfortunately, most of the "native" ways to do this inside the GitLab yaml have limitations or missing features that make this impossible.
The solution I settled on was to use dynamic pipelines. With dynamic pipelines, you emit a YAML file from a previous job, and then a subsequent job executes that YAML as a child pipeline. This gives you complete control over what to execute and allowed me to solve my problem, though it also makes it harder to understand your pipeline at a glance before it runs.