Channel: Andrew Lock | .NET Escapades

Building LaTeX projects on Windows easily with Docker


In this short post I describe how I build LaTeX projects on a Windows machine by using Docker. There's nothing particularly novel or exciting about this; someone just asked me about it recently, so this is effectively my reply!

Typesetting documents with LaTeX

I love using markdown for writing documentation, blog posts, notes, or anything I can. Even with a very simple editor, you can have something that roughly looks like the markup that you want to produce. But markdown is generally tied closely to HTML (though this is not strictly required), and sometimes you want or need to produce a standalone document, like a PDF, for which HTML isn't generally well suited.

For those rare cases where I want to produce a nice looking document, such as a CV, I fall back to LaTeX. LaTeX is a markup language that is explicitly about typesetting documents, so it cares deeply about pages and margins, which contrasts with HTML markup. You then use a TeX distribution to render the LaTeX document as a PDF (or other format).

LaTeX performs the same general function as WYSIWYG editors like Google Docs or Microsoft Word. However, LaTeX is not WYSIWYG (it requires an explicit rendering step), which can make it harder to use in general, but that separation can also enable broader optimisations and produce better looking documents overall.

LaTeX is very common in the scientific community in particular, with papers often being written in LaTeX. It is also particularly good for writing large documents, which is why I used it to write my PhD thesis. However, that was years ago. When I came to work with LaTeX again recently, I was reminded that working with it on Windows was not obvious.

Rendering LaTeX on Windows

LaTeX comes very much from the *Nix world, which means that it's very open and available, but also that there are a thousand different "suggested" approaches to rendering LaTeX documents. For people who want choice, or who care about the little details of each distribution, that's great. For people who just want to "install something" and get on with it, getting started can be confusing 😅

Just to be clear, I am very much a beginner with LaTeX, despite having used it a lot 15 years ago. I still find it has a very steep learning curve (softened somewhat these days by StackOverflow and ChatGPT), but there are some things it just does better than WYSIWYG editors, so I lean on it where I can now and again.

When I was working on my PhD, I'm pretty sure I used TeXnicCenter, but it's been a decade since there were any updates there, and you need to manage your own TeX distribution (e.g. MiKTeX) directly, if I understand correctly. Other common suggestions include Texmaker, TeXstudio, or just using VS Code, but these all still require you to manage the TeX distribution yourself. Unfortunately I have bad memories of manually managing MiKTeX on Windows and really wanted to avoid that if I could help it.

For a long while, I was using Overleaf, an online-only platform, for occasionally rendering LaTeX documents, and it worked very well for my needs. I was only rendering relatively short documents and didn't particularly need a lot of IDE or collaborative features.

Overleaf

Unfortunately, Overleaf made the individual free-plan basically unusable (rendering a two page document was "too complex"), and I couldn't justify paying a subscription for something I only used a couple of times a year.

Using Docker seemed like the obvious choice to solve my issue: I could use VS Code as my editor (it has built-in syntax support for LaTeX), and then render the documents using a self-contained distribution.

Overleaf itself is open source, and provides Docker images for running the community edition. I looked into this initially, but it was overkill for what I needed: I didn't need multiple accounts and project storage in the Overleaf implementation itself, I just needed the rendering part.

Building LaTeX with Docker using blang/latex

Eventually I found the blang/latex docker images, which do exactly what I want: an Ubuntu-based image with everything you need to compile locally in docker. The blang/latex images come in three different flavours:

  • blang/latex:ubuntu (Dockerfile) - Ubuntu TexLive distribution: Old but stable, most needed packages (3.9GB)
  • blang/latex:ctanbasic (Dockerfile) - CTAN TexLive Scheme-basic: Up-to-date, only basic packages, base for custom builds (500MB)
  • blang/latex:ctanfull (Dockerfile) - CTAN TexLive Scheme-full: Up-to-date, all packages (5.6GB)

The more complete versions are obviously quite big images, but as I was only going to be pulling them once locally and running them repeatedly to build, I wasn't particularly worried about that. For simplicity I went with the ctanfull tag which contains the full set of CTAN packages.

The docker hub page describes how you can use these images to build your LaTeX project, but I decided to go with something a bit different. I created two files in the root of my project.

  • build.ps1: This script runs (or restarts) the docker blang/latex container.
  • build.sh: This simple script runs inside the docker container, and does the latex build.

The scripts themselves are very simple. First we have the build.ps1 script:

# Try to start an existing container, if it exists
docker start -a -i "latex-builder"

if ($LASTEXITCODE -ne 0) {
  # The latex-builder container doesn't exist, so create and run it explicitly
  $ROOT_DIR="$PSScriptRoot"

  docker run -it `
    --mount "type=bind,source=$ROOT_DIR,target=/data" `
    --name "latex-builder" `
    blang/latex:ctanfull /bin/bash
}

This script first attempts to start an existing container named latex-builder and attach stdin/stdout/stderr. If the container already exists and is stopped, this simply restarts that container and connects to it. If no container named latex-builder exists, the script creates a new one from the blang/latex:ctanfull image, mounts the project root directory inside the container at /data, and opens a bash shell.

This doesn't strictly work as I would like: depending on how you later exit a re-started container, the exit code may be non-zero, which causes the script to (harmlessly) try to create a new container with the same name, fail, and show an error. It's not a big deal, but it's slightly annoying, and I'm sure I'm missing an obvious solution here!
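One possible way around the exit-code problem described above is to ask Docker whether the container exists before deciding what to do, instead of inferring it from the exit code of docker start. The following is a minimal sketch in POSIX shell rather than PowerShell, so it assumes you have a shell on the host (for example via WSL). The function names and the DOCKER and ROOT_DIR variables are hypothetical helpers introduced for illustration; the docker subcommands themselves (container inspect, start, run) are standard.

```shell
#!/bin/sh
# Hypothetical host-side launcher: a shell variant of build.ps1.
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the commands.
DOCKER="${DOCKER:-docker}"
NAME="latex-builder"
ROOT_DIR="${ROOT_DIR:-$PWD}"

container_exists() {
  # `docker container inspect` exits non-zero when no such container exists
  "$DOCKER" container inspect "$1" >/dev/null 2>&1
}

start_builder() {
  if container_exists "$NAME"; then
    # Re-attach to the existing container; its exit code no longer matters
    "$DOCKER" start -a -i "$NAME"
  else
    # First run: create the container and mount the project at /data
    "$DOCKER" run -it \
      --mount "type=bind,source=$ROOT_DIR,target=/data" \
      --name "$NAME" \
      blang/latex:ctanfull /bin/bash
  fi
}
```

Calling start_builder at the end of the script gives the same behaviour as build.ps1, except that a non-zero exit from the shell inside the container no longer triggers a second docker run.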

Once the container is running, you can build by running ./build.sh inside the container. The build.sh script is specific to each project and looks something like this:

#!/bin/sh

lualatex --jobname="My CV"  main.tex

This simple script runs lualatex to render the main.tex LaTeX document as a PDF. Thanks to mounting the directory, this appears on the "Windows side" automatically. I keep the docker container running in a window, iterate on the LaTeX files in an editor, and then when I want to see the changes, quickly flick to the terminal and run build.sh in the container. It's exactly the workflow I was hoping for.
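If you would rather trigger the build from the host instead of switching to the container's terminal, docker exec can run build.sh inside the already-running container. This is only a sketch under a few assumptions: the container is named latex-builder and is currently running, build.sh sits in the mounted project root, and you have a POSIX shell on the host. The build_in_container function and the DOCKER variable are hypothetical names used for illustration.

```shell
#!/bin/sh
# Hypothetical host-side helper: run the project's build.sh inside the
# running latex-builder container without leaving the editor's terminal.
DOCKER="${DOCKER:-docker}"
NAME="latex-builder"

build_in_container() {
  # -w sets the working directory explicitly to the mounted project root
  "$DOCKER" exec -w /data "$NAME" ./build.sh
}
```

Because the same container is reused, the one-time setup that lualatex performs on its first run in a fresh container is not repeated on later builds.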

Note that you could also use pdflatex or latexmk instead of lualatex if you prefer, as described in the docs for blang/latex.
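As a rough illustration of the latexmk option, a build.sh could wrap the call in a small function like the one below. The -lualatex and -jobname options are standard latexmk options, but the build_pdf function and the LATEXMK variable are hypothetical names introduced here, and a job name containing spaces may need extra care with latexmk, so treat this as a starting point rather than a drop-in replacement.

```shell
#!/bin/sh
# Sketch of a latexmk-based build.sh (an alternative to calling lualatex directly).
# LATEXMK can be overridden to point at a different executable.
LATEXMK="${LATEXMK:-latexmk}"

build_pdf() {
  # $1 = job name (base name of the output files), $2 = main .tex file
  # -lualatex tells latexmk to use the LuaLaTeX engine; latexmk then re-runs
  # the engine as many times as needed to resolve references.
  "$LATEXMK" -lualatex -jobname="$1" "$2"
}
```

In a real build.sh you would finish with a call such as build_pdf "My CV" main.tex.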

Why use two separate files?

You might be wondering why I opted to keep the docker container running, which requires both a build.ps1 and a build.sh file. It's true that you could run everything in one command. For example, you could combine the scripts above as follows:

$ROOT_DIR="$PSScriptRoot"

docker run -it --rm `
  --mount "type=bind,source=$ROOT_DIR,target=/data" `
  --name "latex-builder" `
  blang/latex:ctanfull `
  lualatex --jobname="My CV"  main.tex

The problem with this approach is that it starts a new container every time. But when you first run luatex (and, I assume, pdflatex and others), it performs some one-time setup, similar to the following:

root@67d3d81c4129:~# ./build.sh
This is LuaTeX, Version 1.0.4 (TeX Live 2017)
 restricted system commands enabled.
(./main.tex
LaTeX2e <2017-04-15>
(using cache: /usr/local/texlive/2017/texmf-var/luatex-cache/generic)
luaotfload | main : initialization completed in 0.362 seconds
Babel <3.16> and hyphenation patterns for 1 language(s) loaded.
(./example.cls
Document Class: example 2013/02/09 v1.3.0 example class
(/usr/local/texlive/2017/texmf-dist/tex/latex/base/size11.clo
luaotfload | db : Font names database not found, generating new one.
luaotfload | db : This can take several minutes; please be patient.

As you can see, on this "first run", luatex has to build the font-name database cache. On my old laptop this only takes 5-10s, but it's an annoying extra delay for the "inner-loop" experience, where the remainder of the build only takes a couple of seconds. By reusing an existing container we can skip that initialization entirely, and go straight to the build!
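If you did want to stick with one-shot containers, one option worth trying is to keep the cache in a named Docker volume so it survives between runs. The sketch below is untested against this image and rests on an assumption: that the font-name database is written under the cache directory shown in the log above (/usr/local/texlive/2017/texmf-var/luatex-cache). The volume name latex-font-cache, the one_shot_build function, and the DOCKER variable are all hypothetical.

```shell
#!/bin/sh
# Hypothetical one-shot build that persists the LuaTeX cache in a named volume.
DOCKER="${DOCKER:-docker}"
CACHE_VOLUME="latex-font-cache"   # hypothetical volume name
# Assumption: cache location taken from the "using cache:" line in the log
CACHE_DIR="/usr/local/texlive/2017/texmf-var/luatex-cache"

one_shot_build() {
  # The bind mount exposes the project; the volume mount keeps the cache
  # directory alive after the --rm container is removed.
  "$DOCKER" run --rm \
    --mount "type=bind,source=$PWD,target=/data" \
    --mount "type=volume,source=$CACHE_VOLUME,target=$CACHE_DIR" \
    blang/latex:ctanfull \
    lualatex --jobname="My CV" main.tex
}
```

If the assumption about the cache path holds, only the first run should pay the font-database cost. If it does not, the build should still work, just without the speed-up.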

That said, if you also want a build that runs in CI, then the one-shot approach obviously makes sense. This is effectively what the GitHub Actions yml file below does, by calling ./build.sh directly instead of starting a shell:

name: Build
on:
  push:
    branches: [ "main" ]
  pull_request:
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: build in docker
        run: docker run --mount type=bind,source="${PWD}",target=/data $IMAGE /data/build.sh
        env:
          IMAGE: blang/latex:ctanfull
      - uses: actions/upload-artifact@v4.4.3
        with:
          name: pdf
          path: 'My CV.pdf'
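One thing to consider for CI is that nobody is there to answer an interactive TeX error prompt. A CI-oriented build.sh might therefore pass the engine's --interaction=nonstopmode and --halt-on-error options, so a broken document stops at the first error and the step fails with a non-zero exit code. This is a sketch, not the script used in this post; the build_ci function and the TEX_ENGINE variable are hypothetical names.

```shell
#!/bin/sh
# Sketch of a CI-friendly build function.
# TEX_ENGINE can be overridden, e.g. TEX_ENGINE=pdflatex.
TEX_ENGINE="${TEX_ENGINE:-lualatex}"

build_ci() {
  # $1 = job name, $2 = main .tex file
  # nonstopmode: never wait for keyboard input when an error occurs
  # halt-on-error: stop at the first error instead of carrying on
  # The function's exit status is the engine's exit status.
  "$TEX_ENGINE" --interaction=nonstopmode --halt-on-error --jobname="$1" "$2"
}
```

Ending the script with build_ci "My CV" main.tex means the docker run step in the workflow fails whenever the document does not compile, so the upload step is not reached with a stale or missing PDF.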

Overall I find this setup gives me the best of all worlds, but the real win here is not having to worry about managing dependencies or Windows TeX distributions. Instead I lean on people who are more knowledgeable about these things to package everything neatly in a Docker image for me!

What's in the dockerfile?

Whenever I'm using random docker images like this, I like to take a look at the Dockerfile to see what's going on under the hood. The blang/latex:ctanbasic image contains most of the setup:

FROM ubuntu:xenial
MAINTAINER Benedikt Lang <mail@blang.io>
ENV DEBIAN_FRONTEND noninteractive

# Add the prerequisite packages
RUN apt-get update -q \
    && apt-get install -qy build-essential wget libfontconfig1 \
    && rm -rf /var/lib/apt/lists/*

# Install TexLive with scheme-basic
RUN wget http://mirror.ctan.org/systems/texlive/tlnet/install-tl-unx.tar.gz; \
	mkdir /install-tl-unx; \
	tar -xvf install-tl-unx.tar.gz -C /install-tl-unx --strip-components=1; \
    echo "selected_scheme scheme-basic" >> /install-tl-unx/texlive.profile; \
	/install-tl-unx/install-tl -profile /install-tl-unx/texlive.profile; \
    rm -r /install-tl-unx; \
	rm install-tl-unx.tar.gz

ENV PATH="/usr/local/texlive/2017/bin/x86_64-linux:${PATH}"

ENV HOME /data
WORKDIR /data

# Install latex packages
RUN tlmgr install latexmk

VOLUME ["/data"]

There's not loads going on there: from an Ubuntu 16.04 base image (which is getting outdated, and probably needs updating soon) the image installs prerequisites like wget and build-essential, and then installs a TexLive distribution, and adds it to the path.

The blang/latex:ctanfull image that I use in this post is based on this image, but uses the "full" scheme:

FROM blang/latex:ctanbasic
MAINTAINER Benedikt Lang <mail@blang.io>

RUN tlmgr install scheme-full

I might think about trying to create updated versions of these images, but honestly, as long as they keep working, I probably won't 🙈

The local editor experience

For the local editing experience, I've found that VS Code is good enough for me, and I'm obviously very familiar with it. VS Code has LaTeX syntax support built in, so I simply added a PDF Viewer extension so that I can have the rendered result open on the right side of the window while editing on the left, with a terminal open at the bottom. Add in a spell-checker and there's not much more I need for local development:

An image of my local dev setup using VS Code

That covers my local development setup, my build process, and the simple CI that I have for projects. It's easy to copy the build.ps1, build.sh, and build.yml files between projects whenever I need to. And if I come back to it 6 months later, it's obvious what I need to do: run the build.* script that makes sense (.ps1 for Windows and .sh once running inside the docker container)!

Summary

In this post I described how I render LaTeX projects using the blang/latex docker image, which contains a full LaTeX distribution. I described how and why I start a long-lived docker container for the project, and then repeatedly re-render the project using lualatex. Finally I showed the setup I use with VS Code, a PDF rendering extension, and a spellchecker.
