Automating Gentoo Package (ebuild) Upgrades

Yes, for some reason, I still run Gentoo as my distribution of choice. Ever since I first ran it as a young, weary-eyed 17-year-old with way too much time on my hands, I've continued to come back to it. I started with Ubuntu 12.04 and eventually moved on to Arch Linux (Antergos, then Manjaro, then vanilla Arch), but I found myself gravitating to Gentoo Linux for an interesting reason: as of 2015, the best Linux dual-boot guide out there was a Gentoo one, Sakaki's EFI Install Guide (Sakaki, if you're reading this, I miss the work you did!). It was also a stark contrast to the generally unhelpful community I experienced with Arch Linux. Ever since then, I've always gravitated back to Gentoo. Whether it's the flexible and simple ebuild system, the great package manager, or just being able to modify any package however I want, I love it.

Jared loves Gentoo, so...?

Putting that all aside, I wanted to take the time to write up a quick post showing off a system I've created for Gentoo ebuild management. It's no secret that Gentoo isn't the most popular distribution out there. That being said, packages are generally well maintained and updated upstream, but they tend to be marked as unstable (e.g., ~arm64) or not have the latest version available at all. For that reason, a lot of users end up running their own overlays, which contain their own ebuilds (package definitions, very similar to PKGBUILDs for those coming from Arch). I am no exception! I maintain my own overlay repository (jaredallard-overlay). However, I tend to distro-hop, and I've turned into an Apple Silicon purist (they are the best laptops! You can't argue that!). So, given the relative infancy of the Asahi Project, I'm not booted into that partition all that often. My packages tend... to get out of date quickly. This isn't great. So, I decided to automate it.

Automation

Now, you might be thinking, "How does automating this actually solve anything? Won't you have a ton of stability problems?" Well, you'd be right. It doesn't ensure the stability of my system at all, but it does let me upgrade or downgrade to whatever version I want, and blindly update (let's be real, most of us just run sudo apt full-upgrade anyways; if you're meticulously checking every version of every package, good on you). So, I trekked onwards anyways.

Updating a package (the ebuild)

Generally speaking, upgrading a package is pretty easy if you have an existing ebuild. Every ebuild I've seen builds its source URLs from variables Portage derives from the file name (like ${PV}, the package version). This allows us to update them by simply... renaming them. Yes, it's that simple. Take, for example, app-arch/7-zip on my overlay. If I rename the (at the time of writing) 7-zip-23.01.ebuild to 7-zip-23.02.ebuild and regenerate the manifest using ebuild <file> manifest, I can now install the mythical 23.02 version of 7-zip! Wow! Amazing! Almost all packages' ebuilds are like this (we'll address some special cases later), so I set off on building a system to just... copy a file every time the remote version changes.
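For example, the manual version of that bump (using the 7-zip case above) looks something like this from inside a checkout of the overlay:

# Manual version bump for app-arch/7-zip inside an overlay checkout.
cd app-arch/7-zip
cp 7-zip-23.01.ebuild 7-zip-23.02.ebuild

# Regenerate the Manifest so the checksums for the new source tarball
# (fetched via the version-derived SRC_URI) are recorded.
ebuild 7-zip-23.02.ebuild manifest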

How do we get the latest version?

The first problem I had to solve was determining the latest version itself. I ended up supporting two paths based on the packages I had in my overlay. Almost everything has a Git repository with tags, so we can easily determine the latest tag based on semver/date. One package I had (ironically, given I work here) did not have a repo because it's closed source. So, I built in APT repository version parsing as another option. That left me with a total of two resolvers:

  • Git repositories (tags, or latest commit sha)
  • APT repositories (for a given sources.list entry, use the latest available version)

The Git implementation was largely unremarkable, while the APT repository logic was actually a little challenging. I won't go into too much detail there, but let's just say it's all about control files. With those resolvers in place, I could build the rest of the functionality in! Copying a file, nice.
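To give a flavor of what each resolver boils down to, here's a rough shell-level illustration (not the updater's actual code; the APT URL and package name are placeholders):

# Git resolver: list the remote's tags without cloning and pick the
# highest version.
git ls-remote --tags --refs https://github.com/ip7z/7zip \
  | awk -F/ '{ print $NF }' \
  | sort -V | tail -n1

# APT resolver: a repository's Packages index is a series of Debian
# control stanzas; grab the Version field for the package in question.
curl -fsSL "https://example.com/apt/dists/stable/main/binary-arm64/Packages" \
  | awk '/^Package: some-package$/{hit=1} hit && /^Version:/{print $2; exit}'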

Laying the foundations

I really wanted to build a declarative system with flexibility for custom ebuild update scenarios, as I had a particularly difficult ebuild (that I'll get into more detail about later) that I needed to support. So, I ended up creating a packages.yml file that is just a map of packages with options. Below is a short example of it:

app-arch/7-zip:
  resolver: git
  options:
    url: https://github.com/ip7z/7zip

By default, the action taken is just to copy the existing ebuild to a new file with the new version in the file name and regenerate the manifest. For those curious, I ended up building an ebuild parser to determine version information, among other things. Easy process, right...?
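To give an idea of what that parsing involves (a minimal sketch, not the parser the updater actually uses): ebuild file names follow ${PN}-${PV}[-rN].ebuild, so the current version can be recovered from the file name alone.

#!/usr/bin/env bash
# Sketch: extract the package name (PN) and version (PV, including any
# -rN revision) from an ebuild file name such as 7-zip-23.01.ebuild.
parse_ebuild_version() {
  local file="${1##*/}"   # strip directory components
  file="${file%.ebuild}"  # drop the .ebuild suffix
  local pv pn
  # The version starts at the last hyphen that is followed by a digit.
  pv="$(printf '%s\n' "$file" | sed -E 's/^.*-([0-9][^-]*(-r[0-9]+)?)$/\1/')"
  pn="${file%-"$pv"}"
  printf 'PN=%s PV=%s\n' "$pn" "$pv"
}

parse_ebuild_version "app-arch/7-zip/7-zip-23.01.ebuild"  # PN=7-zip PV=23.01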

More difficult ebuilds

Going back to the... more difficult ebuild I mentioned earlier. Sometimes you need to replace things inside of the ebuild itself, or have some other action taken. So, I ended up building a slightly overcomplicated executor system that runs inside of Docker. The container ensures I can run it anywhere, even on a non-Gentoo system, and keeps it from mutating the host. The executor system grants me flexibility through steps defined in the YAML. Let's take the working example of what I had to do to automate the updating of dev-util/mise:

dev-util/mise:
  resolver: git
  options:
    url: https://github.com/jdx/mise

  # We have to regenerate the ebuild to get new crates and licenses to
  # be reflected, so we have to have custom steps.
  steps:
    - checkout: https://github.com/jdx/mise
    - original_ebuild: mise.ebuild
    - command: pycargoebuild -i mise.ebuild
    - ebuild: mise.ebuild

By default, steps is set to just [{ original_ebuild: original.ebuild }, { ebuild: "original.ebuild" }]. This just copies the ebuild, as mentioned earlier, but if you set custom "steps", you can change that logic. Here we can see a few different steps, each with its own logic:

  • checkout - Clones the repository at the latest version's revision (originally implemented as just a command step, like the one shown later)
  • original_ebuild - Takes the ebuild found on the host and copies it into the container at the given path.
  • command - Runs a command in the container.
  • ebuild - Tells the updater to use the provided path as the final output, saving it on the host and regenerating the manifest with it.

Using this system allows me to introduce flexibility into the update process at any step, and it helped me solve tough ebuilds like this one!
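To make that concrete, here's roughly what the executor ends up doing for the dev-util/mise entry above, hand-rolled as a one-off docker run. The image name and version numbers are placeholders for illustration; the real updater drives this programmatically:

# Hypothetical, hand-rolled equivalent of the executor for dev-util/mise
# (image name and versions are made up).
docker run --rm -v "$PWD:/overlay" -w /src ghcr.io/example/updater-executor bash -ec '
  # checkout: clone the repo at the resolved latest version
  git clone --depth 1 --branch v2024.1.0 https://github.com/jdx/mise .
  # original_ebuild: copy the current ebuild into the container
  cp /overlay/dev-util/mise/mise-2023.12.0.ebuild mise.ebuild
  # command: regenerate the crate list and licenses in place
  pycargoebuild -i mise.ebuild
  # ebuild: hand the result back to the host under the new version
  cp mise.ebuild /overlay/dev-util/mise/mise-2024.1.0.ebuild
'
# The ebuild step then regenerates the Manifest for the new file:
#   ebuild mise-2024.1.0.ebuild manifest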

Final boss: Go dependencies

I've been using this system for months now. However, today I ran into a Go project that I've been trying to avoid for a while now. Now, you're probably asking, "What? Jared, avoiding Go? That's all he writes, wtf." Yeah, I really did avoid it. That's because Gentoo ebuilds are actually super annoying to write for Go code bases. While Rust ebuilds pull their dependencies from crates.io, for some reason that approach isn't used for Go packages. Instead, dependency tarballs are created, containing all of the Go modules required to build the project. These then have to be stored somewhere and referenced in your ebuild as a source archive. The key issue is "stored somewhere", which is something my super amazing container system didn't support!

While my system didn't support it, I already have a well-established Gentoo resource(?) server that I store Gentoo-related things on: binhosts and a full package mirror. It's backed by R2, so I decided it was time to add artifact uploading support to the updater! I took the net-vpn/tailscale ebuild that I wanted to update through my tool and crafted a new packages.yml entry for it:

net-vpn/tailscale:
  resolver: git
  options:
    url: https://github.com/tailscale/tailscale

  # We have to generate a Go dependency archive and upload it to a
  # stable location, so we do that during this process.
  steps:
    - checkout: https://github.com/tailscale/tailscale
    - original_ebuild: new.ebuild
    - command: |-
        set -euxo pipefail

        GO_VERSION=$(grep "^go" go.mod | awk '{ print $2 }' | awk -F '.' '{ print $1"."$2}')
        mise use -g golang@"${GO_VERSION}"

        # Create the dependency tar.
        GOMODCACHE="${PWD}"/go-mod go mod download -modcacherw
        tar --create --file deps.tar go-mod
        xz --threads 0 deps.tar

        # Get the shell variables and rewrite the ebuild to contain
        # them.
        eval "$(./build_dist.sh shellvars)"
        sed -i 's/VERSION_MINOR=".*"/VERSION_MINOR="'"${VERSION_MINOR}"'"/' new.ebuild
        sed -i 's/VERSION_SHORT=".*"/VERSION_SHORT="'"${VERSION_SHORT}"'"/' new.ebuild
        sed -i 's/VERSION_LONG=".*"/VERSION_LONG="'"${VERSION_LONG}"'"/' new.ebuild
        sed -i 's/VERSION_GIT_HASH=".*"/VERSION_GIT_HASH="'"${VERSION_GIT_HASH}"'"/' new.ebuild

        # Pin the ebuild's golang dependency to the version from go.mod.
        sed -i 's|dev-lang/golang-.*|dev-lang/golang-'"${GO_VERSION}"'|' new.ebuild
    - upload_artifact: deps.tar.xz
    - ebuild: new.ebuild

This required a few changes:

  • Added mise (a version manager) to the executor Docker image to allow just-in-time build-tool downloads; here it's used to install the Go toolchain the project expects.
  • Added a .updater.yml configuration file to store tool-wide config (the S3 location!)
  • Added an upload_artifact step that takes the provided file and uploads it to a predictable path in the configured S3 bucket (roughly sketched below)
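Under the hood, that upload is essentially an S3 PutObject against R2's S3-compatible endpoint. A rough, hand-rolled equivalent looks like this (the bucket name, path layout, and version are placeholders, not the updater's real config):

# Hypothetical stand-in for the upload_artifact step, using the AWS CLI
# against Cloudflare R2's S3-compatible API. All names are placeholders.
ACCOUNT_ID="<cloudflare-account-id>"
BUCKET="gentoo-artifacts"
PKG="net-vpn/tailscale"
VERSION="1.62.0"

aws s3 cp deps.tar.xz \
  "s3://${BUCKET}/${PKG}/${VERSION}/deps.tar.xz" \
  --endpoint-url "https://${ACCOUNT_ID}.r2.cloudflarestorage.com"

# The ebuild's SRC_URI then points at the public URL for that object.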

With that all added, I was finally able to get Tailscale to automatically update, thus solving all of my current "automatic update problems" 🎉

Retrospection

Whew, that was a lot! Hopefully this was an entertaining read. If you'd like to use my overlay, feel free to do so! You can also check out the complete updater source code if you're curious about what I've done. If you have any questions or suggestions, leave a comment or reach out to me on the issues page!
