Custom Repository for Linux Packages

Timeframe:
1 week (June 2026)
Role:
Architect and sole implementer
Org Context:
Desktop app with ≈25,000 MAU across all major Linux distros
Stack:
Ansible, Docker, Bash, aptly, createrepo_c, Flatpak

Self-hosted single-origin repository that serves one application as deb, rpm, and flatpak with a single always-on service and a strict two-key signing model, deployable to a fresh host with one Ansible command.

1. Context

I have a pet project: an Electron application for Linux that grew over two years to roughly 25,000 monthly active users across all channels. It was distributed mainly as an AppImage. As the userbase grew, some users started to hit the limits of a single-format approach.

The awkward truth about Linux packaging is that there is no truly universal format, despite repeated attempts to create one. Different distribution families handle dependencies differently. That diversity is part of what makes the ecosystem so rich and varied, but it also means broad coverage requires shipping several formats rather than one. To reach most users, you realistically need five: AppImage, DEB, RPM, Snap, and Flatpak.

Two of those five were already well-served. AppImage has no central repository. You distribute it as a downloadable file from your own site, and it carries its own auto-update mechanism. Snap, served through Snapcraft.io, is close to fully automated and needs little human involvement. The gap was the other three. DEB, RPM, and Flatpak had no centralized, auto-updating channel.

The obvious fix is a centralized managed repository, where the real win for users is automatic updates through their native package manager. But the existing hosted options do not always fit. PPA, COPR, and Flathub each come with policy constraints around source-code openness, licensing, or AI usage, and any of those can rule a service out for a given project. Each one also asks you to hand over your signing identity to a third party.

That leaves a concrete question. Can you self-host a centralized repository for DEB, RPM, and Flatpak, covering the gap those three formats leave, without surrendering control of your signing identity or babysitting a fleet of services? This case study is the answer.

2. Constraints and non-goals

3. The decision / approach

The decision reduces to a single fork. If you want object storage plus a CI job that generates the repo plus nginx to serve it, then lightweight static-metadata generators like aptly and createrepo_c are sufficient and often preferable. If you want an internal package platform with governance, meaning multiple repos, controlled promotion, upstream mirroring, and auditability, then Pulp 3 fits better.

The trade-off is operational simplicity versus built-in governance and promotion semantics. For distributing a handful of package versions of one app, the governance machinery is weight you would carry but never use, so the lightweight path wins.

The same logic applies to self-hosting at all rather than reaching for a hosted service. Self-hosting earns its keep when you want to control retention and package history, publish for multiple distributions or release channels from one pipeline, avoid vendor lock-in, or tightly control resource usage. It also wins when you have three ecosystems and one app.

I considered three alternatives before settling on the lightweight stack.

4. Implementation highlights

One note before the details. When I started, I expected Flatpak to require a long-running daemon for its API and build backend. It turned out the whole thing could be a fully static repo served by nginx, and that discovery reshaped the architecture below.

One service, everything else on demand

The whole system rests on a single observation. A deb, rpm, or flatpak repository is just a tree of static files plus some index metadata. Nothing needs to be running to serve it. The tools that build those indexes, namely aptly, createrepo_c, and flatpak/ostree, are one-shot metadata generators. They read the packages on disk, write out index files, and exit. They only need to run when the repo changes, not while it is being served.

So only one process stays up. That is nginx, serving three sibling directory trees over HTTP. Everything else runs on demand, writes into the same tree nginx serves, and exits.

ALWAYS RUNNING                    ON-DEMAND (run, write files, exit)
┌──────────────┐                  ┌────────────────────────────────────┐
│    nginx     │  serves  ◄────── │  aptly         (rebuild apt index) │
│ static files │   reads          │  createrepo_c  (rebuild rpm index) │
│ + ACME chal. │                  │  flatpak       (rebuild ostree)    │
└──────────────┘                  │  certbot       (issue/renew TLS)   │
                                  └────────────────────────────────────┘

This maps directly onto Docker Compose profiles. In docker-compose.yml, only nginx is a plain service. The four on-demand containers carry profiles: ["tools"], which keeps them out of docker compose up entirely. They are built but never started as daemons. Each one runs via docker compose run --rm <svc>, wrapped by the publish and TLS scripts. The rule that keeps the architecture intact is simple: never turn a publisher into a long-running service.
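A minimal sketch of how a publish script might wrap those one-shot runs. The service names are taken from the diagram above. The run_tool and rebuild_all helpers and the COMPOSE override are hypothetical, added only so the wrapper can be dry-run on a machine without Docker.

```sh
# Hypothetical wrapper: run one on-demand tool container to completion.
# `docker compose run --rm` starts the service, waits for it to exit, and
# removes the container, so nothing is left running afterwards.
run_tool() {
    svc=$1
    shift
    # COMPOSE is an assumed override hook (for example COMPOSE=echo for a dry run).
    ${COMPOSE:-docker compose} run --rm "$svc" "$@"
}

# Assumed service names, matching the diagram: aptly, createrepo_c, flatpak.
rebuild_all() {
    for tool in aptly createrepo_c flatpak; do
        run_tool "$tool" || return 1   # stop at the first failed rebuild
    done
}
```

With COMPOSE=echo, rebuild_all prints the three commands it would run instead of starting containers, which is a cheap way to review a change to the publish script.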

TLS fits the same shape. certbot is just another on-demand container. It runs to obtain or renew the Let’s Encrypt certificate via the HTTP-01 challenge (which nginx serves from a shared webroot), then exits. The cert lands in a shared volume nginx reads. There is no persistent ACME agent.

Flatpak is a bit different from the rest. Flatpak’s official multi-build server, flat-manager, is a long-running daemon, and adopting it would have added the one persistent service the whole design avoids. For a single-app push model, plain flatpak build-update-repo is one-shot exactly like createrepo_c, so that is what I used. flat-manager earns its keep only with token-authenticated multi-publisher uploads, build queuing, or delta generation at scale, none of which a single-app channel needs.

The two-key trust model

With the serving model settled, the next problem was trust. There are two distinct signing operations, and conflating them is the common mistake.

               | Package signing                                 | Repository signing
What is signed | The individual .deb / .rpm / flatpak commit     | The repo’s index metadata (Release, repomd.xml, ostree summary)
Answers        | “Who built this package, and is it untampered?” | “Is this the authentic index of the repo?”
Role           | Package maintainer (possibly many)              | Repository owner (exactly one)
Where          | In CI, at build time                            | On the server, when the index is regenerated
Key lives      | With the maintainer / in CI secrets             | On the repo server only, chmod 600

Keeping the keys and steps separate has concrete payoffs. A second package maintainer can be added with their own key without touching the repo’s signing identity. A compromise of a maintainer’s CI key does not compromise the index signature, and the reverse holds too. The discipline is absolute in both directions. Package signing never happens on the server, and index signing never happens in CI.

The two keys are protected differently, because they face different leaks. The maintainer key is passphrase-protected, and the passphrase is held as a separate CI secret injected only at signing time. This does not help if the whole CI secret store is read at once, since both would leak together. What it defends against is partial exposure, where the key blob alone escapes through a build artifact, a log line, an accidental commit, or a stray backup. In those cases the passphrase is still missing and the leaked blob is inert. The repo key cannot benefit from this. It signs the index unattended on the server on every publish, so any passphrase would have to sit on the same host beside it and be readable by the same automation, which is the same blast radius and also breaks unattended ostree summary signing. So it is intentionally passphrase-less and protected instead by filesystem permissions, host hardening, and a separate encrypted backup.
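As an illustration of the maintainer side, here is a sketch of a CI signing step that keeps the key and its passphrase as two separate inputs. MAINTAINER_FPR and MAINTAINER_PASSPHRASE are hypothetical secret names, and the sketch assumes the maintainer private key has already been imported into the CI job's keyring.

```sh
# Sign one artifact with the maintainer key, producing <file>.asc next to it.
# MAINTAINER_FPR and MAINTAINER_PASSPHRASE are hypothetical names for two
# separate CI secrets; the key itself is assumed to be imported already.
sign_artifact() {
    file=$1
    # Fail closed if either secret is missing, rather than let gpg fall back
    # to a default key or prompt for input.
    [ -n "${MAINTAINER_FPR:-}" ] && [ -n "${MAINTAINER_PASSPHRASE:-}" ] || return 1
    [ -f "$file" ] || return 1
    # The passphrase is piped on stdin, keeping it off the gpg command line.
    printf '%s' "$MAINTAINER_PASSPHRASE" |
        gpg --batch --yes --pinentry-mode loopback --passphrase-fd 0 \
            --local-user "$MAINTAINER_FPR" \
            --armor --detach-sign --output "$file.asc" "$file"
}
```

A release job would call sign_artifact once per package and upload each .asc file alongside the package it covers.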

The server is the enforcement point

Client capabilities for per-package signatures differ sharply by format. apt, for instance, does not verify per-package signatures at all. So I enforce the maintainer signature on the server, before anything is indexed. CI emits a detached, armored signature alongside each package. The publish scripts verify it against the maintainer public key and fail closed if it is missing or made by the wrong key. Without this gate, anyone who could write a file into the drop-zone could get arbitrary content signed into the index.

The publisher containers also import the repo private key in order to sign the index, so a plain gpg --verify would happily accept a package signed by the repo key too, which silently defeats the whole split. So verification runs in a throwaway GnuPG home that holds only the maintainer public key, then asserts the signer fingerprint matches it. A related gotcha drives another rule: always select keys by fingerprint, never by email. The two keys may share an email, and an email selector lets gpg grab the wrong one, which then fails unattended index signing in a cryptic way.
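The gate described above could look roughly like the following. This is a sketch under assumptions, not the project's actual script: the function names are hypothetical, fingerprints are assumed to be 40 hex digits (v4 keys), and the signer is read from the last field of GnuPG's VALIDSIG status line, which GnuPG 2.x documents as the primary-key fingerprint.

```sh
# True only for a 40-hex-digit string, so an email or short key id can never
# be used as the selector.
is_fingerprint() {
    case $1 in
        '' | *[!0-9A-Fa-f]*) return 1 ;;
    esac
    [ "${#1}" -eq 40 ]
}

# Read `gpg --status-fd` output on stdin and print the primary-key
# fingerprint from the VALIDSIG line (assumed to be its last field).
signer_fingerprint() {
    awk '$1 == "[GNUPG:]" && $2 == "VALIDSIG" { print $NF; exit }'
}

# verify_maintainer_sig PKG SIG MAINTAINER_PUBKEY EXPECTED_FPR
# The body is a subshell, so its variables never leak into the caller.
verify_maintainer_sig() (
    pkg=$1 sig=$2 pubkey=$3
    is_fingerprint "$4" || exit 1
    expected=$(printf '%s' "$4" | tr 'a-f' 'A-F')
    home=$(mktemp -d) || exit 1
    # The throwaway keyring holds only the maintainer public key, so a
    # signature made by any other key (including the repo key) cannot verify.
    status=$(
        gpg --homedir "$home" --batch --quiet --import "$pubkey" >/dev/null 2>&1 &&
        gpg --homedir "$home" --batch --status-fd 1 --verify "$sig" "$pkg" 2>/dev/null
    ) && rc=0 || rc=$?
    rm -rf "$home"
    [ "$rc" -eq 0 ] || exit 1
    actual=$(printf '%s\n' "$status" | signer_fingerprint)
    [ "$actual" = "$expected" ]
)
```

The publish script would call verify_maintainer_sig for every incoming package and abort the whole run on the first non-zero result, so nothing unverified reaches the indexers.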

Unprivileged upload, privileged publish

CI should be able to push packages without holding a shell, Docker access, or write access to the deploy root. The design splits the two responsibilities cleanly. CI is an unprivileged, SFTP-only, chrooted publish user that can do exactly one thing: drop files into an incoming directory and write a trigger marker. A systemd path unit watching that marker runs the publish pipeline as root. CI never touches the running system directly.

The drop-zone deliberately lives outside the deploy root. sshd refuses to chroot a user unless every parent of the chroot is root-owned, but the deploy root is owned by the deploying user. So the incoming directory gets its own root-owned location rather than bending the deploy root’s ownership to fit.

Because a root-run pipeline is consuming files that an unprivileged user controls, the publish script applies several defensive controls before it reads or writes anything.

There is also an asynchronous-completion obstacle worth calling out, because it is what makes the split practical to operate. The re-index runs out of band, and CI has no shell on the host, but CI still needs to know when publishing actually finished rather than just when the upload landed. So CI stamps a unique run id into the trigger marker, and the pipeline echoes done <id> or failed <id> into a status file in the SFTP chroot. CI polls over SFTP and proceeds only when it sees its own run id, so a stale status from a previous run never matches. That is what lets a post-publish step, such as scaling a pay-as-you-go VM back down, safely wait for the real finish.
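The CI side of that handshake can be sketched as a small polling loop. The function names, return codes, and default retry counts here are hypothetical. The status file is assumed to hold a single line of the form done <id> or failed <id>, and the fetch argument is any command that prints that line, for example a wrapper around an sftp download.

```sh
# Classify one status line against our run id.
# Prints "done", "failed", or "pending" (stale line or nothing written yet).
check_status() {
    case $2 in
        "done $1")   echo done ;;
        "failed $1") echo failed ;;
        *)           echo pending ;;
    esac
}

# wait_for_publish RUN_ID FETCH_CMD [TRIES] [DELAY_SECONDS]
# Returns 0 on "done <id>", 1 on "failed <id>", 2 if it gives up waiting.
wait_for_publish() {
    run_id=$1 fetch=$2 tries=${3:-60} delay=${4:-5}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # A failed fetch is treated the same as "not finished yet".
        line=$($fetch 2>/dev/null || true)
        case $(check_status "$run_id" "$line") in
            done)   return 0 ;;
            failed) return 1 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    return 2
}
```

Because the match is on the full line including the run id, a leftover done line from an earlier release is simply treated as pending, which is the property the paragraph above relies on.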

The CDN cache split

Serving 25,000 monthly users cheaply comes down to one decision. Because everything is a static file, nginx emits ETag and Last-Modified on every response, and a CDN edge can revalidate with a cheap 304. The cache policy splits the tree in two: immutable payloads, which can be cached for a long time, and index metadata, which gets a short TTL.

One path needs different treatment: the flatpak commit’s detached GPG signature, a *.commitmeta file. It lives under the otherwise-immutable objects/ tree, but it is mutable. It is absent until the commit is signed, and rewritten on re-sign or key rotation. Caching it as an immutable payload would let a CDN pin a stale or missing signature across a publish, and clients reject that outright with “GPG verification enabled, but no signatures found”. So a preceding regex location carves it out to the short metadata TTL. Getting that one path wrong is the difference between a CDN that works and one that intermittently breaks installs.

This split is also what makes putting the repo behind a CDN safe. A client never sees a fresh index pointing at packages the edge cannot serve yet, because the index always expires faster than the payloads it references.
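The ordering rule behind that carve-out can be shown with a small classifier. This is a sketch of the decision logic only, not the nginx configuration itself. The example paths are hypothetical, and treating .deb and .rpm files as immutable while everything else falls back to the short metadata TTL is my assumption about where the line sits.

```sh
# Print the cache class for a request path: "immutable" or "metadata".
# The first matching pattern wins, so the commitmeta carve-out must come
# before the general objects/ rule, just like the preceding regex location.
cache_class() {
    case $1 in
        */objects/*.commitmeta)       echo metadata ;;   # mutable signature
        */objects/* | *.deb | *.rpm)  echo immutable ;;  # content never changes
        *)                            echo metadata ;;   # indexes, summaries
    esac
}
```

Swapping the first two patterns would classify *.commitmeta as immutable, which reproduces exactly the failure described above.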

If you want to explore the real implementation, check out the repo with the source code. It is hosted at repo.nechunaev.com.

5. Rollout

The repository was added alongside existing channels, each of which served a different slice of the userbase and got a different upgrade out of it.

Most users were on the AppImage, which already had a working auto-update mechanism, so for them the new repo is not a rescue. It is an additional, more native option. A few thousand users were on deb and rpm packages attached to GitHub releases, downloading and installing them by hand with no update path at all. This is the group the repository changes most, turning a manual re-download every release into a normal apt upgrade or dnf upgrade. A more recent Snap channel on Snapcraft.io served under a thousand users. Flatpak did not exist for this app before the project, so that format is a brand-new channel rather than a migration.

That framing shaped the rollout into addition rather than cutover. Nobody is forced to move. The AppImage and Snap channels keep working, the old GitHub-release packages stay where they are, and the repository simply becomes the recommended path for users who want signed, auto-updating deb and rpm installs, plus the only path for flatpak. The migration that actually matters is the hand-install deb and rpm cohort discovering they no longer have to watch for updates. That migration is opt-in by changing where they install from, not a breaking change pushed at anyone.

Provisioning the host is deliberately a single step. A fresh Debian or Ubuntu machine becomes a live repository with one Ansible playbook run. The playbook lays down the deployed project at /srv/repo and renders the nginx config and landing page from the same .env that drives everything else. There is no multi-stage bring-up to sequence.

Publishing is fully automated from the first release onward. On every release, CI signs each artifact with the maintainer key, SFTPs the packages into the drop-zone, and drops the trigger marker. The server then verifies signatures, regenerates and re-signs the indexes, and rebuilds the landing page. CI polls for its own run id in the status file and reports the release as succeeded or failed only once the server-side re-index has actually finished. So the automated publish is end-to-end from tag to live repo, with no manual step in the happy path.

A package repository has exactly one test that proves it works: a clean-machine install. Before relying on any change that touches the publish path, the nginx config, or the Ansible role, I run a real apt install, dnf install, and flatpak install from all three repos on freshly-provisioned VMs. These are QEMU guests restored from clean-install snapshots, so each test starts from a genuinely pristine state rather than a machine my earlier tests have already contaminated. It is a manual gate today. Automating it is feasible but out of scope for this iteration.

6. Results

7. What I’d do differently

Bind the trust split tighter at the client. For rpm and flatpak, clients import both the maintainer and repo public keys into a single trust set used for both package and index checks. That means a repo-key compromise on the server could be used to forge packages those clients would accept, so the package-versus-index split I rely on is not fully enforceable on the client side. The server-side verification gate, the repo key’s restrictive permissions, host hardening, and an offline encrypted key backup are what bound the blast radius today. If I revisited this, I would dig into per-format mechanisms for keeping those trust sets genuinely separate at the client, and accept that some of it may simply be a constraint of the ecosystems rather than something I can close.

Revisit freezing architectures at first publish. The apt index layout fixes its architecture list at the first publish, which was a fine call for a single app shipping to a known set of targets. But adding an architecture later is more painful than it should be. If I were doing it again I would at least make that decision explicit and reversible from the start, rather than discovering the constraint when I need a new target.

Match the distribution model to the scale, instead of defaulting to self-hosted. This is the big one, and my answer genuinely depends on the numbers.

The thing I would keep in all three cases is the discipline that made this work. That means one source of truth for configuration, fail-closed signature verification, and the cache split, none of which is specific to self-hosting.