CI As it Should Be



Pleasant software makes good decisions by default while allowing you the flexibility to make your own when appropriate. This is the fundamental idea behind abstraction in software.

The shell is the lowest common denominator between your application and the environment (OS). Its pervasiveness makes it a good candidate for a default configuration language.

Design #

Orchestrator #

A service that provides very simple authentication management for users. Users log in via OAuth2 to [[GitLab]] or [[GitHub]], or with a [[Passkey]] (maybe? focus on the first two). The primary purpose of this service is to provide authentication and authorization for connecting executors to one another, so that an executor may skip execution and simply play back another executor's results.

[!question] How do we map executors to a single account? I'm not interested in providing full RBAC, but I think basic "I have a token associated with this repo" should be sufficient.

During execution of a test, executor A may ask the Orchestrator to query the other executors in the connected state for a run matching the revision/hash that executor A is about to test. If a connected executor B has a successful, compatible test run, the Orchestrator proxies or facilitates setting up a peer-to-peer connection between executor A and executor B. Executor B sends the artifacts and stdout/stderr from its test execution to executor A, which replays the result and exits successfully. If no executor can fulfill the role of executor B, either because no other executors are in the connected state or because none has a successful, compatible test run, executor A executes the test normally.
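A minimal sketch of executor A's decision, assuming a hypothetical `query_orchestrator` helper; the orchestrator API and the revision value here are placeholders, not a settled design:

```shell
# Sketch of executor A's cache-lookup flow. query_orchestrator is a
# hypothetical stub: the real thing would ask the Orchestrator for a
# connected executor B with a successful, compatible run for this
# revision, printing that peer's address on a hit (empty = cache miss).
query_orchestrator() {
    printf '%s' ""   # stubbed: always a cache miss
}

rev=deadbeef   # illustrative revision hash
peer=$(query_orchestrator "$rev")

if [ -n "$peer" ]; then
    echo "replaying artifacts and stdout/stderr for $rev from $peer"
else
    echo "no compatible run for $rev; executing tests locally"
fi
```

On a miss the executor falls through to a normal local run, so the orchestrator being unreachable degrades to plain CI rather than a hard failure.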

[!question] A better, simpler design might be a single service per repo, group, or user. The service provides no authentication itself, but relies on externally managed mTLS for authentication. This alleviates some concerns with proxying and e2e encryption, because the users who care enough will have the ability to host their own infrastructure. A SaaS solution that provides namespacing could be a monetization scheme for those who want to move fast and don't have a threat model that dictates e2e encryption.

Executor #

The executor is a wrapper around qemu-system and a particular OS image, set up to provide a consistent environment for executing tests. Not only does it provide setup hooks via mk(1) commands and targets, but it is also responsible for running the image generated by the base-image:V: virtual target and exported as the CONTAINERIMAGE environment variable.

[!question] Should I make the CONTAINERIMAGE environment variable user-configurable? Probably not; I can't think of a reason why you would want to unless you were unlucky enough to have a name collision. Perhaps a more unique name is in order.
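To make the hook targets concrete, here is a hypothetical mkfile sketch. The target names come from this design; every recipe body is an illustrative assumption:

```mk
# Hypothetical mkfile sketch; recipe bodies are placeholders.

# Builds the test container and records its tag in a well-known file.
base-image:V:
	podman build -t ci-base .
	echo ci-base >/var/run/ci/image

# Root-level guest setup, run before the test container starts.
prepare-runtime:V:
	echo nothing to prepare

# The actual test entry point, run inside CONTAINERIMAGE.
test:V:
	echo run your test suite here

# Copies artifact % out for later retrieval (user-supplied rule).
/var/run/ci/artifacts/%:
	cp build/$stem $target

# Post-test hook; does nothing by default.
post-test:V:
	echo done
```

`$stem` and `$target` are mk's built-in recipe variables for the `%` match and the target name, respectively.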

The executor does the following in order:

  1. Boots up the core OS image using qemu-system
  2. Clones the repository at the revision requested by the CI or simply copies the current version of the repo (ignoring dirty working tree!) into the core OS
  3. Executes the mk prepare-runtime target from the root of the repository as the root user of the OS
  4. Runs the container image provided by/bound to CONTAINERIMAGE and assumed to have been generated as part of the base-image target using the podman command, bind-mounting the root of the repository to the working directory of the image, and executing the mk test command
    1. How do I get mk reliably into the base image?
    2. Perhaps create a target, depending on base-image, that echoes CONTAINERIMAGE to a well-known file
    3. Possibly bind-mount /usr/lib/plan9 into the container and add bin/ to PATH
  5. If successful, the /var/run/ci/artifacts/% target is run from the core OS image, copying any artifacts from the successful test execution into a directory where it may be retrieved later (note, the user must specify this rule)
  6. The post-test:V: target is then run. By default this does nothing
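Stitched together, the six steps might look like the following dry-run sketch. `run` only prints each command; the qemu flags, paths, and image names are illustrative assumptions, not the executor's actual invocation:

```shell
#!/bin/sh -e
# Dry-run sketch of the executor flow above. `run` only echoes; a real
# executor would execute these, with steps 3-6 happening inside the guest.
run() { printf '+ %s\n' "$*"; }

OS_IMAGE=core-os.qcow2                      # illustrative core OS image
REV=${REV:-HEAD}                            # revision under test
CONTAINERIMAGE=${CONTAINERIMAGE:-ci-base}   # from the base-image target

# 1. Boot the core OS
run qemu-system-x86_64 -m 2048 -display none -drive "file=$OS_IMAGE"

# 2. Copy the repo at the requested revision into the guest
run git clone . /tmp/repo
run git -C /tmp/repo checkout "$REV"

# 3. Root-level setup from the repo root
run mk prepare-runtime

# 4. Run the tests inside the container image
run podman run --rm -v /tmp/repo:/work -w /work "$CONTAINERIMAGE" mk test

# 5. Collect artifacts on success (rule supplied by the user)
run mk /var/run/ci/artifacts/build

# 6. Post-test hook
run mk post-test
```

Because `sh -e` is in effect, a failure at any step aborts the sequence, which matches step 5 only running after a successful test.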

The executor retains generated artifacts and stdout/stderr for a configurable amount of time, though by default only the most recent run is kept. After executing a test, the executor hangs in the connected state until a SIGINT or SIGTERM is received. This allows another executor to request test results from the runner. Alternatively, the executor may be configured to run in daemon mode, where it runs as a small background process that orchestrates local test execution and may retain more than a single test run of data.

[!question] How should I configure which orchestrator an executor connects to?

[!answer] Probably via the shims or environment variables stored in .env
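For the .env route, that might look like the fragment below; both variable names are placeholders I made up, not a settled schema:

```shell
# Hypothetical .env fragment; variable names are placeholders.
CI_ORCHESTRATOR_URL=https://orchestrator.example.com
CI_ORCHESTRATOR_TOKEN=changeme
```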

Questions #