CI As it Should Be



Pleasant software makes good decisions by default while allowing you the flexibility to make your own when appropriate. This is the fundamental idea behind abstraction in software.

The shell is the lowest common denominator between your application and the environment (OS). Its pervasiveness makes it a good candidate for a default configuration language.

Design #

Orchestrator #

A service that provides very simple authentication management for users. Users log in via OAuth2 to [[GitLab]] or [[GitHub]], or with a [[Passkey]] (maybe? focus on the first two). The primary purpose of this service is to provide authentication and authorization for connecting executors to one another, so that an executor may skip execution and simply play back another executor's results.

[!question] How do we map executors to a single account? I'm not interested in providing full RBAC, but I think basic "I have a token associated with this repo" should be sufficient.

During execution of a test, executor A may ask the Orchestrator to query the other executors in the connected state for a run matching the revision/hash that executor A is about to test. If a connected executor B has a successful, compatible test run, the Orchestrator proxies or facilitates setting up a peer-to-peer connection between executor A and executor B. Executor B sends the artifacts and stdout/stderr from its test execution to executor A, which replays the result and exits successfully. If no executor can fulfill the role of executor B, either because no other executors are in the connected state or because none has a successful, compatible test run, executor A executes the test normally.
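A minimal sketch of executor A's decision, assuming a hypothetical `query_orchestrator` helper; the orchestrator API and the revision value here are placeholders, not a settled design:

```shell
# Sketch of executor A's cache-lookup flow. query_orchestrator is a
# hypothetical stub: the real thing would ask the Orchestrator for a
# connected executor B with a successful, compatible run for this
# revision, printing that peer's address on a hit (empty = cache miss).
query_orchestrator() {
    printf '%s' ""   # stubbed: always a cache miss
}

rev=deadbeef   # illustrative revision hash
peer=$(query_orchestrator "$rev")

if [ -n "$peer" ]; then
    echo "replaying artifacts and stdout/stderr for $rev from $peer"
else
    echo "no compatible run for $rev; executing tests locally"
fi
```

On a miss the executor falls through to a normal local run, so the orchestrator being unreachable degrades to plain CI rather than a hard failure.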

[!question] A better, simpler design might be a single service per repo, group, or user. The service provides no authentication itself, but relies on externally managed mTLS for authentication. This alleviates some concerns with proxying and e2e encryption, because the users who care enough will have the ability to host their own infrastructure. A SaaS solution that provides namespacing could be a monetization scheme for those who want to move fast and don't have a threat model that dictates e2e encryption.

Executor #

The executor is a wrapper around qemu-system and a particular OS image, set up to provide a consistent environment for executing tests. Not only does it provide setup hooks via mk(1) commands and targets, but it is also responsible for running the image generated by the base-image:V: virtual target and exported as the CONTAINERIMAGE environment variable.

[!question] Should I make the CONTAINERIMAGE environment variable user-configurable? Probably not; I can't think of a reason why you would want to unless you were unlucky enough to have a name collision. Perhaps a more unique name is in order.
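To make the hook targets concrete, here is a hypothetical mkfile sketch. The target names come from this design; every recipe body is an illustrative assumption:

```mk
# Hypothetical mkfile sketch; recipe bodies are placeholders.

# Builds the test container and records its tag in a well-known file.
base-image:V:
	podman build -t ci-base .
	echo ci-base >/var/run/ci/image

# Root-level guest setup, run before the test container starts.
prepare-runtime:V:
	echo nothing to prepare

# The actual test entry point, run inside CONTAINERIMAGE.
test:V:
	echo run your test suite here

# Copies artifact % out for later retrieval (user-supplied rule).
/var/run/ci/artifacts/%:
	cp build/$stem $target

# Post-test hook; does nothing by default.
post-test:V:
	echo done
```

`$stem` and `$target` are mk's built-in recipe variables for the `%` match and the target name, respectively.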

The executor does the following in order:

  1. Boots up the core OS image using qemu-system
  2. Clones the repository at the revision requested by the CI or simply copies the current version of the repo (ignoring dirty working tree!) into the core OS
  3. Executes the mk prepare-runtime target from the root of the repository as the root user of the OS
  4. Runs the container image provided by/bound to CONTAINERIMAGE and assumed to have been generated as part of the base-image target using the podman command, bind-mounting the root of the repository to the working directory of the image, and executing the mk test command
    1. How do I get mk reliably into the base image?
    2. Perhaps create a target, depending on base-image, that echoes CONTAINERIMAGE to a well-known file
    3. Possibly bind-mount /usr/lib/plan9 into the container and add bin/ to PATH
  5. If successful, the /var/run/ci/artifacts/% target is run from the core OS image, copying any artifacts from the successful test execution into a directory where it may be retrieved later (note, the user must specify this rule)
  6. The post-test:V: target is then run. By default this does nothing
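Stitched together, the six steps might look like the following dry-run sketch. `run` only prints each command; the qemu flags, paths, and image names are illustrative assumptions, not the executor's actual invocation:

```shell
#!/bin/sh -e
# Dry-run sketch of the executor flow above. `run` only echoes; a real
# executor would execute these, with steps 3-6 happening inside the guest.
run() { printf '+ %s\n' "$*"; }

OS_IMAGE=core-os.qcow2                      # illustrative core OS image
REV=${REV:-HEAD}                            # revision under test
CONTAINERIMAGE=${CONTAINERIMAGE:-ci-base}   # from the base-image target

# 1. Boot the core OS
run qemu-system-x86_64 -m 2048 -display none -drive "file=$OS_IMAGE"

# 2. Copy the repo at the requested revision into the guest
run git clone . /tmp/repo
run git -C /tmp/repo checkout "$REV"

# 3. Root-level setup from the repo root
run mk prepare-runtime

# 4. Run the tests inside the container image
run podman run --rm -v /tmp/repo:/work -w /work "$CONTAINERIMAGE" mk test

# 5. Collect artifacts on success (rule supplied by the user)
run mk /var/run/ci/artifacts/build

# 6. Post-test hook
run mk post-test
```

Because `sh -e` is in effect, a failure at any step aborts the sequence, which matches step 5 only running after a successful test.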

The executor retains generated artifacts and stdout/stderr for a configurable amount of time, though by default only the most recent run is kept. After executing a test, the executor hangs in the connected state until a SIGINT or SIGTERM is received. This allows another executor to request test results from the runner. Alternatively, the executor may be configured to run in daemon mode, where it runs as a small background process that orchestrates local test execution and may retain more than a single test run of data.

[!question] How should I configure which orchestrator an executor connects to?

[!answer] Probably via the shims or environment variables stored in .env
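For the .env route, that might look like the fragment below; both variable names are placeholders I made up, not a settled schema:

```shell
# Hypothetical .env fragment; variable names are placeholders.
CI_ORCHESTRATOR_URL=https://orchestrator.example.com
CI_ORCHESTRATOR_TOKEN=changeme
```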

Questions #