docs: add full command reference; fix module path and KernelURL config
- Add docs/commands.md with per-command purpose, step-by-step shell/SDK call sequences, config tables, outputs, and error conditions
- Rename module from github.com/you/fc-orchestrator to github.com/kacerr/fc-orchestrator
- Add KernelURL field to Config so the download URL is configurable via FC_KERNEL_URL instead of being hardcoded in Init()
- Expose FC_KERNEL_URL in the usage string
- Add verbose logging of dd/mkfs.ext4/mount/tar calls in buildRootfs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
770
docs/commands.md
Normal file
@@ -0,0 +1,770 @@
|
||||
# fc-orch Command Reference
|
||||
|
||||
`fc-orch` is a Firecracker microVM snapshot orchestrator. It creates a single "golden" VM, snapshots it at a stable boot state, then rapidly restores arbitrarily many clones from that snapshot. Clones share the golden memory file via Linux kernel MAP_PRIVATE copy-on-write, so the incremental cost of each clone is only its dirty pages and a private copy of the VM state file.
|
||||
|
||||
All commands require root privileges (`sudo`).
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Global Configuration](#global-configuration)
|
||||
- [Directory Layout](#directory-layout)
|
||||
- [Network Topology](#network-topology)
|
||||
- [`init`](#init)
|
||||
- [`golden`](#golden)
|
||||
- [`spawn`](#spawn)
|
||||
- [`status`](#status)
|
||||
- [`kill`](#kill)
|
||||
- [`cleanup`](#cleanup)
|
||||
|
||||
---
|
||||
|
||||
## Global Configuration
|
||||
|
||||
All tunables are set via environment variables. Every variable has a default, so none needs to be set unless you want non-default behavior.
|
||||
|
||||
| Variable | Default | Description |
|---|---|---|
| `FC_BIN` | `firecracker` | Path or name of the Firecracker binary (resolved via `$PATH`) |
| `FC_BASE_DIR` | `/tmp/fc-orch` | Root working directory for all state |
| `FC_KERNEL` | `$FC_BASE_DIR/vmlinux` | Path to the kernel image |
| `FC_KERNEL_URL` | Pinned Firecracker CI build (vmlinux-6.1.166) | URL to download the kernel if `FC_KERNEL` is missing |
| `FC_ROOTFS` | `$FC_BASE_DIR/rootfs.ext4` | Path to the base ext4 rootfs image |
| `FC_VCPUS` | `1` | Number of vCPUs per VM |
| `FC_MEM_MIB` | `128` | Memory per VM in MiB |
| `FC_BRIDGE` | `fcbr0` | Host bridge name. Set to `none` to disable all networking |
| `FC_BRIDGE_CIDR` | `172.30.0.1/24` | IP address and prefix assigned to the host bridge |
| `FC_GUEST_PREFIX` | `172.30.0` | IP prefix for guest address allocation |
| `FC_GUEST_GW` | `172.30.0.1` | Default gateway advertised to guests |
|
||||
|
||||
Kernel boot arguments are hardcoded and not user-configurable:
|
||||
|
||||
```
|
||||
console=ttyS0 reboot=k panic=1 pci=off i8042.noaux quiet loglevel=0
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Directory Layout
|
||||
|
||||
After running all commands, `$FC_BASE_DIR` (`/tmp/fc-orch` by default) contains:
|
||||
|
||||
```
|
||||
/tmp/fc-orch/
|
||||
├── vmlinux # kernel image (shared, immutable)
|
||||
├── rootfs.ext4 # base Alpine rootfs (shared, immutable)
|
||||
├── golden/
|
||||
│ ├── api.sock # Firecracker API socket (golden VM, transient)
|
||||
│ ├── rootfs.ext4 # COW copy of base rootfs used by golden VM
|
||||
│ ├── mem # memory snapshot (read by all clones, never written)
|
||||
│ └── vmstate # VM state snapshot (golden reference)
|
||||
├── clones/
|
||||
│ ├── 1/
|
||||
│ │ ├── api.sock # Firecracker API socket (clone 1)
|
||||
│ │ ├── rootfs.ext4 # private COW copy of golden rootfs
|
||||
│ │ └── vmstate # private copy of golden vmstate
|
||||
│ ├── 2/
|
||||
│ │ └── ...
|
||||
│ └── N/
|
||||
│ └── ...
|
||||
└── pids/
|
||||
├── golden.pid # PID of golden Firecracker process (transient)
|
||||
├── clone-1.pid
|
||||
├── clone-2.pid
|
||||
└── ...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Network Topology
|
||||
|
||||
When `FC_BRIDGE` is not `none` (the default), a Linux bridge and per-VM TAP devices are used:
|
||||
|
||||
```
|
||||
Internet
|
||||
│
|
||||
Host NIC (e.g. eth0)
|
||||
│
|
||||
iptables NAT MASQUERADE
|
||||
│
|
||||
Bridge: fcbr0 (172.30.0.1/24)
|
||||
├── fctap0 (golden VM — exists only during golden command)
|
||||
├── fctap1 (clone 1)
|
||||
├── fctap2 (clone 2)
|
||||
└── fctapN (clone N)
|
||||
```
|
||||
|
||||
Each clone receives a unique TAP device and MAC address (`AA:FC:00:00:XX:XX`). IP assignment inside the guest is the guest OS's responsibility (the rootfs init script only brings `eth0` up; no DHCP server is included).
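The two trailing octets of the MAC address carry the clone ID, high byte first. A minimal sketch of that derivation; `macForClone` is a hypothetical name and not necessarily the function used in the project.

```go
package main

import "fmt"

// macForClone places the high and low bytes of the clone ID in the last two
// octets of the AA:FC:00:00 prefix (hypothetical helper name).
func macForClone(id int) string {
	return fmt.Sprintf("AA:FC:00:00:%02X:%02X", (id>>8)&0xFF, id&0xFF)
}

func main() {
	for _, id := range []int{1, 2, 256} {
		fmt.Printf("clone %d -> %s\n", id, macForClone(id))
	}
}
```

With only two octets available, IDs above 65535 would wrap around and collide with earlier addresses.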
|
||||
|
||||
Set `FC_BRIDGE=none` to skip all network configuration. VMs will boot without a network interface.
|
||||
|
||||
---
|
||||
|
||||
## `init`
|
||||
|
||||
### Purpose
|
||||
|
||||
Downloads the Linux kernel image and builds a minimal Alpine Linux ext4 rootfs. This command only needs to run once; both artifacts are reused by all subsequent `golden` invocations. `init` is idempotent — it skips any artifact that already exists on disk.
|
||||
|
||||
### Usage
|
||||
|
||||
```sh
|
||||
sudo ./fc-orch init
|
||||
```
|
||||
|
||||
Optional overrides:
|
||||
|
||||
```sh
|
||||
sudo FC_KERNEL_URL=https://example.com/vmlinux FC_BASE_DIR=/data/fc ./fc-orch init
|
||||
```
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- `dd`, `mkfs.ext4` (e2fsprogs), `mount`/`umount` (util-linux), `tar` must be in `$PATH`
|
||||
- Internet access to download kernel and Alpine tarball
|
||||
- Root privileges (required for `mount`)
|
||||
|
||||
### Step-by-step execution
|
||||
|
||||
1. **Create base directory**
|
||||
|
||||
```sh
|
||||
mkdir -p /tmp/fc-orch
|
||||
```
|
||||
|
||||
2. **Download kernel** (skipped if `/tmp/fc-orch/vmlinux` already exists)
|
||||
|
||||
```
|
||||
GET https://s3.amazonaws.com/spec.ccfc.min/firecracker-ci/20260408-ce2a467895c1-0/x86_64/vmlinux-6.1.166
|
||||
→ /tmp/fc-orch/vmlinux
|
||||
```
|
||||
|
||||
3. **Create empty image file** (skipped if `/tmp/fc-orch/rootfs.ext4` already exists)
|
||||
|
||||
```sh
|
||||
dd if=/dev/zero of=/tmp/fc-orch/rootfs.ext4 bs=1M count=512 status=none
|
||||
```
|
||||
|
||||
4. **Format as ext4**
|
||||
|
||||
```sh
|
||||
mkfs.ext4 -qF /tmp/fc-orch/rootfs.ext4
|
||||
```
|
||||
|
||||
5. **Mount the image**
|
||||
|
||||
```sh
|
||||
mkdir -p /tmp/fc-orch/mnt
|
||||
mount -o loop /tmp/fc-orch/rootfs.ext4 /tmp/fc-orch/mnt
|
||||
```
|
||||
|
||||
6. **Download Alpine minirootfs tarball**
|
||||
|
||||
```
|
||||
GET https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-minirootfs-3.20.0-x86_64.tar.gz
|
||||
→ /tmp/fc-orch/alpine-minirootfs-3.20.0-x86_64.tar.gz
|
||||
```
|
||||
|
||||
7. **Extract Alpine into the mounted image**
|
||||
|
||||
```sh
|
||||
tar xzf /tmp/fc-orch/alpine-minirootfs-3.20.0-x86_64.tar.gz -C /tmp/fc-orch/mnt
|
||||
```
|
||||
|
||||
8. **Write `/etc/init.d/rcS`** inside the mounted image
|
||||
|
||||
```sh
|
||||
#!/bin/sh
|
||||
mount -t proc proc /proc
|
||||
mount -t sysfs sys /sys
|
||||
mount -t devtmpfs devtmpfs /dev
|
||||
ip link set eth0 up 2>/dev/null
|
||||
```
|
||||
|
||||
9. **Write `/etc/inittab`** inside the mounted image
|
||||
|
||||
```
|
||||
::sysinit:/etc/init.d/rcS
|
||||
ttyS0::respawn:/bin/sh
|
||||
```
|
||||
|
||||
This causes the guest to launch a shell on the serial console (`ttyS0`) and respawn it if it exits.
|
||||
|
||||
10. **Unmount the image**
|
||||
|
||||
```sh
|
||||
umount /tmp/fc-orch/mnt
|
||||
```
|
||||
|
||||
### Outputs
|
||||
|
||||
| Path | Description |
|---|---|
| `/tmp/fc-orch/vmlinux` | Linux kernel image for Firecracker |
| `/tmp/fc-orch/rootfs.ext4` | 512 MiB Alpine Linux ext4 image |
|
||||
|
||||
### Error conditions
|
||||
|
||||
| Error | Cause | Resolution |
|---|---|---|
| `download kernel: ...` | Network failure or bad `FC_KERNEL_URL` | Check connectivity; verify the URL |
| `download alpine: ...` | Network failure downloading Alpine tarball | Check connectivity |
| `build rootfs: ...` | `dd`, `mkfs.ext4`, `mount`, or `tar` failed | Ensure the tools are installed and you are running as root |
|
||||
|
||||
---
|
||||
|
||||
## `golden`
|
||||
|
||||
### Purpose
|
||||
|
||||
Boots a fresh Firecracker VM from the base artifacts produced by `init`, waits 3 seconds for the guest to finish its init sequence, pauses the VM, and takes a snapshot of the entire machine state (memory + VM state). The snapshot artifacts are the input to every `spawn` invocation. The golden VM process is terminated after snapshotting — only the artifacts on disk are kept.
|
||||
|
||||
This command always recreates the golden directory from scratch, discarding any previous snapshot.
|
||||
|
||||
### Usage
|
||||
|
||||
```sh
|
||||
sudo ./fc-orch golden
|
||||
```
|
||||
|
||||
Optional overrides:
|
||||
|
||||
```sh
|
||||
sudo FC_MEM_MIB=256 FC_VCPUS=2 ./fc-orch golden
|
||||
```
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- `init` must have been run (kernel and rootfs must exist)
|
||||
- `firecracker` binary must be in `$PATH` (or set via `FC_BIN`)
|
||||
- `ip`, `iptables`, `sysctl` must be in `$PATH` (when networking is enabled)
|
||||
|
||||
### Step-by-step execution
|
||||
|
||||
1. **Verify prerequisites**
|
||||
|
||||
Checks that `FC_KERNEL` and `FC_ROOTFS` exist. Exits with an error if either is missing and directs the user to run `init`.
|
||||
|
||||
2. **Recreate golden directory**
|
||||
|
||||
```sh
|
||||
rm -rf /tmp/fc-orch/golden
|
||||
mkdir -p /tmp/fc-orch/golden /tmp/fc-orch/pids
|
||||
```
|
||||
|
||||
3. **COW copy of base rootfs**
|
||||
|
||||
```sh
|
||||
cp --reflink=always /tmp/fc-orch/rootfs.ext4 /tmp/fc-orch/golden/rootfs.ext4
|
||||
```
|
||||
|
||||
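`cp --reflink=always` fails outright on filesystems without reflink support (e.g. ext4), so in that case the orchestrator falls back to a regular byte-for-byte copy via `io.Copy`. On filesystems that support reflinks (btrfs, or XFS formatted with reflink enabled), the copy is near-instant and consumes no additional space until the VM writes to the disk.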
|
||||
|
||||
4. **Network setup** (skipped when `FC_BRIDGE=none`)
|
||||
|
||||
a. Create bridge (idempotent — skipped if `fcbr0` already exists):
|
||||
|
||||
```sh
|
||||
ip link add fcbr0 type bridge
|
||||
ip addr add 172.30.0.1/24 dev fcbr0
|
||||
ip link set fcbr0 up
|
||||
```
|
||||
|
||||
b. Enable IP forwarding and NAT:
|
||||
|
||||
```sh
|
||||
ip -4 route show default # detect egress interface, e.g. "eth0"
|
||||
sysctl -qw net.ipv4.ip_forward=1
|
||||
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
|
||||
```
|
||||
|
||||
c. Create and attach the golden TAP device:
|
||||
|
||||
```sh
|
||||
ip tuntap add dev fctap0 mode tap
|
||||
ip link set fctap0 up
|
||||
ip link set fctap0 master fcbr0
|
||||
```
|
||||
|
||||
5. **Build Firecracker machine configuration** (passed to the SDK in memory):
|
||||
|
||||
```
|
||||
SocketPath: /tmp/fc-orch/golden/api.sock
|
||||
KernelImagePath: /tmp/fc-orch/vmlinux
|
||||
KernelArgs: console=ttyS0 reboot=k panic=1 pci=off i8042.noaux quiet loglevel=0
|
||||
MachineCfg:
|
||||
VcpuCount: 1
|
||||
MemSizeMib: 128
|
||||
TrackDirtyPages: true ← required for snapshot support
|
||||
Drives:
|
||||
- DriveID: rootfs
|
||||
PathOnHost: /tmp/fc-orch/golden/rootfs.ext4
|
||||
IsRootDevice: true
|
||||
IsReadOnly: false
|
||||
NetworkInterfaces:
|
||||
- MacAddress: AA:FC:00:00:00:01
|
||||
HostDevName: fctap0
|
||||
```
|
||||
|
||||
6. **Launch Firecracker process**
|
||||
|
||||
The Firecracker Go SDK spawns:
|
||||
|
||||
```sh
|
||||
firecracker --api-sock /tmp/fc-orch/golden/api.sock
|
||||
```
|
||||
|
||||
The SDK then applies the machine configuration via HTTP calls to the Firecracker API socket.
|
||||
|
||||
7. **Boot the VM**
|
||||
|
||||
```go
|
||||
m.Start(ctx) // SDK call — PUT /actions {"action_type": "InstanceStart"}
|
||||
```
|
||||
|
||||
The golden VM PID is written to `/tmp/fc-orch/pids/golden.pid`.
|
||||
|
||||
8. **Wait for guest init to settle**
|
||||
|
||||
```go
|
||||
time.Sleep(3 * time.Second)
|
||||
```
|
||||
|
||||
This is a fixed delay, not a readiness check. The guest's `rcS` script only mounts pseudo-filesystems and brings up `eth0`, so 3 seconds should be ample for the Alpine init sequence to complete on a typical host; a heavily loaded host may need a longer delay.
|
||||
|
||||
9. **Pause the VM**
|
||||
|
||||
```go
|
||||
m.PauseVM(ctx) // SDK call — PATCH /vm {"state": "Paused"}
|
||||
```
|
||||
|
||||
The VM's vCPUs are frozen. No guest code runs after this point.
|
||||
|
||||
10. **Create snapshot**
|
||||
|
||||
```go
|
||||
m.CreateSnapshot(ctx,
|
||||
"/tmp/fc-orch/golden/mem",
|
||||
"/tmp/fc-orch/golden/vmstate",
|
||||
)
|
||||
// SDK call — PUT /snapshot/create
|
||||
// {
|
||||
// "mem_file_path": "/tmp/fc-orch/golden/mem",
|
||||
// "snapshot_path": "/tmp/fc-orch/golden/vmstate",
|
||||
// "snapshot_type": "Full"
|
||||
// }
|
||||
```
|
||||
|
||||
- `mem`: full dump of guest physical memory (~128 MiB). All clones map this file read-only; the kernel's MAP_PRIVATE gives each clone copy-on-write semantics over it.
|
||||
- `vmstate`: serialized vCPU and device state (typically a few hundred KiB). Each clone gets its own copy.
|
||||
|
||||
Sizes of both files are logged.
|
||||
|
||||
11. **Terminate golden VM**
|
||||
|
||||
```go
|
||||
m.StopVMM() // SDK call: sends SIGTERM to the Firecracker process; no API request is made
|
||||
```
|
||||
|
||||
12. **Destroy golden TAP device**
|
||||
|
||||
```sh
|
||||
ip link del fctap0
|
||||
```
|
||||
|
||||
### Outputs
|
||||
|
||||
| Path | Description |
|---|---|
| `/tmp/fc-orch/golden/mem` | Full memory snapshot (~`FC_MEM_MIB` MiB) |
| `/tmp/fc-orch/golden/vmstate` | VM state snapshot (vCPU registers, device state) |
| `/tmp/fc-orch/golden/rootfs.ext4` | COW copy of base rootfs; `spawn` copies it to create each clone's rootfs |
|
||||
|
||||
### Error conditions
|
||||
|
||||
| Error | Cause | Resolution |
|---|---|---|
| `kernel not found — run init first` | `FC_KERNEL` path does not exist | Run `init` first |
| `rootfs not found — run init first` | `FC_ROOTFS` path does not exist | Run `init` first |
| `firecracker binary not found` | `FC_BIN` not in `$PATH` | Install Firecracker or set `FC_BIN` |
| `create bridge: ...` | `ip link add` failed | Check if another bridge with the same name exists with incompatible config |
| `start golden VM: ...` | Firecracker failed to boot | Check Firecracker logs; verify kernel and rootfs are valid |
| `pause VM: ...` | VM did not reach a pauseable state in 3s | Increase settle time in source or investigate guest crash via serial console |
| `create snapshot: ...` | Snapshot write failed | Check disk space in `FC_BASE_DIR` |
|
||||
|
||||
---
|
||||
|
||||
## `spawn`
|
||||
|
||||
### Purpose
|
||||
|
||||
Restores one or more VM clones from the golden snapshot. Each clone is an independent Firecracker process that resumes execution from exactly the paused state captured by `golden`. Clones differ only in their TAP device, MAC address, rootfs COW layer, and vmstate file; they all map the same `golden/mem` file.
|
||||
|
||||
Clone IDs are auto-incremented: if clones 1–3 already exist, the next `spawn 2` creates clones 4 and 5. Spawn can be called multiple times to add more clones incrementally.
|
||||
|
||||
### Usage
|
||||
|
||||
```sh
|
||||
sudo ./fc-orch spawn # spawn 1 clone (default)
|
||||
sudo ./fc-orch spawn 10 # spawn 10 clones
|
||||
```
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- `golden` must have been run (`golden/vmstate` and `golden/mem` must exist)
|
||||
- `firecracker` binary must be available
|
||||
- Sufficient disk space for per-clone rootfs copies (each is a full copy of the 512 MiB golden rootfs unless the filesystem supports reflinks, as btrfs and reflink-enabled XFS do)
|
||||
|
||||
### Step-by-step execution (per clone)
|
||||
|
||||
The following steps are performed once for each requested clone. Let `{id}` be the auto-assigned clone number (e.g. `1`, `2`, ...).
|
||||
|
||||
1. **Verify golden artifacts exist**
|
||||
|
||||
Checks for both `/tmp/fc-orch/golden/vmstate` and `/tmp/fc-orch/golden/mem`. Exits with an error if either is missing.
|
||||
|
||||
2. **Create directories**
|
||||
|
||||
```sh
|
||||
mkdir -p /tmp/fc-orch/clones /tmp/fc-orch/pids
|
||||
mkdir -p /tmp/fc-orch/clones/{id}
|
||||
```
|
||||
|
||||
3. **Set up bridge** (idempotent; skipped if the bridge already exists or `FC_BRIDGE=none`)
|
||||
|
||||
Same sequence as steps 4a and 4b of `golden`. No-op if `fcbr0` is already up.
|
||||
|
||||
4. **COW copy of golden rootfs**
|
||||
|
||||
```sh
|
||||
cp --reflink=always /tmp/fc-orch/golden/rootfs.ext4 /tmp/fc-orch/clones/{id}/rootfs.ext4
|
||||
```
|
||||
|
||||
Falls back to a full copy if reflinks are unsupported.
|
||||
|
||||
5. **Shared memory reference** (no copy)
|
||||
|
||||
The clone's Firecracker config will point directly at `/tmp/fc-orch/golden/mem`. No file operation is needed here — the kernel's MAP_PRIVATE ensures each clone's writes are private.
|
||||
|
||||
6. **Copy vmstate**
|
||||
|
||||
```sh
|
||||
# implemented as io.Copy in Go
|
||||
cp /tmp/fc-orch/golden/vmstate /tmp/fc-orch/clones/{id}/vmstate
|
||||
```
|
||||
|
||||
The vmstate file is small (typically < 1 MiB), so a full copy is cheap.
|
||||
|
||||
7. **Create and attach TAP device** (skipped when `FC_BRIDGE=none`)
|
||||
|
||||
```sh
|
||||
ip tuntap add dev fctap{id} mode tap
|
||||
ip link set fctap{id} up
|
||||
ip link set fctap{id} master fcbr0
|
||||
```
|
||||
|
||||
MAC address is derived from the clone ID:
|
||||
|
||||
```
|
||||
AA:FC:00:00:{id/256 in hex}:{id%256 in hex}
|
||||
# e.g. clone 1 → AA:FC:00:00:00:01
|
||||
# clone 2 → AA:FC:00:00:00:02
|
||||
# clone 256 → AA:FC:00:00:01:00
|
||||
```
|
||||
|
||||
8. **Build Firecracker snapshot-restore configuration** (in memory):
|
||||
|
||||
```
|
||||
SocketPath: /tmp/fc-orch/clones/{id}/api.sock
|
||||
MachineCfg:
|
||||
VcpuCount: 1
|
||||
MemSizeMib: 128
|
||||
NetworkInterfaces:
|
||||
- MacAddress: AA:FC:00:00:{id/256:02X}:{id%256:02X}
|
||||
HostDevName: fctap{id}
|
||||
Snapshot:
|
||||
MemFilePath: /tmp/fc-orch/golden/mem ← shared, read-only mapping
|
||||
SnapshotPath: /tmp/fc-orch/clones/{id}/vmstate
|
||||
ResumeVM: true ← restore instead of fresh boot
|
||||
```
|
||||
|
||||
Note: `KernelImagePath` and `Drives` are omitted when restoring from a snapshot — Firecracker uses the snapshot state instead.
|
||||
|
||||
9. **Launch Firecracker process**
|
||||
|
||||
```sh
|
||||
firecracker --api-sock /tmp/fc-orch/clones/{id}/api.sock
|
||||
```
|
||||
|
||||
10. **Restore and resume VM**
|
||||
|
||||
```go
|
||||
m.Start(ctx)
|
||||
// SDK call: PUT /snapshot/load
|
||||
// {
|
||||
// "mem_file_path": "/tmp/fc-orch/golden/mem",
|
||||
// "snapshot_path": "/tmp/fc-orch/clones/{id}/vmstate",
|
||||
// "resume_vm": true
|
||||
// }
|
||||
```
|
||||
|
||||
Restoration time (from `m.Start` call to return) is measured and logged.
|
||||
|
||||
11. **Record PID**
|
||||
|
||||
```sh
|
||||
echo {pid} > /tmp/fc-orch/pids/clone-{id}.pid
|
||||
```
|
||||
|
||||
12. **Register clone in memory**
|
||||
|
||||
The running clone is tracked in an in-process map keyed by clone ID, holding the Firecracker SDK handle, context cancel function, and TAP device name. This allows `kill` to cleanly terminate clones started in the same process invocation.
|
||||
|
||||
After all clones are spawned, `Status()` is called automatically to display the running clone table.
|
||||
|
||||
### Outputs
|
||||
|
||||
For each clone `{id}`:
|
||||
|
||||
| Path | Description |
|---|---|
| `/tmp/fc-orch/clones/{id}/rootfs.ext4` | Private COW copy of the golden rootfs |
| `/tmp/fc-orch/clones/{id}/vmstate` | Private copy of golden vmstate |
| `/tmp/fc-orch/clones/{id}/api.sock` | Firecracker API socket (live while clone is running) |
| `/tmp/fc-orch/pids/clone-{id}.pid` | PID of this clone's Firecracker process |
| `fctap{id}` | Host TAP network device attached to `fcbr0` |
|
||||
|
||||
### Error conditions
|
||||
|
||||
| Error | Cause | Resolution |
|---|---|---|
| `golden vmstate not found — run golden first` | `golden` has not been run | Run `golden` first |
| `golden mem not found — run golden first` | Same as above | Run `golden` first |
| `firecracker not found` | Binary missing | Install Firecracker or set `FC_BIN` |
| `copy rootfs: ...` | Disk full or source missing | Check disk space; re-run `golden` |
| `restore clone {id}: ...` | Firecracker failed to load snapshot | Check that `golden/mem` is not corrupted; re-run `golden` |
| `create tap {name}: ...` | TAP device already exists or `ip` failed | Run `kill` to clean up stale TAPs, then retry |
| Individual clone failure | Per-clone errors are logged but do not abort the batch | Check logs; surviving clones continue running |
|
||||
|
||||
---
|
||||
|
||||
## `status`
|
||||
|
||||
### Purpose
|
||||
|
||||
Displays a table of all clones that have been spawned, along with their PIDs and liveness. Liveness is determined by checking whether `/proc/{pid}` exists — a process that has exited will no longer appear in `/proc`.
|
||||
|
||||
This command does not require the clones to have been started in the current process invocation; it reads PID files written to disk by `spawn`.
|
||||
|
||||
### Usage
|
||||
|
||||
```sh
|
||||
sudo ./fc-orch status
|
||||
```
|
||||
|
||||
### Prerequisites
|
||||
|
||||
None. Can be run at any time, even with no clones running.
|
||||
|
||||
### Step-by-step execution
|
||||
|
||||
1. **Read PID directory**
|
||||
|
||||
Lists all files in `/tmp/fc-orch/pids/`.
|
||||
|
||||
2. **Filter for clone PID files**
|
||||
|
||||
Only files whose names start with `clone-` are considered (excludes `golden.pid`).
|
||||
|
||||
3. **Check liveness**
|
||||
|
||||
For each file:
|
||||
|
||||
```sh
|
||||
# read pid from clone-{id}.pid
|
||||
test -d /proc/{pid} # alive if directory exists
|
||||
```
|
||||
|
||||
4. **Print table**
|
||||
|
||||
```
|
||||
=== Running clones ===
|
||||
clone-1 pid=12345 alive
|
||||
clone-2 pid=12346 alive
|
||||
clone-3 pid=12347 DEAD
|
||||
```
|
||||
|
||||
### Outputs
|
||||
|
||||
Prints to stdout only. No files are modified.
|
||||
|
||||
---
|
||||
|
||||
## `kill`
|
||||
|
||||
### Purpose
|
||||
|
||||
Terminates all running Firecracker VM processes and removes all TAP devices. Handles two cases:
|
||||
|
||||
- **In-memory clones**: clones started in the same `fc-orch` process invocation, tracked via the in-process clone map.
|
||||
- **Orphaned clones**: clones from a previous `fc-orch spawn` invocation, whose PIDs are recorded in PID files.
|
||||
|
||||
Both cases are always handled, so `kill` works correctly even after a restart or crash of the orchestrator process.
|
||||
|
||||
The command does **not** remove snapshot files or the `clones/` directory — use `cleanup` for full teardown.
|
||||
|
||||
### Usage
|
||||
|
||||
```sh
|
||||
sudo ./fc-orch kill
|
||||
```
|
||||
|
||||
### Prerequisites
|
||||
|
||||
None. Safe to run even if no VMs are running.
|
||||
|
||||
### Step-by-step execution
|
||||
|
||||
1. **Stop in-memory clones** (clones started in this process invocation)
|
||||
|
||||
For each clone tracked in the in-process map:
|
||||
|
||||
```go
|
||||
clone.Machine.StopVMM() // SDK: sends SIGTERM to the Firecracker process
|
||||
clone.Cancel() // cancels the clone's context
|
||||
```
|
||||
|
||||
```sh
|
||||
ip link del fctap{id}
|
||||
```
|
||||
|
||||
The clone is removed from the in-memory map.
|
||||
|
||||
2. **Kill orphaned processes from PID files**
|
||||
|
||||
For each file in `/tmp/fc-orch/pids/`:
|
||||
|
||||
```sh
# equivalent to:
kill -9 {pid}
```
|
||||
|
||||
The PID file is then deleted:
|
||||
|
||||
```sh
|
||||
rm /tmp/fc-orch/pids/{file}
|
||||
```
|
||||
|
||||
3. **Destroy stale TAP devices**
|
||||
|
||||
```sh
|
||||
ip -o link show
|
||||
```
|
||||
|
||||
Each line containing `fctap` is parsed to extract the device name, then:
|
||||
|
||||
```sh
|
||||
ip link del {tapname}
|
||||
```
|
||||
|
||||
This cleans up any TAP devices left over from previous runs that were not removed by steps 1 or 2.
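Extracting the device names in step 3 can be sketched as below. `tapNames` is a hypothetical helper, and it matches on the interface name prefix rather than on any occurrence of `fctap` in the line, which is an assumption about the intended behavior (a bridge line mentioning a TAP would otherwise match too).

```go
package main

import (
	"fmt"
	"strings"
)

// tapNames returns interface names starting with "fctap" from the one-line-
// per-device output of `ip -o link show`.
func tapNames(ipOutput string) []string {
	var names []string
	for _, line := range strings.Split(ipOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		// The second field is the name followed by a colon, e.g. "fctap1:".
		name := strings.TrimSuffix(fields[1], ":")
		if i := strings.IndexByte(name, '@'); i >= 0 {
			name = name[:i] // drop a peer suffix such as "@if5"
		}
		if strings.HasPrefix(name, "fctap") {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	sample := "1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 state UNKNOWN\n" +
		"3: fcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP\n" +
		"4: fctap1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master fcbr0 state UP\n"
	fmt.Println(tapNames(sample))
}
```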
|
||||
|
||||
### Outputs
|
||||
|
||||
All Firecracker processes are terminated. All `fctap*` network interfaces are removed. PID files are deleted. The `clones/`, `golden/`, and `pids/` directories themselves remain on disk.
|
||||
|
||||
### Error conditions
|
||||
|
||||
Errors from individual `StopVMM`, `kill`, or `ip link del` calls are logged but do not abort the rest of the kill sequence. The command always attempts to clean up everything it finds.
|
||||
|
||||
---
|
||||
|
||||
## `cleanup`
|
||||
|
||||
### Purpose
|
||||
|
||||
Performs a full teardown: kills all VMs (same as `kill`), removes the `golden/`, `clones/`, and `pids/` directories under `$FC_BASE_DIR`, and tears down the host bridge. The base kernel and rootfs files are **not** removed, and the iptables NAT rule is left in place. Apart from that rule, the system returns to the state it was in before `golden` was run.
|
||||
|
||||
### Usage
|
||||
|
||||
```sh
|
||||
sudo ./fc-orch cleanup
|
||||
```
|
||||
|
||||
### Prerequisites
|
||||
|
||||
None. Safe to run at any time.
|
||||
|
||||
### Step-by-step execution
|
||||
|
||||
1. **Kill all VMs**
|
||||
|
||||
Performs the full `kill` sequence (see [`kill`](#kill) above).
|
||||
|
||||
2. **Remove working directories**
|
||||
|
||||
```sh
|
||||
rm -rf /tmp/fc-orch/clones
|
||||
rm -rf /tmp/fc-orch/golden
|
||||
rm -rf /tmp/fc-orch/pids
|
||||
```
|
||||
|
||||
3. **Tear down bridge** (skipped when `FC_BRIDGE=none`)
|
||||
|
||||
```sh
|
||||
ip link del fcbr0
|
||||
```
|
||||
|
||||
This also implicitly removes all addresses and routes associated with the bridge. The iptables NAT rule added during `golden`/`spawn` is **not** removed automatically. Remove it manually if needed, substituting the egress interface detected on your host for `eth0`:
|
||||
|
||||
```sh
|
||||
iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
|
||||
```
|
||||
|
||||
### Outputs
|
||||
|
||||
After `cleanup`:
|
||||
|
||||
- All Firecracker processes are terminated
|
||||
- `/tmp/fc-orch/clones/`, `/tmp/fc-orch/golden/`, `/tmp/fc-orch/pids/` are deleted
|
||||
- `fcbr0` bridge and all associated `fctap*` devices are removed
|
||||
- `/tmp/fc-orch/vmlinux` and `/tmp/fc-orch/rootfs.ext4` remain intact
|
||||
|
||||
To also remove the kernel and rootfs:
|
||||
|
||||
```sh
|
||||
sudo ./fc-orch cleanup
|
||||
rm -f /tmp/fc-orch/vmlinux /tmp/fc-orch/rootfs.ext4
|
||||
```
|
||||
|
||||
### Error conditions
|
||||
|
||||
Errors from `rm -rf` or `ip link del` are ignored (logged only). Cleanup always completes all steps.
|
||||
|
||||
---
|
||||
|
||||
## Typical Workflow
|
||||
|
||||
```sh
|
||||
# One-time setup: download kernel and build rootfs
|
||||
sudo ./fc-orch init
|
||||
|
||||
# Create golden snapshot (re-run any time you want a fresh baseline)
|
||||
sudo ./fc-orch golden
|
||||
|
||||
# Spawn clones
|
||||
sudo ./fc-orch spawn 10
|
||||
|
||||
# Check what's running
|
||||
sudo ./fc-orch status
|
||||
|
||||
# Add more clones to the existing set
|
||||
sudo ./fc-orch spawn 5
|
||||
|
||||
# Tear down all clones (keeps golden snapshot)
|
||||
sudo ./fc-orch kill
|
||||
|
||||
# Full reset (keeps kernel and rootfs)
|
||||
sudo ./fc-orch cleanup
|
||||
```
|
||||
2
go.mod
@@ -1,4 +1,4 @@
|
||||
module github.com/you/fc-orchestrator
|
||||
module github.com/kacerr/fc-orchestrator
|
||||
|
||||
go 1.23
|
||||
|
||||
|
||||
2
go.sum
@@ -686,7 +686,7 @@ github.com/xeipuuv/gojsonreference v0.0.0-20180127040603-bd5ef7bd5415/go.mod h1:
|
||||
github.com/xeipuuv/gojsonschema v0.0.0-20180618132009-1d523034197f/go.mod h1:5yf86TLmAcydyeJq5YvxkGPE2fm/u4myDekKRoLuqhs=
|
||||
github.com/xiang90/probing v0.0.0-20190116061207-43a291ad63a2/go.mod h1:UETIi67q53MR2AWcXfiuqkDkRtnGDLqkBTpCHuJHxtU=
|
||||
github.com/xordataexchange/crypt v0.0.3-0.20170626215501-b2862e3d0a77/go.mod h1:aYKd//L2LvnjZzWKhF00oedf4jCCReLcmhLdhm1A27Q=
|
||||
github.com/youmark/pkcs8 v0.0.0-20181117223130-1be2e3e5546d/go.mod h1:rHwXgn7JulP+udvsHwJoVG1YGAP6VLg4y9I5dyZdqmA=
|
||||
github.com/kacerrmark/pkcs8 v0.0.0-20181117223130-1be2e3e5546d/go.mod h1:rHwXgn7JulP+udvsHwJoVG1YGAP6VLg4y9I5dyZdqmA=
|
||||
github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
|
||||
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
|
||||
github.com/yvasiyarov/go-metrics v0.0.0-20140926110328-57bccd1ccd43/go.mod h1:aX5oPXxHm3bOH+xeAttToC8pqch2ScQN/JoXYupl6xs=
|
||||
|
||||
10
main.go
@@ -19,10 +19,17 @@ import (
|
||||
"fmt"
|
||||
"os"
|
||||
|
||||
"github.com/you/fc-orchestrator/orchestrator"
|
||||
"github.com/kacerr/fc-orchestrator/orchestrator"
|
||||
)
|
||||
|
||||
func main() {
|
||||
// figure out if we are running as root
|
||||
if os.Geteuid() == 0 {
|
||||
fmt.Println("Running with root/sudo privileges!")
|
||||
} else {
|
||||
fmt.Println("Running as a normal user.")
|
||||
}
|
||||
|
||||
if len(os.Args) < 2 {
|
||||
usage()
|
||||
os.Exit(1)
|
||||
@@ -68,6 +75,7 @@ Environment:
|
||||
FC_BIN firecracker binary path (default: firecracker)
|
||||
FC_BASE_DIR working directory (default: /tmp/fc-orch)
|
||||
FC_KERNEL vmlinux path
|
||||
FC_KERNEL_URL vmlinux download URL (default: pinned Firecracker CI build)
|
||||
FC_ROOTFS rootfs.ext4 path
|
||||
FC_VCPUS vCPUs per VM (default: 1)
|
||||
FC_MEM_MIB MiB per VM (default: 128)
|
||||
|
||||
@@ -10,6 +10,7 @@ type Config struct {
|
||||
FCBin string // path to firecracker binary
|
||||
BaseDir string // working directory for all state
|
||||
Kernel string // path to vmlinux
|
||||
KernelURL string // URL to download vmlinux if Kernel file is missing
|
||||
Rootfs string // path to base rootfs.ext4
|
||||
VCPUs int64
|
||||
MemMiB int64
|
||||
@@ -33,6 +34,8 @@ func DefaultConfig() Config {
|
||||
BootArgs: "console=ttyS0 reboot=k panic=1 pci=off i8042.noaux quiet loglevel=0",
|
||||
}
|
||||
c.Kernel = envOr("FC_KERNEL", c.BaseDir+"/vmlinux")
|
||||
c.KernelURL = envOr("FC_KERNEL_URL",
|
||||
"https://s3.amazonaws.com/spec.ccfc.min/firecracker-ci/20260408-ce2a467895c1-0/x86_64/vmlinux-6.1.166")
|
||||
c.Rootfs = envOr("FC_ROOTFS", c.BaseDir+"/rootfs.ext4")
|
||||
return c
|
||||
}
|
||||
|
||||
@@ -55,7 +55,7 @@ func (o *Orchestrator) Init() error {
|
||||
|
||||
// Download kernel if missing
|
||||
if _, err := os.Stat(o.cfg.Kernel); os.IsNotExist(err) {
|
||||
url := "https://s3.amazonaws.com/spec.ccfc.min/ci-artifacts/kernels/x86_64/vmlinux-6.1.bin"
|
||||
url := o.cfg.KernelURL
|
||||
o.log.Infof("downloading kernel from %s ...", url)
|
||||
if err := downloadFile(url, o.cfg.Kernel); err != nil {
|
||||
return fmt.Errorf("download kernel: %w", err)
|
||||
@@ -81,15 +81,18 @@ func (o *Orchestrator) buildRootfs() error {
|
||||
mnt := filepath.Join(o.cfg.BaseDir, "mnt")
|
||||
|
||||
// create empty ext4 image
|
||||
o.log.Infof("running: dd if=/dev/zero of=%s bs=1M count=%d status=none", o.cfg.Rootfs, sizeMB)
|
||||
if err := run("dd", "if=/dev/zero", "of="+o.cfg.Rootfs,
|
||||
"bs=1M", fmt.Sprintf("count=%d", sizeMB), "status=none"); err != nil {
|
||||
return err
|
||||
}
|
||||
o.log.Infof("running: mkfs.ext4 -qF %s", o.cfg.Rootfs)
|
||||
if err := run("mkfs.ext4", "-qF", o.cfg.Rootfs); err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
os.MkdirAll(mnt, 0o755)
|
||||
o.log.Infof("running: mount -o loop %s %s", o.cfg.Rootfs, mnt)
|
||||
if err := run("mount", "-o", "loop", o.cfg.Rootfs, mnt); err != nil {
|
||||
return err
|
||||
}
|
||||
@@ -106,6 +109,7 @@ func (o *Orchestrator) buildRootfs() error {
|
||||
if err := downloadFile(url, tarPath); err != nil {
|
||||
return fmt.Errorf("download alpine: %w", err)
|
||||
}
|
||||
o.log.Infof("running: tar xzf %s -C %s", tarPath, mnt)
|
||||
if err := run("tar", "xzf", tarPath, "-C", mnt); err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||