The Guest Rootfs
The custom initramfs, the overlayfs root, and the initramfs-style switch_root that emberd-init performs as PID 1.
When a sandbox boots, the kernel runs emberd's own initramfs — a tiny cpio archive
whose /init is a statically-linked emberd-init. As PID 1, that binary builds
the real root filesystem and then serves the control plane.
The initramfs
rootfs/build.sh produces ~/firecracker-verify/emberd-initramfs.cpio:
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
go build -trimpath -ldflags="-s -w" -o "$STAGE/init" ./cmd/emberd-init
mkdir -p "$STAGE"/{proc,sys,dev,newroot,tmp} # mountpoints emberd-init needs
( cd "$STAGE" && find . | cpio --create --format=newc ) > emberd-initramfs.cpio
Building with CGO_ENABLED=0 means the binary is fully static: the initramfs
carries no libc, no shell, nothing but /init and a handful of empty
mountpoints (~2.5 MiB total). Everything else comes from the language pack.
The overlay root
The sandbox needs a writable root, but the base image must stay immutable,
so emberd-init assembles an overlayfs:
- lower — the language-pack squashfs, mounted read-only.
- upper + work — a fresh tmpfs (per-VM, lives in RAM).
- merged — the overlay of the two; this becomes the new root.
Writes land in the tmpfs upper; the squashfs is never touched and its pages are shareable across sandboxes. Reset semantics are trivial: the tmpfs dies with the VM.
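The three layers map directly onto the option string an overlay mount takes. A minimal sketch, assuming the paths used in the boot sequence (/lower, /overlay/upper, /overlay/work); the overlayDirs type and its options method are illustrative names, not taken from emberd's source:

```go
package main

import (
	"fmt"
	"strings"
)

// overlayDirs names the three directories an overlayfs mount needs.
// Hypothetical helper type, used here only for illustration.
type overlayDirs struct {
	Lower string // read-only squashfs mountpoint
	Upper string // writable layer on the per-VM tmpfs
	Work  string // overlayfs scratch dir; must be on the same filesystem as Upper
}

// options renders the data string passed to mount(2) for an overlay mount.
func (d overlayDirs) options() string {
	return strings.Join([]string{
		"lowerdir=" + d.Lower,
		"upperdir=" + d.Upper,
		"workdir=" + d.Work,
	}, ",")
}

func main() {
	d := overlayDirs{Lower: "/lower", Upper: "/overlay/upper", Work: "/overlay/work"}
	// On Linux, as root, the mount itself would look roughly like:
	//   syscall.Mount("overlay", "/newroot", "overlay", 0, d.options())
	fmt.Println(d.options())
}
```

Because upper and work both live on the tmpfs, discarding the VM discards every write with it; nothing in this string refers to persistent storage other than the read-only lower layer.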
The boot sequence
bootstrapPID1() runs only when os.Getpid() == 1 (so host-side unit tests of
the same binary skip it):
- Mount proc on /proc, devtmpfs on /dev (this is what makes /dev/vda appear), and sysfs on /sys, all inside the initramfs.
- Mount the squashfs (/dev/vda) read-only at /lower.
- Mount a tmpfs at /overlay; create upper/ and work/ on it.
- mount -t overlay with lowerdir=/lower, upperdir=/overlay/upper, workdir=/overlay/work at /newroot.
- MS_MOVE /proc, /dev, /sys into /newroot (carrying their subtrees).
- switch_root into /newroot.
- Set a default PATH and HOME (PID 1 starts with an empty environment, so python3 isn't resolvable until this is done).
- Read emberd.interpreter= from /proc/cmdline (see language packs).
- Start the child reaper (a SIGCHLD handler). As PID 1, emberd-init inherits any process a workload double-forks; the reaper wait4s them so they don't leak as zombies, while still letting runExec collect the interpreter's own exit code.
- Serve the vsock control plane.
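The emberd.interpreter= step amounts to scanning the kernel command line for a key=value token. The following is a sketch of one way to do that, not emberd's actual parser; the function name cmdlineValue and the sample value /usr/bin/python3 are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// cmdlineValue returns the value of the first "key=value" token in a kernel
// command line. Tokens are split on whitespace; quoted values containing
// spaces are not handled in this sketch.
func cmdlineValue(cmdline, key string) (string, bool) {
	prefix := key + "="
	for _, tok := range strings.Fields(cmdline) {
		if strings.HasPrefix(tok, prefix) {
			return strings.TrimPrefix(tok, prefix), true
		}
	}
	return "", false
}

func main() {
	// In the guest this string would come from os.ReadFile("/proc/cmdline").
	// The interpreter path below is a made-up example value.
	sample := "console=ttyS0 panic=1 emberd.interpreter=/usr/bin/python3\n"
	v, ok := cmdlineValue(sample, "emberd.interpreter")
	fmt.Println(v, ok)
}
```

The read has to happen after /proc is mounted (and moved into the new root), which is why it appears this late in the sequence.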
Why switch_root and not pivot_root?
pivot_root(2) fails with EINVAL when the current root is the initramfs
(rootfs), so emberd uses the same technique as busybox switch_root instead:
syscall.Chdir(newRoot) // cd /newroot
syscall.Mount(".", "/", "", syscall.MS_MOVE, "") // move merged root onto /
syscall.Chroot(".") // chroot into it
syscall.Chdir("/")
Moving the /proc, /dev, /sys submounts into /newroot before the final
MS_MOVE means they ride along when the whole subtree relocates to /.
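The four calls above discard their return values. A sketch of the same sequence with each step checked is shown here; the rootOps interface, linuxOps, and recorder are hypothetical names introduced so the ordering can be exercised without root, and are not part of emberd:

```go
package main

import (
	"fmt"
	"syscall"
)

// rootOps abstracts the syscalls so the sequence can be run against a fake.
type rootOps interface {
	Chdir(path string) error
	MoveMount(source, target string) error
	Chroot(path string) error
}

// linuxOps performs the real syscalls (Linux only; needs root privileges).
type linuxOps struct{}

func (linuxOps) Chdir(p string) error { return syscall.Chdir(p) }
func (linuxOps) MoveMount(src, dst string) error {
	return syscall.Mount(src, dst, "", syscall.MS_MOVE, "")
}
func (linuxOps) Chroot(p string) error { return syscall.Chroot(p) }

var _ rootOps = linuxOps{} // compile-time interface check

// switchRoot runs the four steps in order and stops at the first failure.
func switchRoot(ops rootOps, newRoot string) error {
	if err := ops.Chdir(newRoot); err != nil {
		return fmt.Errorf("chdir %s: %w", newRoot, err)
	}
	if err := ops.MoveMount(".", "/"); err != nil {
		return fmt.Errorf("move mount onto /: %w", err)
	}
	if err := ops.Chroot("."); err != nil {
		return fmt.Errorf("chroot: %w", err)
	}
	if err := ops.Chdir("/"); err != nil {
		return fmt.Errorf("chdir /: %w", err)
	}
	return nil
}

// recorder logs each call instead of performing it.
type recorder struct{ calls []string }

func (r *recorder) record(s string) error {
	r.calls = append(r.calls, s)
	return nil
}
func (r *recorder) Chdir(p string) error        { return r.record("chdir " + p) }
func (r *recorder) MoveMount(s, d string) error { return r.record("move " + s + " " + d) }
func (r *recorder) Chroot(p string) error       { return r.record("chroot " + p) }

func main() {
	r := &recorder{}
	if err := switchRoot(r, "/newroot"); err != nil {
		fmt.Println("switch_root failed:", err)
		return
	}
	for _, c := range r.calls {
		fmt.Println(c)
	}
	// Inside the guest, as PID 1, the call would be switchRoot(linuxOps{}, "/newroot").
}
```

Returning an error from any step lets the caller exit, which, given panic=1 on the kernel command line, takes the VM down rather than leaving a half-switched root.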
You can confirm it worked from inside a running sandbox — /proc/mounts shows:
overlay / overlay rw,relatime,lowerdir=/lower,upperdir=/overlay/upper,workdir=/overlay/work 0 0
/ is the overlay, and writes anywhere (e.g. /root, /etc via copy-up) succeed.
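The same check can be automated, for example in a smoke test that runs inside the sandbox. A minimal sketch, assuming the standard six-field /proc/mounts layout; rootFSType is a hypothetical helper and the proc line in the sample is added only to make the input realistic:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// rootFSType scans text in /proc/mounts format and returns the filesystem
// type of the last entry mounted at "/". Later entries shadow earlier ones,
// so the last match is the one that is actually visible.
func rootFSType(mounts string) (string, bool) {
	fstype, found := "", false
	sc := bufio.NewScanner(strings.NewReader(mounts))
	for sc.Scan() {
		// Fields: device, mountpoint, fstype, options, dump, pass.
		f := strings.Fields(sc.Text())
		if len(f) >= 3 && f[1] == "/" {
			fstype, found = f[2], true
		}
	}
	return fstype, found
}

func main() {
	// In the guest this text would come from os.ReadFile("/proc/mounts").
	sample := "proc /proc proc rw,relatime 0 0\n" +
		"overlay / overlay rw,relatime,lowerdir=/lower,upperdir=/overlay/upper,workdir=/overlay/work 0 0\n"
	fstype, ok := rootFSType(sample)
	fmt.Println(fstype, ok)
}
```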
Kernel requirements
The guest kernel needs SQUASHFS, OVERLAY_FS, DEVTMPFS, TMPFS, and
VIRTIO_BLK (plus the vsock options from the control-plane page). The Firecracker
CI kernel has all of them built in, verified by extracting its embedded config —
which is why emberd boots on the stock kernel with no rebuild.
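Once the embedded config has been extracted to text, the check itself is a scan for the five symbols set to y. The sketch below assumes the usual CONFIG_ prefix on each option name and treats =m as missing, on the reasoning that an initramfs carrying only /init has no modules to load; missingBuiltins and requiredOptions are illustrative names:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// requiredOptions lists the config symbols for the features named above
// (the vsock options from the control-plane page are not included here).
var requiredOptions = []string{
	"CONFIG_SQUASHFS",
	"CONFIG_OVERLAY_FS",
	"CONFIG_DEVTMPFS",
	"CONFIG_TMPFS",
	"CONFIG_VIRTIO_BLK",
}

// missingBuiltins returns the required options that are not set to "y"
// in the given kernel config text, in the order they were requested.
func missingBuiltins(config string, required []string) []string {
	builtin := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(config))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip comments such as "# CONFIG_FOO is not set".
		if strings.HasPrefix(line, "#") || !strings.HasSuffix(line, "=y") {
			continue
		}
		builtin[strings.TrimSuffix(line, "=y")] = true
	}
	var missing []string
	for _, opt := range required {
		if !builtin[opt] {
			missing = append(missing, opt)
		}
	}
	return missing
}

func main() {
	// A made-up config fragment; a real one has thousands of lines.
	sample := "CONFIG_SQUASHFS=y\nCONFIG_OVERLAY_FS=m\n# CONFIG_TMPFS is not set\n" +
		"CONFIG_DEVTMPFS=y\nCONFIG_VIRTIO_BLK=y\n"
	fmt.Println(missingBuiltins(sample, requiredOptions))
}
```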
The kernel boot args still include panic=1, so if emberd-init ever exits or
the bootstrap fails, the guest panics and the VM dies — which is the correct
outcome for a sandbox whose agent is gone. The VM stays alive purely because PID
1 (emberd-init) blocks forever in its vsock accept loop; nothing on the host
needs to hold the VMM's stdin open (it's wired to /dev/null).