Pre-alpha — APIs, wire formats, and behavior may change without notice. Expect breaking changes; use with caution.
emberd

Design Notes

The locked-in decisions and the trade-offs behind them — a summary of the running design log.

emberd keeps a running design log in docs/implementation-notes.html (decisions, deviations, trade-offs, open questions, each dated). This page summarizes the decisions that shape the system today.

Locked-in decisions

| Decision | Choice | Why |
|---|---|---|
| Isolation primitive | Firecracker microVM | Real KVM isolation; the primitive Lambda/Fly/Modal/E2B use. Credible path to <100ms via snapshots. |
| Language | Go + firecracker-go-sdk | Mature AWS-maintained SDK; fast path to v0.1; approachable for contributors. |
| API | HTTP REST, 127.0.0.1:7777 | Simple, debuggable with curl, no codegen. gRPC unlikely. |
| Rootfs | read-only squashfs + tmpfs overlay | Shared base pages, trivial reset, smaller snapshots (the Modal/E2B pattern). |
| Control plane | vsock, not virtual networking | Keeps "no network" honest; no IP stack needed in the guest for control. |
| Wire format | length-prefixed JSON | Tiny dependency surface, debuggable, matches the REST shapes. |
| First language pack | Python | Dominant target for agent tool calls. |
| Default network policy | none | Default-off is safer; adding egress later is easier than locking down. |
| License | MIT | Simple permissive terms. |

Trade-offs worth knowing

  • Firecracker over gVisor. Heavier to operate (KVM, Linux-only, extra binary) but an unambiguously stronger boundary and a better cold-start story. emberd is Linux-only by intent, so the portability gVisor would buy doesn't matter.
  • Snapshot restore vs cold boot. v0.1 cold-boots (~125ms VMM + rootfs init + interpreter warmup). Snapshot restore (5–30ms) is the only credible path to sub-100ms and is the v0.2 target; the cost is large per-pack snapshots that need versioning.
  • Per-sandbox VM vs warm pool. Per-sandbox is the correct, simplest baseline. A warm pool (constant-time acquire) comes later, with an "is this really clean?" verification step.

Notable deviations (and how they resolved)

  • chroot → overlayfs. The first live-exec build took a shortcut: a read-only chroot with a tmpfs mounted only on /tmp. It was replaced later the same day with the intended overlayfs lower/upper + switch_root, so the whole guest root is now writable scratch. See the guest rootfs.
  • net.FileConn → raw fd. Go can't wrap an AF_VSOCK fd, so the guest reads and writes the raw descriptor directly. See the control plane.
  • PID 1 has no $PATH. Discovered when python3 wasn't resolvable inside the guest; emberd-init now sets a default PATH/HOME during bootstrap.

Open questions

These are deliberately unresolved; they'll be decided when they become relevant:

  • Threat model. Buggy code from trusted agents, intentionally malicious code via prompt injection, or both? Decides how aggressive jailer/seccomp hardening needs to be.
  • Deployment shape. System daemon (one per machine, multi-tenant by sandbox), embedded library (one per agent process), or both? Affects API stability.
  • Resource-limit units. CPU as millicpu / vcpu / host shares? Sensible defaults?
  • Second language pack. Node next, or harden Python first?
