Sandbox interface, so your adapter code works identically regardless of whether the environment is a local Docker container, a Vercel micro-VM, or a third-party cloud service. You choose a backend at the CLI or in config; the adapter never needs to know which one is active.
The Sandbox interface
Every backend implements the same interface. These are the only operations an adapter ever calls:
Root vs non-root: why non-root is the default
Commands run as a non-root user by default. This matches the agent’s natural operating environment and, critically, it is required for Claude Code: the CLI refuses to run with --dangerously-skip-permissions when it detects it is executing as root.
When you need elevated privileges — for example, to install a system package during eval setup — pass { root: true } to runCommand. Use it only for setup commands; the agent itself and all validation should run without it.
root: true semantics are consistent across every backend:
| Backend | Default user | { root: true } mapping |
|---|---|---|
| Docker | node (UID 1000) | docker exec --user root |
| E2B | user (non-root) | commands.run(cmd, { user: "root" }) |
| Vercel Sandbox | vercel-sandbox (non-root) | runCommand(cmd, { sudo: true }) |
| Daytona | configured at create time | per-command user override |
| Modal | root by default | no-op (already root) |
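As a rough illustration of the Docker row in the table, the sketch below builds the argument list for a docker exec call. The names dockerExecArgs and RunOptions, the container ID, and the use of sh -c are assumptions made for this example; they are not taken from niceeval's implementation.

```typescript
// Hypothetical option bag mirroring the { root: true } flag described above.
interface RunOptions {
  root?: boolean;
}

// Build the argv for `docker exec`. Assumes the container's default user is
// `node` and that commands are passed through `sh -c` (illustrative choices).
function dockerExecArgs(
  containerId: string,
  command: string,
  options: RunOptions = {},
): string[] {
  const user = options.root ? "root" : "node";
  return ["exec", "--user", user, containerId, "sh", "-c", command];
}

// Setup step: needs root to install a system package.
const setupArgs = dockerExecArgs("eval-1", "apt-get install -y jq", { root: true });

// Agent and validation steps: run as the default non-root user.
const agentArgs = dockerExecArgs("eval-1", "npm test");
```

Keeping the root decision in one small function makes it easy to audit that only setup commands ever request elevated privileges.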
Available backends
- Docker (default)
- Vercel
- Auto
- Third-party
Docker is the default backend and requires no cloud credentials, only a local Docker installation. It is the right choice for local development and most CI pipelines.
How it works:
- Starts a node:24-slim container running sleep infinity
- Runs all commands via docker exec (with AutoRemove on stop)
- Default user is node (UID 1000); global npm packages install to the user directory and are added to PATH
- The slim base image is bootstrapped with ca-certificates and git
- Files are uploaded using tar + putArchive, with a chown pass to fix ownership
- Docker’s multiplexed exec stream (8-byte frame header) is parsed correctly
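The global npm install step can be approximated with npm's prefix setting. The sketch below computes the environment a non-root user would need. The function name, the .npm-global directory, and the /home/node home path are assumptions for illustration, since the text does not say which directory niceeval uses.

```typescript
// Environment variables as a plain string map.
type Env = Record<string, string>;

// Point npm's global prefix at a user-writable directory and put its bin
// folder first on PATH. The `.npm-global` directory name is an assumption.
function withUserNpmPrefix(env: Env, home: string): Env {
  const prefix = `${home}/.npm-global`;
  const currentPath = env.PATH ?? "";
  return {
    ...env,
    NPM_CONFIG_PREFIX: prefix, // npm reads configuration from NPM_CONFIG_* variables
    PATH: currentPath ? `${prefix}/bin:${currentPath}` : `${prefix}/bin`,
  };
}

const nodeUserEnv = withUserNpmPrefix(
  { PATH: "/usr/local/bin:/usr/bin" },
  "/home/node",
);
```

With this environment, npm install -g writes into the user's home directory, and the installed binaries resolve ahead of any system-wide copies.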
Selecting a backend
You can select the backend on the CLI, in config, or by relying on auto-detection.
Docker backend details
The Docker backend is zero-config and handles all the quirks of running a coding agent as a non-root user:
- Base image: node:24-slim
- Default user: node (UID 1000), which matches the user Claude Code expects when --dangerously-skip-permissions is used
- Global npm installs: because the non-root user cannot write to /usr/local/lib, niceeval configures npm to install globals into the user’s home directory and prepends that directory to PATH
- Slim image bootstrap: apt-get install ca-certificates git runs automatically on first use
- File uploads: uses Docker’s putArchive API (tar format) followed by a chown to restore correct ownership after the root-owned write
- Stream parsing: Docker’s exec API multiplexes stdout and stderr on a single stream with an 8-byte frame header; niceeval parses this correctly so you always get clean stdout and stderr separately
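To make the stream-parsing item concrete, here is a minimal sketch of demultiplexing a complete, non-TTY Docker exec stream. It relies on the documented frame layout: one stream-type byte (1 for stdout, 2 for stderr), three zero bytes, then a big-endian 32-bit payload size. The function and type names are hypothetical, and the sketch assumes the whole stream is already buffered, whereas a real implementation would have to handle frames split across chunks.

```typescript
interface DemuxedOutput {
  stdout: Uint8Array;
  stderr: Uint8Array;
}

// Concatenate byte chunks into a single array.
function concatBytes(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((n, c) => n + c.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// Split a fully buffered Docker exec stream into stdout and stderr.
// Frame layout: [type, 0, 0, 0, size (uint32 big-endian)] followed by payload.
function demuxDockerStream(buffer: Uint8Array): DemuxedOutput {
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const out: Uint8Array[] = [];
  const err: Uint8Array[] = [];
  let pos = 0;
  while (pos + 8 <= buffer.length) {
    const streamType = buffer[pos];
    const size = view.getUint32(pos + 4, false); // false = big-endian
    const start = pos + 8;
    const end = start + size;
    if (end > buffer.length) break; // truncated frame: stop rather than guess
    const payload = buffer.subarray(start, end);
    if (streamType === 1) out.push(payload);
    else if (streamType === 2) err.push(payload);
    pos = end;
  }
  return { stdout: concatBytes(out), stderr: concatBytes(err) };
}

// Example helpers: wrap ASCII text in a Docker-style frame, and decode it back.
function frame(streamType: number, text: string): Uint8Array {
  const bytes = new Uint8Array(8 + text.length);
  bytes[0] = streamType;
  new DataView(bytes.buffer).setUint32(4, text.length, false);
  for (let i = 0; i < text.length; i++) bytes[8 + i] = text.charCodeAt(i);
  return bytes;
}

function toAscii(bytes: Uint8Array): string {
  let s = "";
  for (let i = 0; i < bytes.length; i++) s += String.fromCharCode(bytes[i]);
  return s;
}

const demuxed = demuxDockerStream(
  concatBytes([frame(1, "hello "), frame(2, "warn"), frame(1, "world")]),
);
```

Reading the raw stream as text without stripping these headers leaves stray control bytes in the output, which is why the frames have to be parsed before stdout and stderr can be compared or logged.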
Vercel backend details
The Vercel backend requires one of:
- VERCEL_TOKEN: a personal access token from your Vercel account settings
- VERCEL_OIDC_TOKEN: an OIDC token, suitable for CI environments with Vercel’s OIDC integration
niceeval.config.ts — no adapter code changes required.
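A pre-flight check for the two credential variables might look like the sketch below. The function name is hypothetical, and the order in which the variables are checked is an assumption; the text only says that one of them must be present.

```typescript
// Returns the name of the first credential variable that is set, or null.
// Checking VERCEL_TOKEN before VERCEL_OIDC_TOKEN is an arbitrary choice here.
function findVercelCredential(
  env: Record<string, string | undefined>,
): string | null {
  const candidates = ["VERCEL_TOKEN", "VERCEL_OIDC_TOKEN"];
  for (const candidate of candidates) {
    if (env[candidate]) return candidate;
  }
  return null;
}

// In CI with OIDC configured, only the OIDC variable would be present.
const ciCredential = findVercelCredential({ VERCEL_OIDC_TOKEN: "example-token" });
```

Failing fast on a null result gives a clearer error than letting the first sandbox creation call fail with an authentication error.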
Performance: warm pools and sandbox reuse
Sandbox cold-start time is the dominant latency factor in large eval runs. niceeval offers two mechanisms to address it: a warm pool and sandbox reuse.
Warm pool
niceeval pre-creates a pool of sandboxes before any eval runs. When a case starts, it claims an already-running sandbox instead of waiting for a cold boot. Cold-start cost moves off the critical path entirely.
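The idea can be sketched in a few lines. The WarmPool class, the PooledSandbox type, and the fallback to a cold start when the pool is empty are all assumptions for illustration; niceeval's actual scheduler is not shown in this section.

```typescript
// Minimal stand-in for a sandbox handle; real backends expose much more.
interface PooledSandbox {
  id: number;
}

// Hypothetical warm pool: starts `size` sandboxes up front and hands them out
// in creation order.
class WarmPool {
  private ready: Promise<PooledSandbox>[] = [];
  private create: () => Promise<PooledSandbox>;

  constructor(create: () => Promise<PooledSandbox>, size: number) {
    this.create = create;
    // Kick off every cold start immediately so the boots overlap with each
    // other and with whatever the runner does before the first case.
    for (let i = 0; i < size; i++) this.ready.push(this.create());
  }

  // Claim a pre-started sandbox; fall back to a cold start if the pool is empty.
  claim(): Promise<PooledSandbox> {
    return this.ready.shift() ?? this.create();
  }
}

// Fake factory standing in for a real backend's create operation.
let createdCount = 0;
const warmPool = new WarmPool(async () => ({ id: ++createdCount }), 3);
```

Because the pool stores promises rather than finished sandboxes, a case that claims one before it has finished booting simply waits for the remainder of the boot instead of starting a new one.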
Sandbox reuse
After a case finishes, the sandbox can be reset with git clean back to the baseline state and handed to the next case instead of being destroyed. This trades a small contamination risk for significantly faster throughput. Reuse is off by default; enable it in your runner config when speed matters more than absolute isolation.
create and reset operations; the scheduling logic lives in niceeval core.
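A reset step along these lines could be sketched as follows. The CommandRunner interface, the shape of the runCommand result, and the extra git reset --hard before git clean are assumptions; the text only states that git clean is used to return to the baseline.

```typescript
// Only the one method this sketch needs; the result shape is an assumption.
interface CommandRunner {
  runCommand(command: string): Promise<{ exitCode: number }>;
}

// Restore tracked files, then delete untracked files and directories,
// including ignored ones (-x). Returns false if any step fails.
async function resetToBaseline(sandbox: CommandRunner): Promise<boolean> {
  const steps = ["git reset --hard HEAD", "git clean -fdx"];
  for (const step of steps) {
    const result = await sandbox.runCommand(step);
    if (result.exitCode !== 0) return false; // caller should destroy, not reuse
  }
  return true;
}

// Fake sandbox that records the commands it receives.
const issuedCommands: string[] = [];
const fakeSandbox: CommandRunner = {
  runCommand: async (command) => {
    issuedCommands.push(command);
    return { exitCode: 0 };
  },
};
const resetResult = resetToBaseline(fakeSandbox);
```

Files written outside the repository, such as globally installed packages or files in the home directory, survive this kind of reset, which is one source of the contamination risk mentioned above.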