Sandbox Executor Security

Status: Implemented (Wasmtime sandbox live; capability inheritance via MCP tool bridge)
Reference epics: INK-838
ADRs: ADR-016, ADR-021
Crate: crates/infrastructure/sandbox/


This page describes the security model of the Wasmtime-sandboxed CPython executor — the isolation boundary it provides, the threat model it addresses, and the defense-in-depth layers that enforce the boundary. It is a companion to sandbox-execution (which covers what the sandbox is for at the domain level) and mcp-system (which covers how the sandbox is exposed as a tool).

The executor described here is not where the World Agent runs. Per ADR-021, the sandbox is a distinct capability exposed to callers (including the World Agent) as an MCP-registered tool. The agent’s LangGraph runtime lives in the Python sidecar; the sandbox is a separate Python process spawned on demand by the MCP tool handler, isolated by Wasmtime, and destroyed after the call returns. The two have different security postures, different lifecycles, and different threat models.

This page covers the sandbox. Security considerations specific to the agent sidecar (process boundaries, IPC trust, prompt injection) live in process-model and prompt-injection-boundary.

The executor’s hardening work was originally framed around DSPy-compiled skill artifacts and a dspy_daemon Rust surface. That framing is retired. The executor today is a general-purpose sandbox invoked through MCP — it does not run compiled modules, does not broker LLM calls, and does not carry provider credentials. The threat-model layers below have been rewritten against the current architecture; see ADR-021 for the hardening history and current architectural role.


Every sandbox invocation is a fresh Wasmtime-hosted CPython instance spawned by the Rust-side sandbox MCP tool handler. The instance runs the caller’s Python snippet, returns a result, and is destroyed. Instances are never reused across invocations, share no state, and are not backed by a long-running daemon.

+-------------------------------------+       +----------------------------------------+
| Rust host                           |       | Wasmtime-hosted CPython instance       |
|                                     |       |                                        |
|  +-------------------------------+  |       |  +----------------------------------+  |
|  | OS keychain (API keys)        |  |       |  | No API keys                      |  |
|  | LLM provider SDKs             |  |       |  | No provider SDKs                 |  |
|  | Supabase auth tokens          |  |       |  | No auth tokens                   |  |
|  | Workspace SQLite (inklings.db)|  |       |  | No direct filesystem access      |  |
|  | Full capability set           |  |       |  | Scoped capability set (inherited)|  |
|  +-------------------------------+  |       |  +----------------------------------+  |
|                                     |       |                                        |
|  +-------------------------------+  |       |  +----------------------------------+  |
|  | MCP tool handler              |<-+-------+->| Back-channel to MCP server       |  |
|  | (sandbox_execute)             |  |       |  | (calls gated by caller caps)     |  |
|  +-------------------------------+  |       |  +----------------------------------+  |
|                                     |       |                                        |
+-------------------------------------+       +----------------------------------------+
            MCP invocation                                Wasmtime isolation

Two things make the boundary meaningful:

  1. The Wasmtime host mediates every system-level operation. The sandboxed interpreter cannot open files, open sockets, or spawn processes unless the host explicitly grants that access, and the baseline grant is nothing.
  2. The MCP back-channel is the sandbox’s only path to workspace data. Sandboxed code that wants to read a page or write content calls an MCP tool through the back-channel, and that tool call is gated by the caller’s capability set at the Rust side — exactly as if the agent had called it directly. No sandbox-local privilege exists.

Every sandbox invocation carries a PermissionGuard at the MCP tool handler level. That guard becomes the effective capability set for any MCP back-channel call the sandboxed code makes. The guard can only be narrower than the caller’s resolved set — the MCP tool handler accepts an optional capability-subset argument so a caller can scope the sandbox more tightly (e.g., strip PagesWrite for a read-only computation), but the subset is always intersected with the caller’s actual guard. Sandboxed code cannot widen scope.

This is the same model described in permission-system.
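
The intersection rule can be illustrated with a minimal sketch. It assumes capabilities can be modeled as a set of enum values; the `Capability` variants (other than `PagesWrite`, which the text names) and the function `effective_capabilities` are hypothetical and are not taken from the crate.

```rust
use std::collections::HashSet;

/// Hypothetical capability names; only `PagesWrite` is mentioned in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Capability {
    PagesRead,
    PagesWrite,
}

/// Effective capabilities for one sandbox invocation (illustrative only).
///
/// `caller` is the caller's resolved set; `requested` is the optional
/// capability-subset argument. The result is always an intersection, so a
/// requested capability the caller lacks is dropped, never granted.
pub fn effective_capabilities(
    caller: &HashSet<Capability>,
    requested: Option<&HashSet<Capability>>,
) -> HashSet<Capability> {
    match requested {
        // No subset supplied: the sandbox inherits the caller's set unchanged.
        None => caller.clone(),
        // Subset supplied: keep only what the caller actually holds.
        Some(subset) => caller.intersection(subset).copied().collect(),
    }
}
```

Under these assumptions, a caller holding only `PagesRead` that requests `{PagesRead, PagesWrite}` ends up with `{PagesRead}`: the request can shrink the set but cannot grow it.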

Any write the sandboxed code produces — via an MCP back-channel write call — crosses the submit boundary exactly as any other agent write does. The adapter layer sets origin: AgentProduced (when the invoking caller is the agent) and lifecycle: Candidate by default. The sandbox does not get a side-path into workspace state.
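
As a rough sketch of what "sets origin and lifecycle by default" could look like, the following assumes a `WorldWrite` value whose fields are filled in by a single adapter function. Only the names `WorldWrite`, `AgentProduced`, and `Candidate` come from this page; the `OtherCaller` variant, the `content` field, and `submit_from_backchannel` are placeholders invented for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Origin {
    AgentProduced,
    /// Placeholder for non-agent callers; the real variants are not listed here.
    OtherCaller,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lifecycle {
    Candidate,
}

/// Fields are private so that, outside this module, a write can only be
/// obtained through the adapter function below.
#[derive(Debug, PartialEq, Eq)]
pub struct WorldWrite {
    origin: Origin,
    lifecycle: Lifecycle,
    content: String,
}

impl WorldWrite {
    pub fn origin(&self) -> Origin {
        self.origin
    }

    pub fn lifecycle(&self) -> Lifecycle {
        self.lifecycle
    }

    pub fn content(&self) -> &str {
        &self.content
    }
}

/// Hypothetical adapter entry point for a back-channel write.
/// The sandbox supplies content only; provenance is stamped by the host.
pub fn submit_from_backchannel(caller_is_agent: bool, content: &str) -> WorldWrite {
    WorldWrite {
        origin: if caller_is_agent {
            Origin::AgentProduced
        } else {
            Origin::OtherCaller
        },
        lifecycle: Lifecycle::Candidate,
        content: content.to_string(),
    }
}
```

The design point the sketch tries to capture is that the snippet never chooses its own `origin` or `lifecycle`; both are decided on the host side of the boundary.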


Reality: the sandbox exists to run caller-provided code. This is not a threat to deny; it is the reason the sandbox exists. The threat is not “code ran” but “code ran and escaped the boundary or exceeded its scope.”

Mitigations:

| Mechanism | Coverage |
| --- | --- |
| Wasmtime host-syscall denial | No filesystem, network, or subprocess access absent explicit host grant |
| No provider credentials present | Even if code could exfiltrate, there is nothing to exfiltrate |
| Capability-gated back-channel | Any workspace reach goes through MCP with caller’s guard |
| Submit-boundary enforcement | Any write produced by the sandbox constructs a WorldWrite like any other tool call |

Threat: sandboxed code reads environment variables, on-disk config, or memory to capture API keys and exfiltrate them.

Mitigations:

| Mechanism | Coverage |
| --- | --- |
| Credentials live only in Rust | The sidecar receives credentials from Rust at sidecar startup; the sandbox instance receives none |
| env_clear() on spawn | The Wasmtime-hosted CPython starts with an empty environment |
| No network syscalls | Wasmtime denies outbound sockets at the host-call level |
| No filesystem access | Wasmtime denies open, read, write at the host-call level |

The sandbox cannot reach credentials because they are not in its address space and the host does not forward them.
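
The `env_clear()` step can be sketched with the standard library. This example assumes the sandbox child is launched through `std::process::Command`; whether the host uses the standard type or an async equivalent is not stated here, and the function name `sandbox_command` and the runtime path are placeholders.

```rust
use std::process::Command;

/// Build (but do not spawn) the command for a sandbox child process.
/// `runtime` is a placeholder for however the host launches the interpreter.
pub fn sandbox_command(runtime: &str) -> Command {
    let mut cmd = Command::new(runtime);
    // Drop every variable inherited from the host process. Anything the
    // child should see would have to be added explicitly after this call,
    // and this sketch adds nothing.
    cmd.env_clear();
    cmd
}
```

Because nothing is re-added after `env_clear()`, a variable such as a provider API key in the host environment is never copied into the child, regardless of what the host process itself holds.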

T3: capability escalation via the back-channel

Threat: sandboxed code constructs an MCP call claiming higher capability than the caller actually holds, or calls a capability-protected tool and expects the call to succeed because “the agent is calling.”

Mitigations:

| Mechanism | Coverage |
| --- | --- |
| PermissionGuard is unforgeable | PermissionGuard::new() is pub(crate) (see permission-system) |
| Guard resolved on Rust side | The back-channel tool handler resolves from the invocation’s guard, not from any token sandboxed code could construct |
| Capability subset is intersection only | Sandbox-specified capability subsets are intersected with the caller’s resolved set |
| Every tool call checks capability | Back-channel invocations traverse the same guard.require() path as any MCP tool call |

The sandbox has no guard-construction authority. It has whatever the outer handler hands it, narrower or equal.
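
A minimal sketch of this gating follows, under several assumptions: the capability names, the `BackChannelCall` shape, the tool names, and the `authorize` function are all hypothetical. Only `PermissionGuard`, its `pub(crate)` constructor, and `require()` are named on this page, and their real signatures may differ.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Capability {
    PagesRead,
    PagesWrite,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DispatchError {
    UnknownTool,
    CapabilityDenied(Capability),
}

/// The field is private and the constructor is crate-private, so only
/// host-side code inside the crate can mint a guard.
pub struct PermissionGuard {
    granted: HashSet<Capability>,
}

impl PermissionGuard {
    pub(crate) fn new(granted: HashSet<Capability>) -> Self {
        Self { granted }
    }

    pub fn require(&self, needed: Capability) -> Result<(), DispatchError> {
        if self.granted.contains(&needed) {
            Ok(())
        } else {
            Err(DispatchError::CapabilityDenied(needed))
        }
    }
}

/// A back-channel request as the sandbox might send it (hypothetical shape).
pub struct BackChannelCall {
    pub tool: String,
    /// Whatever the snippet claims about its own privileges. Never consulted.
    pub claimed_capabilities: Vec<Capability>,
}

/// Hypothetical tool-to-capability table.
fn required_capability(tool: &str) -> Option<Capability> {
    match tool {
        "pages_read" => Some(Capability::PagesRead),
        "pages_write" => Some(Capability::PagesWrite),
        _ => None,
    }
}

/// Gate a back-channel call using only the guard attached to the invocation.
pub fn authorize(
    guard: &PermissionGuard,
    call: &BackChannelCall,
) -> Result<Capability, DispatchError> {
    let needed = required_capability(&call.tool).ok_or(DispatchError::UnknownTool)?;
    guard.require(needed)?;
    Ok(needed)
}
```

Note that `authorize` reads `call.tool` but never `call.claimed_capabilities`: in this sketch, a request that claims `PagesWrite` is still denied when the guard handed down by the outer handler holds only `PagesRead`.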

Threat: sandboxed code writes to workspace state directly, bypassing the submit-boundary invariant.

Mitigations:

The sandbox has no direct workspace-storage access. Every write is an MCP call; every MCP write call constructs a WorldWrite through the submit-boundary adapter; and domain rule 1 holds in the Rust domain layer itself. There is no path around it.

Threat: sandboxed code consumes unbounded CPU, memory, or wall-clock time.

Mitigations:

| Mechanism | Coverage |
| --- | --- |
| Wall-clock timeout per invocation | Host kills the Wasmtime instance after a bounded deadline (default 60s) |
| Wasmtime fuel metering | CPU-like budget denominated in Wasm instructions; denies overruns |
| Memory limit on the Wasmtime instance | Host caps linear-memory growth per instance |
| No persistent process | Each invocation is destroyed at completion; no cumulative drain |
| kill_on_drop on the host-side process | Host cleanup releases the instance on tool-handler drop |

Timeouts produce a structured SandboxError::Timeout returned to the caller. The agent sees this as a tool failure and handles it the same way it handles any other tool error.
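
The sketch below shows one way caller-requested budgets could be clamped to host ceilings and how an overrun could map onto a structured error. Only the 60-second default and the name `SandboxError::Timeout` come from this page; `HostLimits`, its numeric values, `FuelExhausted`, and `check_budgets` are assumptions made for illustration. The real host enforces limits while the instance runs, whereas `check_budgets` only illustrates the shape of the resulting error.

```rust
use std::time::Duration;

/// Host-side budget ceilings. Only the 60 s default timeout is stated in the
/// text; every other number here is a placeholder.
#[derive(Debug, Clone, Copy)]
pub struct HostLimits {
    pub default_timeout: Duration,
    pub max_timeout: Duration,
    pub default_fuel: u64,
    pub max_fuel: u64,
}

impl HostLimits {
    /// Placeholder values for illustration only.
    pub fn assumed() -> Self {
        Self {
            default_timeout: Duration::from_secs(60),
            max_timeout: Duration::from_secs(120),
            default_fuel: 1_000_000_000,
            max_fuel: 10_000_000_000,
        }
    }

    /// Clamp a caller-requested wall-clock bound to the host maximum.
    pub fn effective_timeout(&self, requested_ms: Option<u64>) -> Duration {
        match requested_ms {
            None => self.default_timeout,
            Some(ms) => Duration::from_millis(ms).min(self.max_timeout),
        }
    }

    /// Clamp a caller-requested fuel budget to the host maximum.
    pub fn effective_fuel(&self, requested: Option<u64>) -> u64 {
        requested.map_or(self.default_fuel, |fuel| fuel.min(self.max_fuel))
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum SandboxError {
    Timeout,
    /// Hypothetical variant name for fuel exhaustion.
    FuelExhausted,
}

/// Classify observed usage against the budgets, checking the deadline first.
pub fn check_budgets(
    elapsed: Duration,
    timeout: Duration,
    fuel_used: u64,
    fuel_budget: u64,
) -> Result<(), SandboxError> {
    if elapsed >= timeout {
        return Err(SandboxError::Timeout);
    }
    if fuel_used > fuel_budget {
        return Err(SandboxError::FuelExhausted);
    }
    Ok(())
}
```

With these placeholder limits, a request for a 10,000-second timeout is reduced to the 120-second ceiling, and omitting the argument yields the 60-second default.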

Threat: sandboxed code writes crafted bytes to stdout to manipulate the MCP back-channel protocol.

Mitigations:

| Mechanism | Coverage |
| --- | --- |
| MCP back-channel is message-framed | Each message carries a correlation id and message type; unsolicited writes are ignored |
| Sequential protocol | One back-channel call at a time per sandbox instance |
| Validator at the host-side parser | Malformed messages are rejected structurally |
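
To make the three mechanisms concrete, here is a small host-side validator sketch. The wire format (`<correlation id> <type> <payload>` on one line), the type names `call` and `result`, and every Rust identifier are invented for this example; the actual back-channel framing is not specified on this page.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MessageType {
    /// Sandbox asks the host to run a tool.
    Call,
    /// Host answers a call; the sandbox should never send this.
    Result,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Frame {
    pub correlation_id: u64,
    pub kind: MessageType,
    pub payload: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Reject {
    Malformed,
    Unsolicited,
    Busy,
}

/// Parse one line of the hypothetical wire format.
pub fn parse_frame(line: &str) -> Result<Frame, Reject> {
    let mut parts = line.splitn(3, ' ');
    let correlation_id = parts
        .next()
        .and_then(|s| s.parse::<u64>().ok())
        .ok_or(Reject::Malformed)?;
    let kind = match parts.next() {
        Some("call") => MessageType::Call,
        Some("result") => MessageType::Result,
        _ => return Err(Reject::Malformed),
    };
    let payload = parts.next().unwrap_or("").to_string();
    Ok(Frame { correlation_id, kind, payload })
}

/// Host-side state for one sandbox instance: at most one call in flight.
pub struct HostChannel {
    in_flight: Option<u64>,
}

impl HostChannel {
    pub fn new() -> Self {
        Self { in_flight: None }
    }

    /// Validate an incoming line from the sandbox.
    pub fn accept(&mut self, line: &str) -> Result<Frame, Reject> {
        let frame = parse_frame(line)?;
        if frame.kind != MessageType::Call {
            return Err(Reject::Unsolicited);
        }
        if self.in_flight.is_some() {
            return Err(Reject::Busy);
        }
        self.in_flight = Some(frame.correlation_id);
        Ok(frame)
    }

    /// Mark the in-flight call as answered; returns false on an id mismatch.
    pub fn complete(&mut self, correlation_id: u64) -> bool {
        if self.in_flight == Some(correlation_id) {
            self.in_flight = None;
            true
        } else {
            false
        }
    }
}
```

In this sketch, arbitrary bytes fail parsing, a frame of the wrong type is dropped as unsolicited, and a second call is refused until the first has been completed, which mirrors the three rows of the table.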

T7: Python interpreter / C-extension vulnerabilities

Threat: a CPython bug or a C extension in an allowed package allows a write outside the Wasmtime-managed linear memory.

Residual risk. Wasmtime is the defense of last resort: even a CPython memory-safety bug cannot reach host memory outside the instance. C extensions included in the baseline image are reviewed and pinned. Keeping the interpreter and allowed packages current is operational hygiene, not a control.


| Layer | Mechanism | What it prevents |
| --- | --- | --- |
| Wasmtime host isolation | Host denies syscalls; scoped linear memory | Filesystem access, network I/O, arbitrary host reach |
| Credential locality | Credentials only in Rust; sandbox env cleared | Exfiltration of provider keys, auth tokens |
| Capability inheritance | PermissionGuard at MCP tool handler | Workspace reach exceeding caller’s resolved capabilities |
| Submit boundary | Domain invariant in Rust; no side-path | Writes bypassing provenance / lifecycle rules |
| Resource bounds | Timeout, fuel, memory cap, kill_on_drop | Runaway consumption of CPU / memory / time |
| MCP message framing | Correlation ids, sequential protocol | Back-channel protocol manipulation |

Each layer is load-bearing. No single layer is the sole boundary; a bypass of any one layer is caught by at least one other.


| File | Role |
| --- | --- |
| crates/infrastructure/sandbox/src/wasmtime_host.rs | Wasmtime host configuration: syscall denial, fuel, memory caps |
| crates/infrastructure/sandbox/src/mcp_backchannel.rs | Back-channel protocol handler with capability-gated MCP dispatch |
| crates/application/src/sandbox/execute_use_case.rs | SandboxExecuteUseCase: guard check, instance spawn, result shape |
| crates/infrastructure/mcp-server/src/tools/sandbox_execute.rs | MCP tool registration and invocation-time capability-subset handling |

A sandbox MCP tool call carries:

  • snippet — the Python source to execute
  • capability_subset — optional subset of capabilities to scope this invocation (intersected with caller’s guard)
  • timeout_ms — optional wall-clock bound (capped at host maximum)
  • fuel — optional Wasm-instruction budget (capped at host maximum)
  • context — caller-supplied structured inputs made available to the snippet

It returns:

  • result — the snippet’s final expression value, serialized
  • stdout / stderr — captured output for the caller to surface (stdout is not used for IPC)
  • error — structured failure (timeout, fuel exhaustion, uncaught exception, capability denial)

The sandbox executor:

  • Does not persist state. Every invocation is fresh. Callers that need persistent state write it to workspace content through the submit boundary.
  • Does not load plugins. The baseline Python image is fixed. Capability is expressed through the host’s syscall surface, not through loaded extensions.
  • Does not broker LLM calls. The sidecar holds provider SDKs and credentials. Sandboxed code that needs an LLM call invokes a dedicated MCP tool from the sidecar’s LLM surface; the sandbox never speaks provider APIs directly.
  • Does not orchestrate agent work. No subagent dispatch, no checkpointing, no planning. Those live in the sidecar’s LangGraph runtime.
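
The request and response fields of the tool call listed above could be modeled roughly as follows. The field names follow the lists on this page, but every Rust type is a guess: the real payloads are presumably structured values rather than plain strings, and apart from `SandboxError::Timeout` the error variant names are assumptions.

```rust
use std::collections::BTreeMap;

/// Hypothetical request shape; field types are placeholders.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SandboxRequest {
    pub snippet: String,
    pub capability_subset: Option<Vec<String>>,
    pub timeout_ms: Option<u64>,
    pub fuel: Option<u64>,
    pub context: BTreeMap<String, String>,
}

impl SandboxRequest {
    /// A request with only the snippet set; every optional bound is left to
    /// the host defaults.
    pub fn new(snippet: &str) -> Self {
        Self {
            snippet: snippet.to_string(),
            capability_subset: None,
            timeout_ms: None,
            fuel: None,
            context: BTreeMap::new(),
        }
    }
}

/// The four failure kinds named in the return list.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SandboxError {
    Timeout,
    FuelExhausted,
    UncaughtException(String),
    CapabilityDenied(String),
}

/// Hypothetical response shape; `result` is the serialized final value.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SandboxResponse {
    pub result: Option<String>,
    pub stdout: String,
    pub stderr: String,
    pub error: Option<SandboxError>,
}

impl SandboxResponse {
    /// Collapse the response into a `Result`, treating any error as failure.
    pub fn into_result(self) -> Result<String, SandboxError> {
        match self.error {
            Some(err) => Err(err),
            None => Ok(self.result.unwrap_or_default()),
        }
    }
}
```

Keeping `timeout_ms`, `fuel`, and `capability_subset` optional reflects the description above: a caller may tighten each bound, and omitting one simply falls back to what the host would apply anyway.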
