Sandbox Executor Security

Status: Implemented (Wasmtime sandbox live; capability inheritance via MCP tool bridge)
Reference epics: INK-838
ADRs: ADR-016, ADR-021
Crate: crates/infrastructure/sandbox/

What this page covers

This page describes the security model of the Wasmtime-sandboxed CPython executor — the isolation boundary it provides, the threat model it addresses, and the defense-in-depth layers that enforce the boundary. It is a companion to sandbox-execution (which covers what the sandbox is for at the domain level) and mcp-system (which covers how the sandbox is exposed as a tool).

The sandbox is not the agent runtime

The executor described here is not where the World Agent runs. Per ADR-021, the sandbox is a distinct capability exposed to callers (including the World Agent) as an MCP-registered tool. The agent’s LangGraph runtime lives in the Python sidecar; the sandbox is a separate Python process spawned on demand by the MCP tool handler, isolated by Wasmtime, and destroyed after the call returns. The two have different security postures, different lifecycles, and different threat models.

This page covers the sandbox. Security considerations specific to the agent sidecar (process boundaries, IPC trust, prompt injection) live in process-model and prompt-injection-boundary.

Historical note

The executor’s hardening work was originally framed around DSPy-compiled skill artifacts and a dspy_daemon Rust surface. That framing is retired. The executor today is a general-purpose sandbox invoked through MCP — it does not run compiled modules, does not broker LLM calls, and does not carry provider credentials. The threat-model layers below have been rewritten against the current architecture; see ADR-021 for the hardening history and current architectural role.

The isolation boundary

Every sandbox invocation gets a fresh Wasmtime-hosted CPython instance, spawned by the Rust-side sandbox MCP tool handler. The instance runs the caller’s Python snippet, returns a result, and is destroyed. There is no reuse across invocations, no shared state, and no long-running daemon.

+-------------------------------------+       +----------------------------------------+
|              Rust host              |       |    Wasmtime-hosted CPython instance    |
|                                     |       |                                        |
|  +-------------------------------+  |       |  +----------------------------------+  |
|  | OS keychain (API keys)        |  |       |  | No API keys                      |  |
|  | LLM provider SDKs             |  |       |  | No provider SDKs                 |  |
|  | Supabase auth tokens          |  |       |  | No auth tokens                   |  |
|  | Workspace SQLite (inklings.db)|  |       |  | No direct filesystem access      |  |
|  | Full capability set           |  |       |  | Scoped capability set (inherited)|  |
|  +-------------------------------+  |       |  +----------------------------------+  |
|                                     |       |                                        |
|  +-------------------------------+  |       |  +----------------------------------+  |
|  | MCP tool handler              |<-+-------+->| Back-channel to MCP server       |  |
|  | (sandbox_execute)             |  |       |  | (calls gated by caller caps)     |  |
|  +-------------------------------+  |       |  +----------------------------------+  |
|                                     |       |                                        |
+-------------------------------------+       +----------------------------------------+
         MCP invocation                                  Wasmtime isolation

Two things make the boundary meaningful:

The Wasmtime host mediates every syscall-like operation. A Wasm guest has no direct syscall path; it can only call functions the host chooses to expose. The sandboxed interpreter therefore cannot open files, open sockets, or spawn processes beyond what the host explicitly grants. The baseline grant is nothing.
The MCP back-channel is the sandbox’s only path to workspace data. Sandboxed code that wants to read a page or write content calls an MCP tool through the back-channel, and that tool call is gated by the caller’s capability set at the Rust side — exactly as if the agent had called it directly. No sandbox-local privilege exists.

Capability inheritance

Every sandbox invocation carries a PermissionGuard at the MCP tool handler level. That guard becomes the effective capability set for any MCP back-channel call the sandboxed code makes. The guard is never wider than the caller’s resolved set: it is equal to it or narrower. The MCP tool handler accepts an optional capability-subset argument so a caller can scope the sandbox more tightly (e.g., strip PagesWrite for a read-only computation), but the subset is always intersected with the caller’s actual guard, so naming a capability the caller lacks grants nothing. Sandboxed code cannot widen scope.

This is the same model described in permission-system.
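
The intersection rule above can be modeled in a few lines. This is a minimal Python sketch of the rule, not the Rust `PermissionGuard` itself; `PagesWrite` comes from the text, while the other capability names and the `effective_capabilities` helper are hypothetical.

```python
from enum import Flag, auto
from typing import Optional


class Capability(Flag):
    # PagesWrite appears in the text above; the other members are hypothetical.
    PAGES_READ = auto()
    PAGES_WRITE = auto()
    SEARCH = auto()


def effective_capabilities(caller: Capability, subset: Optional[Capability]) -> Capability:
    """Return the capability set the sandbox's back-channel calls run under."""
    if subset is None:
        # No subset requested: the sandbox inherits the caller's set unchanged.
        return caller
    # Intersection can drop capabilities but can never add one.
    return caller & subset


caller = Capability.PAGES_READ | Capability.PAGES_WRITE

# Read-only computation: strip PagesWrite.
read_only = effective_capabilities(caller, Capability.PAGES_READ)

# Asking for SEARCH, which the caller does not hold, grants nothing extra.
widened = effective_capabilities(caller, Capability.PAGES_READ | Capability.SEARCH)
```

Because the only operation is an intersection, a requested subset that names an unheld capability is silently reduced rather than rejected. Whether the real handler rejects such a request instead is not stated here.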

Submit-boundary consistency

Any write the sandboxed code produces — via an MCP back-channel write call — crosses the submit boundary exactly as any other agent write does. The adapter layer sets origin: AgentProduced (when the invoking caller is the agent) and lifecycle: Candidate by default. The sandbox does not get a side-path into workspace state.
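
As an illustration of why the sandbox has no side-path, the sketch below models an adapter that sets provenance fields itself and ignores anything the payload claims. It is a Python model under stated assumptions: `WorldWrite`, `AgentProduced`, and `Candidate` come from this page, while `to_world_write`, the `content` field, and the `CallerProduced`, `UserAuthored`, and `Accepted` values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldWrite:
    content: str
    origin: str
    lifecycle: str


def to_world_write(payload: dict, caller_is_agent: bool) -> WorldWrite:
    """Adapter-side construction: provenance comes from the host, not the payload."""
    # "CallerProduced" is a placeholder; the text only specifies the agent case.
    origin = "AgentProduced" if caller_is_agent else "CallerProduced"
    # Any origin/lifecycle keys the sandboxed code put in the payload are ignored.
    return WorldWrite(content=payload["content"], origin=origin, lifecycle="Candidate")


# Sandboxed code tries to mark its own write as already accepted
# ("UserAuthored" and "Accepted" are made-up values for the demonstration).
write = to_world_write(
    {"content": "draft paragraph", "origin": "UserAuthored", "lifecycle": "Accepted"},
    caller_is_agent=True,
)
```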

Threat model

T1: arbitrary code running in the sandbox

Reality: the sandbox exists to run caller-provided code. This is not a threat to deny; it is the reason the sandbox exists. The threat is not “code ran” but “code ran and escaped the boundary or exceeded its scope.”

Mitigations:

Mechanism	Coverage
Wasmtime host-syscall denial	No filesystem, network, or subprocess access absent explicit host grant
No provider credentials present	Even if code could exfiltrate, there is nothing to exfiltrate
Capability-gated back-channel	Any workspace reach goes through MCP with caller’s guard
Submit-boundary enforcement	Any write produced by the sandbox constructs a `WorldWrite` like any other tool call

T2: credential exfiltration

Threat: sandboxed code reads environment variables, on-disk config, or memory to capture API keys and exfiltrate them.

Mitigations:

Mechanism	Coverage
Credentials never reach the sandbox	The Rust host holds credentials and hands them only to the sidecar at sidecar startup; the sandbox instance receives none
`env_clear()` on spawn	The Wasmtime-hosted CPython starts with an empty environment
No network syscalls	Wasmtime denies outbound sockets at the host-call level
No filesystem access	Wasmtime denies `open`, `read`, `write` at the host-call level

The sandbox cannot reach credentials because they are not in its address space and the host does not forward them.
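
The effect of clearing the environment on spawn can be seen with an ordinary child process. The sketch below uses Python's `subprocess` with `env={}` as a stand-in for the `env_clear()` call in the table; it assumes a POSIX host, and the variable name `EXAMPLE_API_KEY` is hypothetical. It demonstrates only the environment layer, not Wasmtime isolation.

```python
import os
import subprocess
import sys

# Stand-in secret in the parent's environment (hypothetical variable name).
os.environ["EXAMPLE_API_KEY"] = "not-a-real-key"

# Child program: report whether it can see the secret.
probe = "import os; print('EXAMPLE_API_KEY' in os.environ)"

# env={} plays the role of env_clear(): the child inherits nothing from the parent.
completed = subprocess.run(
    [sys.executable, "-c", probe],
    env={},
    capture_output=True,
    text=True,
    check=True,
)
child_sees_key = completed.stdout.strip() == "True"
```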

T3: capability escalation via the back-channel

Threat: sandboxed code constructs an MCP call claiming higher capability than the caller actually holds, or calls a capability-protected tool and expects the call to succeed because “the agent is calling.”

Mitigations:

Mechanism	Coverage
`PermissionGuard` is unforgeable	`PermissionGuard::new()` is `pub(crate)` (see permission-system)
Guard resolved on Rust side	The back-channel tool handler resolves capabilities from the invocation’s guard, not from any token sandboxed code could construct
Capability subset is intersection only	Caller-specified capability subsets are intersected with the caller’s resolved set
Every tool call checks capability	Back-channel invocations traverse the same `guard.require()` path as any MCP tool call

The sandbox has no guard-construction authority. It has whatever the outer handler hands it, narrower or equal.
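
A small model makes the "resolved on the Rust side" point concrete: the dispatcher consults only the guard it was handed by the host and never reads capability claims from the request. This is a Python sketch, not the Rust implementation; `Guard`, `TOOLS`, `dispatch`, the tool names, and the request layout are all hypothetical.

```python
class CapabilityDenied(Exception):
    pass


class Guard:
    """Toy stand-in for PermissionGuard: holds a fixed set of capability names."""

    def __init__(self, capabilities):
        self._capabilities = frozenset(capabilities)

    def require(self, capability):
        if capability not in self._capabilities:
            raise CapabilityDenied(capability)


# Hypothetical tool table: tool name -> (required capability, implementation).
TOOLS = {
    "pages_read": ("PagesRead", lambda args: f"contents of {args['page']}"),
    "pages_write": ("PagesWrite", lambda args: f"wrote {args['page']}"),
}


def dispatch(guard, request):
    """Handle one back-channel request using the guard held by the host."""
    required, implementation = TOOLS[request["tool"]]
    # Only the host-held guard is consulted; request["capabilities"] is never read.
    guard.require(required)
    return implementation(request["args"])


invocation_guard = Guard({"PagesRead"})

ok = dispatch(invocation_guard, {"tool": "pages_read", "args": {"page": "intro"}})

try:
    # The request claims PagesWrite for itself; the claim has no effect.
    dispatch(
        invocation_guard,
        {"tool": "pages_write", "args": {"page": "intro"}, "capabilities": ["PagesWrite"]},
    )
    denied = False
except CapabilityDenied:
    denied = True
```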

T4: submit-boundary bypass

Threat: sandboxed code writes to workspace state directly, bypassing the submit-boundary invariant.

Mitigations:

The sandbox has no direct workspace-storage access. Every write is an MCP call; every MCP write call constructs a WorldWrite through the submit-boundary adapter; and domain rule 1 holds in the Rust domain layer itself. There is no path around it.

T5: resource exhaustion

Threat: sandboxed code consumes unbounded CPU, memory, or wall-clock time.

Mitigations:

Mechanism	Coverage
Wall-clock timeout per invocation	Host kills the Wasmtime instance after a bounded deadline (default 60s)
Wasmtime fuel metering	CPU-like budget denominated in Wasm instructions; execution traps when the budget is exhausted
Memory limit on the Wasmtime instance	Host caps linear-memory growth per instance
No persistent process	Each invocation is destroyed at completion; no cumulative drain
`kill_on_drop` on the host-side process	Host cleanup releases the instance on tool-handler drop

Timeouts produce a structured SandboxError::Timeout returned to the caller. The agent sees this as a tool failure and handles it the same way it handles any other tool error.
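
The fuel row in the table above can be pictured with a toy interpreter that charges one unit per instruction and stops when the budget reaches zero. This is only a model of the idea: real metering happens inside Wasmtime, and `run_metered`, `FuelExhausted`, and the three-opcode instruction set are hypothetical.

```python
class FuelExhausted(Exception):
    """Hypothetical error for an exhausted instruction budget."""


def run_metered(program, fuel):
    """Run a list of (op, arg) instructions on a stack, charging one unit each."""
    stack = []
    pc = 0
    while pc < len(program):
        if fuel == 0:
            raise FuelExhausted(f"budget exhausted at instruction {pc}")
        fuel -= 1
        op, arg = program[pc]
        if op == "push":
            stack.append(arg)
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "jump":
            pc = arg
            continue
        else:
            raise ValueError(f"unknown op {op!r}")
        pc += 1
    return stack[-1] if stack else None


# A terminating program: 2 + 3, three instructions.
total = run_metered([("push", 2), ("push", 3), ("add", None)], fuel=10)

# An infinite loop is cut off once the budget is spent.
try:
    run_metered([("jump", 0)], fuel=1000)
    looped_forever = True
except FuelExhausted:
    looped_forever = False
```

Unlike a wall-clock timeout, a budget like this is deterministic: the same program with the same fuel stops at the same instruction regardless of host load.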

T6: IPC injection via stdout

Threat: sandboxed code writes crafted bytes to stdout to manipulate the MCP back-channel protocol.

Mitigations:

Mechanism	Coverage
MCP back-channel is message-framed	Each message carries a correlation id and message type; unsolicited writes are ignored
Sequential protocol	One back-channel call at a time per sandbox instance
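Validator at the host-side parser	Malformed messages are rejected structurally
Captured stdout is never parsed	stdout and stderr are captured and returned to the caller as output; stdout is not used for IPC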
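
A host-side validator of the kind listed above might look like the following sketch. The wire format is an assumption: JSON frames with `type`, `id`, `tool`, and `args` fields, and a single expected correlation id tracked by the host. None of these names are confirmed by this page.

```python
import json

ALLOWED_TYPES = {"tool_call"}  # hypothetical message type name


def parse_backchannel_message(raw: bytes, expected_id: int):
    """Return the parsed message, or None if it must be rejected or ignored."""
    try:
        message = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None  # not a well-formed frame
    if not isinstance(message, dict):
        return None
    if message.get("type") not in ALLOWED_TYPES:
        return None  # unknown message type
    if message.get("id") != expected_id:
        return None  # unsolicited or out-of-sequence correlation id
    if not isinstance(message.get("tool"), str) or not isinstance(message.get("args"), dict):
        return None  # structurally invalid payload
    return message


good = parse_backchannel_message(
    b'{"type": "tool_call", "id": 1, "tool": "pages_read", "args": {}}', expected_id=1
)
garbage = parse_backchannel_message(b"\xff\xfe not json", expected_id=1)
wrong_id = parse_backchannel_message(
    b'{"type": "tool_call", "id": 7, "tool": "pages_read", "args": {}}', expected_id=1
)
```

Passing validation only means the frame is well-formed. The resulting tool call still goes through the capability check like any other.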

T7: Python interpreter / C-extension vulnerabilities

Threat: a CPython bug or a C extension in an allowed package allows a write outside the Wasmtime-managed linear memory.

Residual risk. Wasmtime is the defense of last resort: even a CPython memory-safety bug cannot reach host memory outside the instance. C extensions included in the baseline image are reviewed and pinned. Keeping the interpreter and allowed packages current is operational hygiene, not a control.

Defense-in-depth summary

Layer	Mechanism	What it prevents
Wasmtime host isolation	Host denies syscalls; scoped linear memory	Filesystem access, network I/O, arbitrary host reach
Credential locality	Credentials issued only by the Rust host and never to the sandbox; sandbox env cleared	Exfiltration of provider keys, auth tokens
Capability inheritance	`PermissionGuard` at MCP tool handler	Workspace reach exceeding caller’s resolved capabilities
Submit boundary	Domain invariant in Rust; no side-path	Writes bypassing provenance / lifecycle rules
Resource bounds	Timeout, fuel, memory cap, `kill_on_drop`	Runaway consumption of CPU / memory / time
MCP message framing	Correlation ids, sequential protocol	Back-channel protocol manipulation

Each layer is load-bearing. No single layer is the sole boundary; a bypass of any one layer is caught by at least one other.

Implementation

Key files

File	Role
`crates/infrastructure/sandbox/src/wasmtime_host.rs`	Wasmtime host configuration: syscall denial, fuel, memory caps
`crates/infrastructure/sandbox/src/mcp_backchannel.rs`	Back-channel protocol handler with capability-gated MCP dispatch
`crates/application/src/sandbox/execute_use_case.rs`	`SandboxExecuteUseCase` — guard check, instance spawn, result shape
`crates/infrastructure/mcp-server/src/tools/sandbox_execute.rs`	MCP tool registration and invocation-time capability-subset handling

Invocation shape

A sandbox MCP tool call carries:

snippet — the Python source to execute
capability_subset — optional subset of capabilities to scope this invocation (intersected with caller’s guard)
timeout_ms — optional wall-clock bound (capped at host maximum)
fuel — optional Wasm-instruction budget (capped at host maximum)
context — caller-supplied structured inputs made available to the snippet

It returns:

result — the snippet’s final expression value, serialized
stdout / stderr — captured output for the caller to surface (stdout is not used for IPC)
error — structured failure (timeout, fuel exhaustion, uncaught exception, capability denial)
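
The request and response fields listed above can be written down as plain data types, with the "capped at host maximum" rule as a small helper. This is a Python sketch of the shape only. The field names follow the lists above; the ceiling values, the choice to default `fuel` to the host maximum, and the `{"kind": ...}` error layout are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical host-side ceilings; the 60 s default comes from the T5 table.
HOST_MAX_TIMEOUT_MS = 120_000
HOST_MAX_FUEL = 10_000_000_000
DEFAULT_TIMEOUT_MS = 60_000


@dataclass
class SandboxRequest:
    snippet: str
    capability_subset: Optional[frozenset] = None
    timeout_ms: Optional[int] = None
    fuel: Optional[int] = None
    context: dict = field(default_factory=dict)


@dataclass
class SandboxResponse:
    result: Any = None
    stdout: str = ""
    stderr: str = ""
    error: Optional[dict] = None


def resolve_limits(request: SandboxRequest) -> tuple:
    """Apply defaults, then cap caller-supplied bounds at the host maximum."""
    timeout_ms = DEFAULT_TIMEOUT_MS if request.timeout_ms is None else request.timeout_ms
    fuel = HOST_MAX_FUEL if request.fuel is None else request.fuel
    return min(timeout_ms, HOST_MAX_TIMEOUT_MS), min(fuel, HOST_MAX_FUEL)


# A caller asks for ten minutes; the host caps it at its own maximum.
request = SandboxRequest(
    snippet="sum(context['values'])",
    timeout_ms=600_000,
    context={"values": [1, 2, 3]},
)
limits = resolve_limits(request)

# A failed call carries a structured error and no result.
timed_out = SandboxResponse(error={"kind": "timeout"})
```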

What the sandbox deliberately does not do

Does not persist state. Every invocation is fresh. Callers that need persistent state write it to workspace content through the submit boundary.
Does not load plugins. The baseline Python image is fixed. Capability is expressed through the host’s syscall surface, not through loaded extensions.
Does not broker LLM calls. The sidecar holds provider SDKs and credentials. Sandboxed code that needs an LLM call invokes a dedicated MCP tool from the sidecar’s LLM surface; the sandbox never speaks provider APIs directly.
Does not orchestrate agent work. No subagent dispatch, no checkpointing, no planning. Those live in the sidecar’s LangGraph runtime.

Related

Sandbox execution — what the sandbox is for, at the domain level
MCP system — how the sandbox is exposed as a tool
Permission system — capability resolution and inheritance
Submit boundary — the write-path invariant every sandbox write traverses
Process model — the Rust ↔ sidecar ↔ sandbox process topology
Prompt injection boundary — adjacent but different: security at the agent’s language-model boundary, not the sandbox boundary
ADR-016: World Runtime on LangGraph — why the agent runtime is the sidecar, not the sandbox
ADR-021: Sandbox Distinct From Runtime — the architectural role and hardening history
