Sandbox Execution

Status: Design landing
ADRs: ADR-021

What the sandbox is

Inklings embeds a Wasmtime-sandboxed CPython executor: a capability-restricted Python runtime that executes user- or agent-authored Python against workspace data without granting the executing code direct access to the host, the filesystem, or arbitrary network.

The sandbox exists to make safe execution of semi-trusted code a first-class operation in the product. Any surface that needs to run Python against workspace data — product features offering user-authored scripts, the World Agent invoking code-oriented tools, bounded computations over workspace content — can reach the sandbox without reimplementing isolation.

What the sandbox is not

The sandbox is not an agent runtime. The World Agent does not run inside Wasmtime. The World Agent runs in the Python sidecar on LangGraph, full stop.

Concretely:

There is one Python host for agent execution: the sidecar. Not two. The sandbox is not a second Python home for agent code.
The sandbox is not on the agent’s critical path. An agent turn that never invokes sandboxed code never touches the executor.
The sandbox does not orchestrate tool calls, plan, dispatch subagents, or carry agent state. Those responsibilities live in the sidecar and nowhere else.

The distinction is architectural, not terminological. Collapsing the sandbox into the runtime — or letting the runtime colonize the sandbox’s role — would produce either a single oversized Python host with no isolation between agent planning and sandboxed user code, or two competing Python hosts with ambiguous ownership of tool orchestration. Neither is acceptable.

See ADR-021 for the decision and hardening history.

How the sandbox is reached

The sandbox is exposed to the World Agent as an MCP-registered tool on the Rust side, per ADR-007 (Agent Integration via MCP and Sync).

The sidecar reaches the sandbox the same way it reaches any Rust tool: through the MCP bridge. There is no special agent-runtime path into the sandbox. There is no direct sidecar↔executor channel outside MCP. The sandbox is a tool; the agent invokes it by calling the tool.

Every Rust-implemented capability is reached through this same arrangement. It is not weakened for the sandbox’s benefit, and the sandbox does not receive agent-runtime privileges merely because the code it executes is Python.
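As a rough illustration of "the agent invokes it by calling the tool": an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request. The tool name `sandbox.run_python` and the argument keys `code` and `timeout_ms` are hypothetical placeholders, not the actual tool signature, which lives in the tool catalog.

```python
import json
from itertools import count

# Monotonic request ids for the JSON-RPC envelope.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument keys, for illustration only.
request = build_tool_call(
    "sandbox.run_python",
    {"code": "print(sum(range(10)))", "timeout_ms": 2000},
)
wire_message = json.dumps(request)
```

The point of the sketch is that nothing in the request is sandbox-specific: the same envelope would carry a call to any other tool behind the MCP bridge.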

Capability mediation

Sandbox invocations inherit the caller’s resolved capability set from the permission system. Code executing in the sandbox cannot escalate: the capabilities available inside the sandbox are a function of the caller’s capabilities, not of the sandbox’s embedding.

When the World Agent invokes the sandbox on an authored schedule, the agent’s resolved capability set is the operating envelope. When a product surface invokes the sandbox on the user’s behalf, the user’s capability set is the envelope. In both cases, the executor’s own implementation cannot grant access the caller did not have. See ADR-008.

This is the connection point between the sandbox and the rest of the world model: the sandbox is gated by the same capability surface everything else is gated by.
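One way to picture "cannot escalate" is as a set intersection: whatever a sandbox invocation asks for, it only receives what the caller already holds. The sketch below assumes capabilities can be modelled as plain strings and that an invocation carries a requested set. `Caller`, `sandbox_envelope`, and the capability names are hypothetical and are not taken from the permission system.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    identity: str                 # e.g. "agent" or "user" (illustrative)
    capabilities: frozenset[str]  # assumed to be already resolved upstream


def sandbox_envelope(caller: Caller, requested: frozenset[str]) -> frozenset[str]:
    """Capabilities visible inside the sandbox: never more than the caller holds."""
    return requested & caller.capabilities


# Hypothetical capability names, for illustration only.
agent = Caller("agent", frozenset({"pages:read", "pages:write"}))
granted = sandbox_envelope(agent, frozenset({"pages:read", "network:fetch"}))
```

Here `granted` contains only `pages:read`: the request for `network:fetch` is dropped because the caller never held it, regardless of what the executor itself could technically do.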

Sandboxed writes go through the submit boundary

Any write to the workspace initiated by sandboxed code goes through the submit boundary with appropriate provenance:

When the agent invokes the sandbox as part of its work, writes originating from the sandbox’s execution are origin: agent-produced, because the sandbox is executing on the agent’s behalf and the caller identity at the boundary is the agent.
When a product surface invokes the sandbox on the user’s behalf, writes are origin: authored if the user is directly editing through the surface; otherwise they are classified per the calling surface’s rules.

Sandboxed code does not get to claim an origin of its own. It is not a caller identity in the provenance sense; it is execution substrate, and the caller identity is whatever party invoked the sandbox.

Lifecycle and the rest of the submit-boundary machinery apply unchanged. Sandboxed code that wants to create canonical content still cannot: agent-invoked writes land as candidate regardless of whether they originated from direct agent tool calls or from sandboxed computations the agent chose to run. See ADR-018.
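The rules above can be sketched as a function of the caller identity alone. This is a minimal illustration, not the submit boundary's implementation: the function name `provenance_for`, the `caller_kind` strings, and the dictionary shape are assumptions. Only the origin and lifecycle values come from the text.

```python
def provenance_for(caller_kind: str, directly_editing: bool = False) -> dict:
    """Derive write provenance from whoever invoked the sandbox."""
    if caller_kind == "agent":
        # Agent-invoked writes land as candidate, whether they came from a
        # direct tool call or from a sandboxed computation.
        return {"origin": "agent-produced", "lifecycle": "candidate"}
    if caller_kind == "user":
        if directly_editing:
            return {"origin": "authored"}
        # Not specified on this page: the calling surface's rules decide.
        raise NotImplementedError("classified per the calling surface's rules")
    # Sandboxed code is execution substrate, not a caller identity.
    raise ValueError(f"{caller_kind!r} is not a caller identity")


agent_write = provenance_for("agent")
user_write = provenance_for("user", directly_editing=True)
```

Note that there is deliberately no branch for a "sandbox" caller: passing one raises `ValueError`, mirroring the rule that sandboxed code cannot claim an origin of its own.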

When to use the sandbox (as the agent)

The sandbox is the right tool when:

A task is better served by running a small program over workspace data than by chaining many individual tool calls. (Computing aggregates over a large set of pages; running a consistency check expressible as a short program; performing a multi-step transformation on structured data the agent has already loaded.)
User-authored Python needs to run. (Skills expressed as scripts, user-defined data transformations, any surface where the product exposes “run this code for me.”)

The sandbox is not the right tool when:

The task is expressible through existing Rust tools at comparable cost. Tool calls are cheaper and clearer to the author watching the agent work.
The task is agent-planning work. Planning lives in the sidecar, not in sandboxed Python.
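For a sense of scale, the kind of "small program over workspace data" described above might look like the following aggregate. The page records and their `title`, `tags`, and `body` fields are hypothetical; the real shape of workspace data inside the sandbox is not specified on this page.

```python
from collections import Counter

# Hypothetical page records, standing in for data the agent has already loaded.
pages = [
    {"title": "Harbor District", "tags": ["place", "city"], "body": "Ships dock at dawn."},
    {"title": "Captain Ilse", "tags": ["character"], "body": "She commands the Heron."},
    {"title": "The Heron", "tags": ["ship", "place"], "body": "A fast three-masted cutter."},
]


def summarize(pages: list) -> dict:
    """Compute page count, total word count, and per-tag counts in one pass each."""
    tag_counts = Counter(tag for page in pages for tag in page["tags"])
    total_words = sum(len(page["body"].split()) for page in pages)
    return {"pages": len(pages), "words": total_words, "tags": dict(tag_counts)}


summary = summarize(pages)
```

Producing the same summary through individual tool calls would take at least one call per page plus bookkeeping in the agent's context, which is the trade-off the first criterion describes.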

Relationship to the runtime

The sidecar and the sandbox are two independent Python environments with narrow, deliberate interaction:

| | Python sidecar | Wasmtime-sandboxed CPython |
| --- | --- | --- |
| Purpose | World Agent runtime (planning, tool orchestration, streaming, checkpointing) | Safe execution of bounded code against workspace data |
| Runs | LangGraph (D2) | User- or agent-authored Python |
| Host | OS-level Python process, sibling of Rust | Wasmtime-embedded CPython |
| Isolation | Process boundary from Rust | Wasmtime capability sandbox within Rust |
| Reached by agent | It is the agent | MCP tool call, like any other |
| Persists state across calls | Yes: the checkpointer (D23) | No: bounded execution, caller-scoped |
| Emits agent events | Yes: via the agent-event IPC (D4) | No: tool-call results only |

They coexist in the same product because they solve different problems. The sidecar is about running an agent; the sandbox is about running code safely.

Why this matters for the rebuild

The world-model rebuild (see ADR-020) is explicit about what the new runtime replaces and what it does not. The sidecar replaces the legacy Rust agent harness. The sandbox is untouched: its product purpose is unchanged, its implementation is unchanged, and its place in the architecture is unchanged.

What changes is its framing. Older framing in codebase comments and adjacent docs sometimes referred to the sandbox as an “RLM executor” with implied agent-runtime semantics. That framing is wrong under the current architecture and should be treated as historical. The sandbox is a sandbox; the agent runtime is the sidecar; they are different things with different responsibilities.

What this page does not do

Does not describe the sandbox’s internal hardening decisions. See ADR-021.
Does not describe the MCP tool bridge. See the MCP documentation.
Does not describe the capability system’s resolution rules. See permission-system.
Does not describe specific sandbox tool signatures available to the agent. See the tool catalog.
