Agent Core System
Status: Design landing
Reference epics: INK-825, INK-833
ADRs: ADR-016, ADR-022
The agent core is the runtime the World Agent runs on. This page describes what that runtime is made of, where it lives in the process graph, and how turns, subagents, and interrupts compose.
One runtime, one shape
There is one agent runtime. It is a LangGraph application hosted inside the Python sidecar process. A turn is a run of that graph against a checkpointed thread. Subagents are subgraphs dispatched from the main graph. Interrupts are first-class: a node can pause the graph and return control to the caller, who later resumes with a value.
No other agent runtime exists. The sidecar is not a thin shim around a Rust harness. Rust does not run agent loops. The pattern is uniform across conversational runs, scheduled runs, and subagent dispatch: the same graph, the same checkpointer, the same tool surface.
This is the full consequence of ADR-016: the world runtime is LangGraph-in-sidecar, directly, with no intermediating harness.
Where it lives in the process graph
The agent core is code inside the Python sidecar. The sidecar is one of the processes described in the Process Model; the agent runtime is the main thing it hosts. Everything else in the sidecar (tool clients, memory clients, LLM SDKs) exists to serve the graph.
From outside the sidecar:
- The Rust host speaks to the agent through the sidecar’s IPC surface: start a turn on a thread, resume an interrupted turn, cancel a turn, observe streaming output.
- Tools reach the sidecar through the MCP server (see MCP System). From the graph’s perspective, every Rust capability — workspace reads and writes, sandbox execution, permission checks, memory reads — is an MCP tool call.
The graph does not know about Rust directly. The Rust host does not know about LangGraph directly. The boundary is IPC on one side and MCP on the other.
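The host-side operations listed above (start, resume, cancel) can be pictured as a small set of request messages. The sketch below is illustrative only: the class names, field names, and JSON-line encoding are assumptions made for this example, not the sidecar's actual IPC schema, which belongs to the Process Model. Streaming observation is left out for brevity.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


# Hypothetical request shapes; the real IPC schema is not defined on this page.
@dataclass(frozen=True)
class StartTurn:
    thread_id: str
    user_input: str


@dataclass(frozen=True)
class ResumeTurn:
    thread_id: str
    resume_value: Any  # the author's answer to a pending interrupt


@dataclass(frozen=True)
class CancelTurn:
    thread_id: str


REQUEST_TYPES = {cls.__name__: cls for cls in (StartTurn, ResumeTurn, CancelTurn)}


def encode(request) -> str:
    """Serialize a request as one JSON line tagged with its operation name."""
    return json.dumps({"op": type(request).__name__, **asdict(request)})


def decode(line: str):
    """Rebuild the request object from its JSON line."""
    payload = json.loads(line)
    return REQUEST_TYPES[payload.pop("op")](**payload)
```

Every request carries the thread_id, which mirrors the text: the host addresses a thread, never a graph node.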
What a run is
A run is one execution of the main graph against a thread_id. The thread is the persistence key: all memory for the run, all checkpoints of graph state, all tool-call history, all in-flight interrupts live under that thread.
A turn is scoped to a conversation (Conversation System) or to a scheduled task (Scheduling System). In both cases the thread_id identifies the conversation state the run operates on. The graph’s state is checkpointed to inklings.db through the LangGraph SQLite checkpointer — the same database that holds conversations and workspace content. There is no separate agents.db.
A run proceeds until one of four things happens:
- The graph reaches an END node. The final assistant turn is emitted; memory is flushed; the thread is left in a resumable state for the next turn.
- A node raises an interrupt. The graph’s state is checkpointed; control returns to the caller with the interrupt payload. The next call on the thread resumes via Command(resume=…).
- The caller cancels. The current node is interrupted at the next await; state is checkpointed where it is; the thread remains resumable.
- A node errors. The error is recorded on the thread; the turn ends; the thread remains resumable from the last successful checkpoint.
Cancel, error, and natural completion all leave the thread in a consistent state. Resuming from any of them is the same mechanism.
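The four ways a run can stop are easier to compare in a toy runner. This is a stdlib-only model, not LangGraph code: the thread is a plain dict, nodes are plain functions, and the names run, Outcome, and AuthorDecisionNeeded are invented for the sketch. It shows one property from the text: state is committed only after a node succeeds, so calling run again on the same thread continues from the last good checkpoint whatever the previous outcome was.

```python
import copy
from enum import Enum


class Outcome(Enum):
    COMPLETED = "completed"      # the last node finished (the END case)
    INTERRUPTED = "interrupted"  # a node needs an author decision
    CANCELLED = "cancelled"      # the caller cancelled the turn
    ERRORED = "errored"          # a node raised an error


class AuthorDecisionNeeded(Exception):
    """Raised by a node to pause the run; carries the question payload."""

    def __init__(self, payload):
        super().__init__(payload)
        self.payload = payload


def run(thread, nodes, is_cancelled=lambda: False):
    """Run nodes from the thread's saved cursor and report how the run stopped."""
    while thread["cursor"] < len(nodes):
        if is_cancelled():
            return Outcome.CANCELLED
        draft = copy.deepcopy(thread["state"])  # the node works on a copy
        try:
            nodes[thread["cursor"]](draft)
        except AuthorDecisionNeeded as pause:
            thread["pending_interrupt"] = pause.payload
            return Outcome.INTERRUPTED
        except Exception as exc:
            thread["last_error"] = repr(exc)
            return Outcome.ERRORED
        # Commit only after success: this assignment plays the role of a checkpoint.
        thread["state"] = draft
        thread["cursor"] += 1
    return Outcome.COMPLETED
```

A node that fails halfway leaves no trace in thread["state"], which is the behavior the error case above describes. Passing a resume value back into an interrupted node is out of scope for this sketch.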
Graph shape
The main graph has a small, stable shape. The interesting variety is inside nodes and inside subgraphs, not in the top-level topology.
- Entry. Accepts the turn input (user message, scheduled-task payload, resume value) and the thread’s current state.
- Context node. Loads the relevant slice of the four-tier memory for this turn. This is not a pipeline: it is a single node that reads conversation state, channel context, workspace context, and account context through MCP memory tools, and places them into the graph’s state channels.
- Planner node. The LLM node that decides what to do next — answer, call a tool, dispatch a subagent, or ask the author. Uses the LLM System for provider access.
- Tool node. Executes a tool call selected by the planner. All tools are MCP tools; all calls go through the MCP client. Tools that cross the Submit Boundary route through the MCP registration metadata described in MCP System.
- Dispatch node. Sends work to a subagent subgraph via LangGraph Send. Waits for the subagent’s terminal state.
- Interrupt node. Raises a graph interrupt when an author decision is required: a destructive action, a capability grant, or a candidate promotion.
- Persist node. Writes assistant output into the conversation, writes any memory updates, and closes the turn.
The planner loops back to itself through the tool and dispatch nodes until it decides the turn is done. There is no hand-coded state machine describing which tool follows which: the graph’s edges encode “planner chose tool X → run tool node → back to planner.”
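The planner loop can be sketched without any graph library. In the toy below the planner is a deterministic stand-in for the LLM node, and the node names and state keys (pending_tools, pending_subagents, log) are assumptions chosen for the example. The point is the routing shape: every working node hands control back to the planner, and only the persist step ends the turn.

```python
from typing import Callable, Dict, List

State = Dict[str, List[str]]


def planner(state: State) -> str:
    """Stand-in for the LLM node: choose the next step from what is still pending."""
    if state["pending_tools"]:
        return "tool"
    if state["pending_subagents"]:
        return "dispatch"
    return "persist"


def tool_node(state: State) -> None:
    state["log"].append("tool:" + state["pending_tools"].pop(0))


def dispatch_node(state: State) -> None:
    state["log"].append("dispatch:" + state["pending_subagents"].pop(0))


# Edge table: the planner's choice selects a node, and that node returns to the planner.
NODES: Dict[str, Callable[[State], None]] = {
    "tool": tool_node,
    "dispatch": dispatch_node,
}


def run_turn(state: State) -> List[str]:
    """Loop planner -> node -> planner until the planner decides the turn is done."""
    while True:
        choice = planner(state)
        if choice == "persist":
            state["log"].append("persist")
            return state["log"]
        NODES[choice](state)
```

Nothing in the edge table says which tool follows which; the ordering comes entirely from the planner's successive choices.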
No process types
There is no Orchestrator, no Worker, no Researcher, no Librarian, no Archivist, no Trainer, no TeamLead. The old framing where the agent was a dispatcher that spawned typed child processes is gone.
The replacement is: one graph. Specialized behavior lives in:
- Subagent subgraphs. Named graphs invoked via Send for scoped work such as a research sweep, a composition pass, or a consolidation pass. The Skill System describes how the Skill Composer is built as a subagent subgraph.
- Tools. Specialized capabilities (sandbox execution, bulk reads, structured extraction) exposed as MCP tools.
A subagent subgraph gets its own sub-thread for checkpointing, but the work is still just a graph run in the sidecar. It is not a separate process, not a separate runtime, not a separate agent class.
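As a rough model of that idea, the sketch below treats each named subgraph as an ordinary callable run in the same process and stores its terminal state under a key derived from the parent thread. The registry contents, the sub_thread_id naming scheme, and the dict used as a checkpoint store are all assumptions for illustration; the source does not specify how sub-thread ids are formed.

```python
from typing import Callable, Dict

# Hypothetical registry of named subagent subgraphs; each is a plain callable here.
SUBGRAPHS: Dict[str, Callable[[dict], dict]] = {
    "research_sweep": lambda task: {"findings": ["note on " + task["topic"]]},
    "consolidation_pass": lambda task: {"merged": len(task["items"])},
}


def sub_thread_id(parent_thread_id: str, name: str, index: int) -> str:
    """Derive a checkpoint key for the subagent (this naming scheme is an assumption)."""
    return f"{parent_thread_id}/{name}/{index}"


def dispatch(parent_thread_id: str, name: str, task: dict,
             checkpoints: Dict[str, dict], index: int = 0) -> dict:
    """Run a named subgraph in-process and record its terminal state on its sub-thread."""
    key = sub_thread_id(parent_thread_id, name, index)
    terminal_state = SUBGRAPHS[name](task)
    checkpoints[key] = terminal_state
    return terminal_state
```

The dispatch call is a function call, not a process spawn, which is the distinction the paragraph above draws.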
No Rust agent harness
The agent runtime was previously described as a Rust harness hosting a Python optimizer. That framing is gone.
- There is no infrastructure-agent-core crate. There is no infrastructure-agent-harness crate. Rust does not host an agent loop.
- There is no agent_loop.rs, no ToolSource enum in Rust adjudicating what the agent may call. Rust exposes capabilities as MCP tools; the MCP client in the sidecar is what the agent sees.
- There is no PyInstaller-packaged optimizer as a module the Rust harness drives. The sidecar is the Python process, the sidecar runs the graph, the sidecar is the agent host.
The only shim that remains between Rust and the agent is the MCP bridge for Rust-backed tools — and that shim lives inside the sidecar. See MCP System.
Checkpointing and persistence
Graph state is checkpointed to inklings.db via the LangGraph SQLite checkpointer. Checkpoints are keyed by thread_id and include:
- Graph state channels (context, scratchpad, tool results, planner decisions).
- The in-flight node and edge cursor.
- Any pending interrupt payload.
Conversations, scheduled tasks, and subagent sub-threads all checkpoint into the same database. This is what makes cancel, resume, and crash recovery the same code path: the thread is always persisted; a run is always resumable.
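A minimal sketch of thread-keyed checkpointing, using only the standard sqlite3 module, may help make this concrete. The table layout and the function names are assumptions for the example and are not the schema the LangGraph SQLite checkpointer actually writes; the record simply carries the three items listed above (state channels, cursor, pending interrupt).

```python
import json
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the illustrative checkpoints table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        " thread_id TEXT NOT NULL,"
        " seq INTEGER NOT NULL,"
        " body TEXT NOT NULL,"
        " PRIMARY KEY (thread_id, seq))"
    )
    return conn


def save(conn: sqlite3.Connection, thread_id: str, channels: dict,
         cursor: str, pending_interrupt: Optional[dict] = None) -> int:
    """Append a checkpoint for the thread and return its sequence number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(seq), 0) + 1 FROM checkpoints WHERE thread_id = ?",
        (thread_id,),
    ).fetchone()
    body = json.dumps(
        {"channels": channels, "cursor": cursor, "pending_interrupt": pending_interrupt}
    )
    conn.execute("INSERT INTO checkpoints VALUES (?, ?, ?)", (thread_id, row[0], body))
    conn.commit()
    return row[0]


def load_latest(conn: sqlite3.Connection, thread_id: str) -> Optional[dict]:
    """Return the newest checkpoint for the thread, or None if it has none."""
    row = conn.execute(
        "SELECT body FROM checkpoints WHERE thread_id = ? ORDER BY seq DESC LIMIT 1",
        (thread_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Resume after a cancel and recovery after a crash both reduce to load_latest on the same thread_id, which is why they can share one code path.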
The Agent Memory System describes the four-tier memory surface on top of the checkpointer. The checkpointer itself is the low-level substrate — it records the shape of the run; memory is how the run sees the world.
Interrupts
Interrupts are how the graph asks an author for a decision mid-run. A tool or a planner node raises a typed interrupt with a payload describing the question. The run checkpoints; the IPC response tells the Rust host “interrupted with payload X.” The UI presents the decision. The author’s choice comes back as a Command(resume=…) on the thread.
The same mechanism handles:
- Permission escalations (see Submit Boundary for the domain-invariant refusal; the permission prompt itself is an interrupt).
- Author-gated destructive actions (bulk deletes, re-validations at scale).
- Proposed-canonical-promotion decisions.
- Any point where the graph refuses to proceed without author input.
Interrupts are not a separate runtime feature added on top; they are the same graph mechanism as any other node returning control.
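The pause-and-resume contract can be imitated with a Python generator: the node yields a payload, stays suspended, and continues when the caller sends a value back. This is only an analogy for the control flow. In the runtime described here the paused state lives in the checkpoint rather than in a live Python object, and the names delete_node, start, and resume are invented for the sketch.

```python
from typing import Any, Generator, List


def delete_node(paths: List[str]) -> Generator[dict, Any, str]:
    """A node that pauses for an author decision before a destructive action."""
    # Yielding hands the payload to the caller; the node stays suspended here.
    approved = yield {"kind": "confirm_delete", "count": len(paths)}
    return f"deleted {len(paths)} files" if approved else "kept files"


def start(node: Generator[dict, Any, str]) -> dict:
    """Run the node up to its interrupt and return the interrupt payload."""
    return next(node)


def resume(node: Generator[dict, Any, str], value: Any) -> str:
    """Feed the author's answer back in and run the node to completion."""
    try:
        node.send(value)
    except StopIteration as done:
        return done.value
    raise RuntimeError("node raised a second interrupt")
```

The caller never inspects the node's internals: it receives a payload, collects a decision, and returns a value, which is the same division of labor as the IPC exchange described above.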
Scratchpad as state channel
Working context during a run lives in a graph state channel called the “scratchpad.” Updates to it are merged by a LangGraph state reducer; it is not a separate data structure, not an in-memory buffer owned by the old Orchestrator, and not an AGENTS.md file in a virtual filesystem.
The scratchpad is ephemeral: it lives in the checkpoint for the duration of the turn and subsequent resumes, and is discarded when the turn completes. Anything the agent needs to carry across turns is promoted to the four-tier memory through memory tools, not by keeping the scratchpad alive.
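A reducer is a function that merges a node's update into a channel's current value instead of overwriting it. The stdlib-only sketch below models that merge and the end-of-turn discard; the function names and the REDUCERS table are assumptions for the example. In LangGraph itself a reducer is attached to a state key through typing.Annotated, for example Annotated[list, operator.add].

```python
from typing import Callable, Dict, List


def append_notes(current: List[str], update: List[str]) -> List[str]:
    """Reducer: merge a node's update into the channel instead of overwriting it."""
    return current + update


# Channel name -> reducer. Channels without a reducer are simply overwritten.
REDUCERS: Dict[str, Callable[[List[str], List[str]], List[str]]] = {
    "scratchpad": append_notes,
}


def apply_update(state: dict, update: dict) -> dict:
    """Return a new state with each updated channel merged through its reducer."""
    merged = dict(state)
    for channel, value in update.items():
        reducer = REDUCERS.get(channel)
        merged[channel] = reducer(state.get(channel, []), value) if reducer else value
    return merged


def close_turn(state: dict) -> dict:
    """Drop the scratchpad when the turn completes; other channels survive."""
    return {name: value for name, value in state.items() if name != "scratchpad"}
```

Anything that should outlive close_turn has to be written elsewhere before the turn ends, which in this design means the memory tools.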
What the runtime delegates
The core is thin. It delegates:
- Tool execution to MCP System. Rust capabilities, the sandboxed CPython executor, memory reads and writes — all through the same MCP surface.
- LLM calls to LLM System. The LLM programming layer is DSPy; sidecar nodes reach models through DSPy, not provider SDKs directly.
- Memory to Agent Memory System. Four tiers, accessed through MCP memory tools. The context node reads; planner decisions write.
- Scheduled runs to Scheduling System. The Rust task-runner fires triggers; each fire is just a run on a thread.
- Prompt-injection handling to Prompt-Injection Boundary. Classification of content-as-instructions is a sidecar-side concern between tool output and planner input.
- Skills to Skill System. Named, reusable prompt-and-tool-orchestrations, composed at runtime by the Skill Composer subagent.
The core provides the shape — graph, thread, checkpointer, subgraph dispatch, interrupts — and nothing else.
Why one runtime
The project had two competing framings: a Rust harness that drove a Python optimizer, and a Python runtime that directly hosted the agent. Keeping both would mean keeping two kinds of agent identity, two kinds of checkpoint, two ways to dispatch subagents, and two places where tool capabilities are enforced. Every new capability would have to decide which side it lives on, and every cross-cutting concern (memory, scheduling, interrupts) would need a version on each side.
Collapsing to one runtime is what makes the rest of the agent surface describable as a single system. There is one place a turn happens, one thread identity, one checkpointer, one dispatch mechanism, one way tools are reached.
Worked example
See Worked Example for how a full scenario traverses the runtime (context load, planner decision, tool call, submit-boundary write, and deviation emission), all as node transitions in the same graph.
What this page does not do
- It does not describe the Python sidecar process boundary or its IPC surface. See Process Model.
- It does not describe the task() work-dispatch primitive. See Task Primitive.
- It does not describe persistent-sandbox investigation runs and their subgraph shape. See Investigation Pattern.
- It does not describe the tool protocol or the sandboxed CPython executor. See MCP System.
- It does not describe memory tiers or consolidation. See Agent Memory System.
- It does not describe the Skill Composer subagent or skill authoring. See Skill System.
- It does not describe the World Agent’s voice, author-authority behavior, or control-plane semantics. See World Agent.