Agent Session
Status: Accepted Reference epics: INK-825, INK-830 ADRs: ADR-016, ADR-017
How a single World Agent session flows from a user prompt, through the Python sidecar’s LangGraph runtime, across MCP to Rust tools, through the submit boundary for any writes, into the checkpointer and the four-tier memory. The conceptual model for the agent itself lives in systems/world/world-agent; this page describes the runtime flow.
Overview
Section titled “Overview”An agent run has three stages: Prompt (the invocation reaches the sidecar), Graph (LangGraph-driven planning, subagent dispatch, and tool execution), and Persist (checkpoints and memory are written; no separate “orient” phase and no deterministic context-preparation step — retrieval is a tool the agent calls when it wants context).
Three framings the reader should carry through the rest of the page:
- There is no deterministic orient phase. The prior architecture ran a “Context Pipeline” (select then refine) that assembled context deterministically before the model was invoked. That phase is gone. Retrieval is a tool; when the agent wants context it calls a search tool, a page-read tool, or a memory tool, and the result becomes a tool-result message in the LangGraph state.
- The graph is the loop. There is no Orchestrator/Worker/Researcher process model. Planning, subagent dispatch, and tool calling are all nodes or edges in a LangGraph
StateGraph. Subagent dispatch uses LangGraph’sSend/ subgraph patterns; interrupts use LangGraph’sinterruptandCommand(resume=...). See ADR-016. - Writes always cross the submit boundary. Any tool call that modifies workspace content constructs a
WorldWritein the Rust domain and funnels through the submit-boundary validator before storage. The sidecar cannot bypass this — Python tools that need to write call a Rust tool over MCP.
Stage 1: Prompt
Section titled “Stage 1: Prompt”A user prompt reaches the sidecar through a chain of three boundaries:
| Boundary | Purpose | Source |
|---|---|---|
| Tauri IPC (frontend → Tauri command) | User input enters the Rust process | invoke("run_agent", { conversation_id, prompt }) |
| Agent-event IPC (Rust → sidecar) | Run is started on the Python side | start_run { thread_id, prompt } per ADR-016 |
| LangGraph invocation | The sidecar hands control to the graph | graph.astream({ messages: [HumanMessage(prompt)] }, { configurable: { thread_id } }) |
The thread_id is the continuity key: it scopes the LangGraph checkpoint, it names the conversation record in inklings.db, and it survives sidecar restarts. The Rust side preserves thread_id across the agent-event IPC boundary so that if the sidecar is restarted mid-run, resuming with the same thread_id picks up from the last checkpoint.
The agent-event IPC surface is narrow per ADR-016: it carries streamed graph events, interrupt and resume signals, workspace-event notifications, and lifecycle commands. Tool calls do not travel on this channel — they go over MCP.
Stage 2: Graph
Section titled “Stage 2: Graph”The LangGraph runtime drives a StateGraph whose state channels are typed per graph and checkpointed by LangGraph. The shape of a run is determined by the graph’s nodes and edges, not by a fixed process model.
The loop
Section titled “The loop”A typical planner-and-tools graph alternates between two kinds of step:
- A planner node produces either a final message or one or more tool-call requests. The planner is ordinary Python — a node function that takes state, makes a model call via a provider SDK, and returns a state update. No LangChain provider integrations are required; models are called directly by their native SDKs inside node bodies.
- A tool-execution step resolves tool calls. The runtime routes each tool call through the MCP bridge to the Rust tool host (see Stage 3 below), then writes the tool result back into the graph state.
After every tool execution (and at other designated points), LangGraph writes a checkpoint to langgraph_checkpoints / langgraph_writes in inklings.db (see database-schema). This is what makes a run resumable: a restarted sidecar reads the latest checkpoint for thread_id and continues.
Subagent dispatch
Section titled “Subagent dispatch”When the planner determines that part of the task is best handled by a specialized graph — for example, a skill-composition subgraph or an import-analysis subgraph — it dispatches via LangGraph’s Send primitive. The subgraph runs inside the same thread_id, producing its own checkpoint entries under a nested checkpoint_ns. There is no separate “subagent process.” Subgraphs are graphs; the runtime composes them without forking a process or spinning up a second sidecar.
Interrupts
Section titled “Interrupts”Human-in-the-loop steps use LangGraph’s interrupt primitive. The planner (or any node) raises __interrupt__ with a payload describing what it needs from the user. The agent-event IPC relays the interrupt to the frontend; the user’s response comes back as Command(resume=...) on the same IPC channel; LangGraph resumes execution from the last checkpoint. Because interrupt state is checkpointed, the sidecar can be restarted between the interrupt and the resume without losing context.
Retrieval and context
Section titled “Retrieval and context”The sidecar does not precompute context. When the agent wants context it calls a tool:
- Search tools (FTS5 over pages, dense-vector search over embeddings) return matching pages or blocks.
- Page/block read tools return the content of a specific page or block.
- Memory tools read from the four-tier
agent_memoryat the appropriate scope (account / workspace / channel / conversation).
Each of these is an MCP-bridged Rust tool. The planner decides what to fetch and when; the tool results become tool-result messages in the graph state; the context window is curated by the planner’s own reasoning, not by a deterministic infrastructure component.
Stage 3: Tool execution and the submit boundary
Section titled “Stage 3: Tool execution and the submit boundary”Tool execution has two paths depending on whether the tool writes to workspace content.
Read-only tools
Section titled “Read-only tools”Examples: search_pages, read_page, list_tags, get_memory. The MCP bridge dispatches to the Rust tool, which runs against the workspace (or against agent_memory for memory tools), and returns the result. No submit boundary involvement.
Boundary-crossing tools
Section titled “Boundary-crossing tools”Examples: create_page, update_block, add_derivation_link, resolve_deviation. These construct a WorldWrite in the Rust domain and route through the submit boundary. The sequence:
- The MCP bridge dispatches the tool call to the Rust tool.
- The Rust tool builds a
WorldWritevalue from the call arguments, populatingoriginfrom caller identity (a tool call from the sidecar carriesorigin: AgentProducedper domain rule 4),lifecyclefrom call context (agent writes default toCandidateper submit-boundary §agent-writes-default-to-candidate),origin_source_idfrom the tool identity, and derivation sources from the tool’s declared registration metadata. - The submit boundary validates the
WorldWriteagainst the domain invariants: origin consistent with caller, lifecycle a valid transition, derivation links internal to the workspace. Malformed writes are refused. - If the submission would conflict with existing canonical content, a
DeviationRecordis produced in the same transaction per domain rule 5. Capability denials are not deviations per domain rule 7. - Post-write side effects (embedding queue, event log, sync queue, re-validation flags for content derived from the affected source) fire through the
WriteEffectCoordinator. - The tool returns a
ToolResultover MCP that carries the applied write’s identity and any generated deviation record ids.
Python-native tools that need to write do so by calling an appropriate Rust tool over MCP. There is no Python path to workspace storage that bypasses the Rust domain. This is how ADR-017 holds across the language boundary.
Stage 4: Persist
Section titled “Stage 4: Persist”Two kinds of persistence happen during and after a run:
Checkpointer
Section titled “Checkpointer”The checkpointer captures graph-execution state — messages, state channels, interrupt state — keyed by thread_id. Writes happen after each tool execution and at any node boundary the graph author designates. The checkpointer is backed by inklings.db (not a sidecar-local file) so that agent state syncs with the workspace across devices per ADR-016.
Checkpointer writes are not submit-boundary crossings. They record the runtime’s internal state; they do not modify workspace content. “Agent working state is not workspace state” is preserved at the schema level — checkpoint tables and content tables share the database file but not the namespace.
Memory
Section titled “Memory”Memory is written by the agent calling a memory tool — it is a tool call, not a background phase. Typical patterns:
- End-of-turn observations stored at the conversation tier for short-term recall.
- Channel-scoped notes stored at the channel tier for continuity across conversations within a channel.
- Workspace-scoped facts stored at the workspace tier when the agent notices a durable pattern.
- Account-scoped preferences stored at the account tier (read/written via the account-level
agents.db).
The four-tier memory is the only memory system per ADR-016. There is no virtual filesystem, no AGENTS.md-pattern durable-context file, and no parallel long-term store. Consolidation and summarization between tiers — if wanted — are scheduled World Agent tasks, not runtime hooks.
Scheduled runs
Section titled “Scheduled runs”A scheduled World Agent task is the same graph driven from a different entry point. The task-runner fires the sidecar against a new thread_id (or an existing one, for resumable scheduled work) with a prompt that describes the task. The sidecar invokes the graph exactly as for a user-initiated run; checkpoints, tool calls, and submit-boundary crossings work identically. The only runtime difference is the origin of the prompt and the caller identity carried into WorldWrite construction — scheduled-task writes still receive origin: AgentProduced (agent writes are always agent-produced per domain rule 4) but are distinguishable in the event log by device_id and event-source metadata.
Workspace-event notifications (from Rust over the agent-event IPC) let scheduled tasks react to user activity without polling. See systems/agent/scheduling-system.
Key properties
Section titled “Key properties”| Property | Value |
|---|---|
| Runtime host | Python sidecar, single process per workspace |
| Execution substrate | LangGraph StateGraph |
| Continuity key | thread_id (survives sidecar restarts via checkpoint resume) |
| Tool transport | MCP (Rust tool host) |
| IPC between sidecar and Rust | Agent events, interrupts, workspace notifications, lifecycle; not tool calls |
| Write discipline | Every write constructs a WorldWrite and crosses the submit boundary |
| Context preparation | None deterministic; retrieval is a tool the agent calls |
| Process model | No Orchestrator/Worker/Researcher; graph nodes and subgraphs instead |
| Memory | Four-tier agent_memory via tool calls; no VFS, no AGENTS.md, no parallel store |
| Checkpointer | langgraph_checkpoints / langgraph_writes in inklings.db |
Error handling
Section titled “Error handling”| Failure | Behavior |
|---|---|
| Model call fails inside a node | Node raises; LangGraph records the failure; the run surfaces the error via the stream and can be retried from the last checkpoint |
| MCP tool call fails | ToolResult carries the error; planner decides whether to retry, re-plan, or surface it |
Submit-boundary validation refuses a WorldWrite | Tool returns an error result; planner treats it as a domain error, not a runtime crash |
Submit-boundary produces a DeviationRecord | Tool result carries the deviation id; the write may or may not have applied depending on type (see deviation-records) |
| Capability denied | Tool returns a structured capability error; no deviation record is produced per domain rule 7 |
| Sidecar crashes mid-run | On restart, the run is resumed from the last checkpoint for its thread_id; any in-flight node re-executes from its last committed state |
| Interrupt times out | Graph remains in the interrupted state; the user can resume whenever — Command(resume=...) is the only way out |
Related
Section titled “Related”- systems/world/world-agent — the conceptual model of the agent this session drives
- systems/world/submit-boundary — the domain invariant every write obeys
- systems/world/deviation-records — what happens when a submission conflicts with canonical content
- systems/agent/agent-memory-system — four-tier memory and the tools that read and write it
- systems/agent/mcp-system — how Rust tools reach the sidecar
- systems/agent/process-model — sidecar as the single Python host; sandbox as a distinct capability
- systems/agent/scheduling-system — how scheduled runs enter this same flow
- architecture/data-flow/write-path — what happens in Rust after a boundary-crossing tool call
- architecture/database-schema — checkpointer tables,
agent_memory, channels, conversations - architecture/domain-rules — invariants every tool-driven write satisfies
Was this page helpful?
Thanks for your feedback!