Agent Memory System

Status: Design landing
Reference epics: INK-834, INK-842, INK-845, INK-848
ADRs: ADR-016, ADR-018

The World Agent has one memory surface, organized into four tiers. Every tier is reached through MCP tools (see MCP System). There is no virtual filesystem, no ambient AGENTS.md deposited into the run’s working directory, no implicit memory that lives outside the tiers. This page describes the tier model, named authorial memory entries, memory metadata, consolidation, the operator scrub surface, and a design audit through three external lenses.

Memory is partitioned by scope. A memory item lives in exactly one tier; the tier determines who can read it and how long it lives.

  • Conversation. The narrowest tier. Scoped to a single thread_id. Holds the World Agent’s working context for the ongoing exchange: user intents as understood, in-progress commitments, what the agent has already examined, open questions, decisions made during the turn. Cleared (or summarized into a higher tier) when the conversation ends.
  • Channel. A small persistent context shared across conversations that belong together: a project thread, a research thread, a recurring scheduled task. The channel tier is what lets a scheduled task ask “where did we leave off last week” without rehydrating the full conversation history. Scoped to a channel identifier; a conversation belongs to at most one channel.
  • Workspace. Context that is true about the workspace itself — the worldbuilder’s recurring preferences for this workspace, conventions they have asked the World Agent to respect, names of people and projects, style patterns to preserve, the workspace-bound named authorial entries (VOICE and companions). Scoped to a workspace. Shared across all conversations and channels in that workspace.
  • Account. The broadest tier. Context that is true about the worldbuilder as a person, across all their workspaces — cross-workspace persona, disposition, how they like to be addressed, what tone they prefer, what they have explicitly asked the agent to remember personally. Scoped to the account (which may own several workspaces). Carries the account-bound named authorial entries (SOUL and companions).

Tiers do not inherit automatically. A read against one tier returns only items in that tier. The context node in the agent graph (see Agent Core System) reads from all four when composing a turn’s context, and each tier is read explicitly.

Every memory entry carries its own metadata. This metadata is distinct from the world-content provenance axes (origin, lifecycle, weight, standing), which ADR-018 defines for world content and which are not memory-entry fields.

A memory entry’s metadata:

  • Curator — who produced the entry: agent (written during consolidation or by a planner decision), author (written by the worldbuilder directly through the in-app editor or an MCP client), or import (seeded during workspace bootstrap or a migration).
  • Tier — Conversation, Channel, Workspace, or Account. Fixed at write time.
  • Importance — a scalar the planner sets when writing; the retrieval ranker uses it to weight the entry within its tier.
  • Decay / relevance — a time-weighted signal that depresses importance as an entry ages without being read or re-confirmed. Decay rates are tier-specific: conversation entries decay on session close; workspace entries decay slowly; account entries decay very slowly. Decay is a ranking lever, not a hard deletion trigger.
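
The metadata fields and the "decay is a ranking lever" rule can be sketched as follows. This is a minimal illustration, not the system's schema: the class and function names are hypothetical, the exponential half-life form is an assumption (the text only says decay is time-weighted and tier-specific), and the half-life values are invented for the example. Conversation entries are left out because they decay on session close rather than over time.

```python
from dataclasses import dataclass
from enum import Enum


class Curator(Enum):
    AGENT = "agent"
    AUTHOR = "author"
    IMPORT = "import"


# Assumed half-lives in days; the real rates are not specified in this page.
HALF_LIFE_DAYS = {"channel": 30.0, "workspace": 180.0, "account": 720.0}


@dataclass
class MemoryEntry:
    curator: Curator
    tier: str                # fixed at write time
    importance: float        # set by the planner when writing
    days_since_touch: float  # days since last read or re-confirmation


def effective_importance(entry: MemoryEntry) -> float:
    """Importance depressed by age. Used for ranking only, never for deletion."""
    half_life = HALF_LIFE_DAYS[entry.tier]
    return entry.importance * 0.5 ** (entry.days_since_touch / half_life)


fresh = MemoryEntry(Curator.AGENT, "workspace", 0.8, days_since_touch=0.0)
aged = MemoryEntry(Curator.AGENT, "workspace", 0.8, days_since_touch=180.0)
```

Under these assumed numbers, the aged workspace entry ranks at half the weight of the fresh one, and both remain stored.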

When a memory entry derives from world content — a page, a block, a derivation edge — retrieval surfaces that content’s origin, lifecycle, weight, and standing through a derivation link. The entry holds a reference to the world content; it does not copy the provenance axes onto itself. The provenance travels with the content, not with the memory record.

This is the critical separation: memory is agent-scratch, not workspace content. It carries no zone membership. It does not appear in search alongside pages. Promoting something from memory into the world is an explicit candidate write through the Submit Boundary.

All four tiers live in inklings.db. There is no per-tier database, no filesystem backing, no on-disk JSON bag. A tier is a table family with a scope key (thread, channel, workspace, or account). The LangGraph checkpointer also uses inklings.db, schema-separated from workspace-content tables — agent state travels with the workspace across every sync, export, and migration path, as specified in ADR-016.
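
To make "a table family with a scope key" concrete, here is a rough sketch of one tier's table. The table name, column names, and the `forgotten` flag are assumptions for illustration. In the system described here the Rust handlers in the Tauri host execute the SQL; Python's built-in `sqlite3` module is used below only to keep the example runnable.

```python
import sqlite3

# In-memory database standing in for inklings.db.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE memory_workspace (      -- hypothetical table name
        id           INTEGER PRIMARY KEY,
        workspace_id TEXT NOT NULL,      -- the tier's scope key
        curator      TEXT NOT NULL
                     CHECK (curator IN ('agent', 'author', 'import')),
        importance   REAL NOT NULL,
        body         TEXT NOT NULL,
        forgotten    INTEGER NOT NULL DEFAULT 0
    );
    CREATE INDEX idx_memory_workspace_scope
        ON memory_workspace (workspace_id);
    """
)
conn.execute(
    "INSERT INTO memory_workspace (workspace_id, curator, importance, body) "
    "VALUES (?, ?, ?, ?)",
    ("ws-1", "agent", 0.7, "prefers present tense"),
)
# Every read filters on the scope key, so one workspace never sees another's rows.
rows = conn.execute(
    "SELECT body FROM memory_workspace WHERE workspace_id = ? AND forgotten = 0",
    ("ws-1",),
).fetchall()
other = conn.execute(
    "SELECT body FROM memory_workspace WHERE workspace_id = ? AND forgotten = 0",
    ("ws-2",),
).fetchall()
```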

The domain owner for the memory tables is the Tauri host. The sidecar reads and writes through MCP memory tools; Rust handlers execute the SQL. This is the same constraint as every other workspace-adjacent table: the host owns the database.

Memory is exposed to the World Agent exclusively as MCP tools. A small family:

  • memory.read(tier, scope, query) — retrieves items relevant to query within a tier’s scope.
  • memory.put(tier, scope, item) — appends an item to a tier with its curator and importance.
  • memory.update(tier, scope, id, patch) — revises an existing item (used primarily by consolidation).
  • memory.forget(tier, scope, id) — marks an item no longer active.

Reads are typically retrieval-ranked (vector-plus-keyword) rather than full scans, since tiers at broader scopes can be long-lived. Writes are append-mostly; updates and forgets exist but are less common.
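
The semantics of the four tools can be sketched with an in-memory stand-in. The `MemoryTools` class, the `Item` fields, and the substring match are hypothetical simplifications: the real tools are MCP tools backed by SQL, and real reads use vector-plus-keyword ranking rather than the naive match shown here.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Item:
    id: int
    body: str
    curator: str
    importance: float
    active: bool = True


class MemoryTools:
    """Hypothetical stand-in for the memory.* tool family."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, str], dict[int, Item]] = {}
        self._ids = count(1)

    def put(self, tier: str, scope: str, body: str,
            curator: str, importance: float) -> int:
        item = Item(next(self._ids), body, curator, importance)
        self._store.setdefault((tier, scope), {})[item.id] = item
        return item.id

    def update(self, tier: str, scope: str, item_id: int, patch: dict) -> None:
        item = self._store[(tier, scope)][item_id]
        for key, value in patch.items():
            setattr(item, key, value)

    def forget(self, tier: str, scope: str, item_id: int) -> None:
        # Marks the item inactive; the record itself is not deleted.
        self._store[(tier, scope)][item_id].active = False

    def read(self, tier: str, scope: str, query: str) -> list[Item]:
        # Naive substring match stands in for retrieval ranking.
        items = self._store.get((tier, scope), {}).values()
        hits = [i for i in items if i.active and query.lower() in i.body.lower()]
        return sorted(hits, key=lambda i: i.importance, reverse=True)


tools = MemoryTools()
first = tools.put("workspace", "ws-1", "Tone: dry humour", "agent", 0.6)
second = tools.put("workspace", "ws-1", "Tone: no exclamation marks", "agent", 0.9)
tools.forget("workspace", "ws-1", first)
hits = tools.read("workspace", "ws-1", "tone")
```

After the `forget` call, only the second item is returned by the read, even though the first still exists in the store.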

Scopes on tool calls are constrained by the caller’s identity. The workspace tier is reachable only from within that workspace. The account tier is reachable only from the workspace’s World Agent on behalf of that account’s worldbuilder. An external MCP client sees only the tiers its scope grants.

The memory tools are ordinary MCP tools with ordinary registration. The Submit Boundary does not apply — memory items are agent-scratch, not workspace content.

Named authorial memory entries are a small, stable set of durable entries — VOICE, SOUL, and companions the worldbuilder creates — that live in the four-tier model as first-class editable pages inside the app. They are the in-app replacement for any AGENTS.md-style durable-context file.

A named entry is a NamedMemoryEntry on the existing memory tiers:

  • stable name — a short identifier (VOICE, SOUL, a worldbuilder-chosen name for additional entries).
  • tier — Workspace for world-bound entries; Account for cross-workspace persona entries.
  • markdown body — plain prose, no required structure.
  • last-edited timestamp — set on each author edit; surfaced in the editor.
  • sync semantics — inherited from the parent tier. Workspace entries sync with the workspace; Account entries sync with the account. No special-casing.

Two named entries ship with every workspace:

  • VOICE (Workspace tier) — the worldbuilder’s voice and tone for this workspace. Intended to hold: recurring style guidance, how the World Agent should frame its prose when producing candidate content, what register is appropriate for this world. Bound to the workspace because a worldbuilder’s fantasy-novel workspace and their technical-documentation workspace may carry different voices.
  • SOUL (Account tier) — the World Agent’s personality and disposition as the worldbuilder has shaped it, carrying across all workspaces. Intended to hold: general demeanor, what the worldbuilder finds helpful vs. grating, working agreements that should follow the agent everywhere.

Account-tier defaults beyond SOUL are an open question.

Worldbuilders may create additional named entries within the same shape. There is no fixed taxonomy beyond the shipped defaults. A worldbuilder who wants a WORLDBUILDING_PRINCIPLES entry at the workspace tier, or a RESEARCH_STYLE entry at the account tier, creates it in the named-entries editor.

  • Not world content. Named entries do not duplicate or replace world knowledge. The world is the authoritative source for world knowledge — character facts, lore, structural decisions. Named entries capture authorial standing context (voice, disposition, working agreements), not content the agent should reproduce.
  • Not an AGENTS.md file. There is no filesystem; the in-app editor is the only authoring surface. Nothing is written to a well-known file path at the start of a run.
  • Not a note-taking surface. Named entries are durable, purposive, agent-context entries — not a general-purpose scratch space.

The context node reads all named entries as part of the standing memory slice on each run, through the existing MCP memory tools. Named entries are loaded before the retrieval phase that populates the variable portion of the context window. They are always present, not subject to the importance/decay ranking that governs other memory entries, and not displaced by a dense retrieval result.
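
A minimal sketch of the standing-slice rule described above: named entries are placed first and unconditionally, and only the remaining entries compete on score. The `Entry` shape, the `assemble_context` function, and the slot-count budget are illustrative assumptions, not the context node's actual interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    name: Optional[str]  # set for named entries (e.g. "VOICE"), None otherwise
    body: str
    score: float = 0.0   # ranking score; ignored for named entries


def assemble_context(entries: list[Entry], ranked_slots: int) -> list[Entry]:
    """Named entries first and always; ranked entries fill the remaining slots."""
    named = [e for e in entries if e.name is not None]
    ranked = sorted(
        (e for e in entries if e.name is None),
        key=lambda e: e.score,
        reverse=True,
    )
    return named + ranked[:ranked_slots]


entries = [
    Entry(None, "dense retrieval hit", score=0.95),
    Entry("VOICE", "Plain, wry narration."),
    Entry(None, "weaker hit", score=0.10),
    Entry("SOUL", "Be direct; skip flattery."),
]
window = assemble_context(entries, ranked_slots=1)
```

Even with a single ranked slot and a high-scoring retrieval hit, both named entries stay in the window.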

This is the operational analog of a system-prompt file, except: (a) the worldbuilder authors them in the app, not in a config file; (b) they travel with the workspace via standard sync; (c) they are represented as typed memory entries on a tier, not as special-cased files the runtime reads at a fixed path.

The World Agent does not see a filesystem. There is no AGENTS.md file deposited at the start of a turn. There is no “working directory” in which the agent’s scratch accumulates across turns.

Concretely:

  • Nothing in the agent graph reads from or writes to AGENTS.md or any file at a well-known path.
  • The scratchpad described in Agent Core System is a LangGraph state channel, not a file.
  • Persistence across turns is either in the LangGraph checkpointer (which recovers the turn) or in one of the four memory tiers (which carries meaning forward).

This is deliberate. A VFS-shaped memory makes every agent call implicitly stateful in ways that do not survive a sidecar restart cleanly and do not compose across subagent boundaries. Tiered memory with explicit reads is the opposite shape: nothing is ambient, nothing is implicit, every memory touch is visible and curator-stamped.

Memory grows. Without pruning, the channel and workspace tiers would drift toward holding every micro-observation the World Agent ever formed. Consolidation is how that is prevented.

Consolidation is not a background thread in the sidecar. It is not a hook on every turn. It is a scheduled task, run by the Scheduling System against the Task Runner System:

  • A consolidation task runs on each tier on its own cadence (workspace and account infrequently; channel more often; conversation at session close).
  • The task is itself an agent run on a thread — same graph, same tool surface. It reads the tier, produces a consolidated version, writes back through the same memory tools, and forgets items it has rolled up.
  • Consolidation rewrites; it does not append. Older entries are replaced by consolidated forms — they are not accreted alongside them. When consolidation completes, the items it processed are marked forgotten, and the consolidated entry is the new record for that body of knowledge.
  • Consolidation is interruptible and resumable like any other run. A partially-consolidated tier is still a coherent tier.
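
The "rewrite, do not append" behavior can be sketched as below. The names are hypothetical, and the `summarize` callable stands in for the agent run that produces the consolidated form. Writing the consolidated item before forgetting the originals is one assumed way to keep a partially consolidated tier coherent; the page does not specify the ordering.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Item:
    id: int
    body: str
    forgotten: bool = False


def consolidate(
    tier: list[Item],
    summarize: Callable[[list[str]], str],
    next_id: int,
) -> Item:
    """Replace all active items in the tier with one consolidated item."""
    active = [item for item in tier if not item.forgotten]
    merged = Item(next_id, summarize([item.body for item in active]))
    tier.append(merged)   # write the consolidated form first ...
    for item in active:   # ... then forget the items it rolled up
        item.forgotten = True
    return merged


tier = [Item(1, "likes maps"), Item(2, "likes glossaries")]
merged = consolidate(tier, lambda bodies: "; ".join(bodies), next_id=3)
```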

Because consolidation runs on the same infrastructure as any other scheduled work, there is no special “memory maintenance” machinery. The scheduler fires, the World Agent reads and writes, the task completes.

Write triggers per tier:

  • Conversation — planner node explicit write during a turn; summary write by the persist node at turn end.
  • Channel — consolidation task on cadence; planner write when the worldbuilder states a preference that belongs to a project or thread context.
  • Workspace — planner write when the worldbuilder states a recurring workspace preference; worldbuilder direct edit of named entries.
  • Account — planner write when the worldbuilder states a cross-workspace preference; worldbuilder direct edit of SOUL or other account-tier named entries.

There is no implicit “learn from every turn” pass. The World Agent writes memory when a planner decision tells it to, and only then.

The scratchpad is the graph’s working memory for a single turn. It lives in the LangGraph state channel and is checkpointed with the turn.

The scratchpad is not a fifth tier. It does not survive the turn’s end, and it is not reachable across runs. Work the agent wants to carry forward must be promoted into the conversation tier (or higher) explicitly. Nothing about the scratchpad is persisted as memory unless a planner node decides to write it.

This keeps the tier model clean: the four tiers are all persistent, all scoped, all reached through memory tools. The scratchpad is ephemeral and graph-local.

When a turn starts, the context node reads from all four tiers into the graph’s state channels. The reads are:

  • Conversation: items for this thread_id.
  • Channel: items for the thread’s channel, if any.
  • Workspace: items for the workspace, including all named entries at that tier.
  • Account: items for the account, including all named entries at that tier.

Named entries are loaded first, as part of the standing memory slice, before the ranked retrieval phase. Each ranked read produces a slice sized to fit within the turn’s context budget. The ranking function is tier-specific: conversation ranks by recency; workspace and account rank by relevance to the current query weighted by importance and decay. The context node is the only place all four are read together.
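
The tier-specific ranking can be illustrated with a small sketch. The `Hit` fields and the multiplicative combination of relevance, importance, and decay are assumptions; the text states which signals are used but not how they are combined. The text also does not say how the channel tier ranks, so this sketch treats it like the workspace tier.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    body: str
    age_seconds: float  # time since the entry was written
    relevance: float    # query similarity in [0, 1]
    importance: float   # planner-assigned scalar
    decay: float        # multiplier in (0, 1]; 1.0 means no decay


def rank(tier: str, hits: list[Hit], limit: int) -> list[Hit]:
    """Return the top `limit` hits using the tier's ranking rule."""
    if tier == "conversation":
        # Recency only: newest entries first.
        ordered = sorted(hits, key=lambda h: h.age_seconds)
    else:
        # Relevance weighted by importance and decay (assumed product form).
        ordered = sorted(
            hits,
            key=lambda h: h.relevance * h.importance * h.decay,
            reverse=True,
        )
    return ordered[:limit]


hits = [
    Hit("old but relevant", age_seconds=900.0, relevance=0.9, importance=1.0, decay=1.0),
    Hit("new but tangential", age_seconds=5.0, relevance=0.1, importance=1.0, decay=1.0),
]
```

The same two hits come back in opposite orders depending on which tier's rule is applied.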

Subagent subgraphs (see Agent Core System) decide per-subagent what they read. A small-scope subagent might read only conversation; a long-horizon subagent might read all four.

Writes to memory are explicit tool calls from planner or persist nodes. Typical patterns:

  • Worldbuilder states a recurring preference — planner writes to the workspace tier.
  • Worldbuilder corrects something stated previously — planner forgets the stale item in the channel tier and writes the replacement.
  • World Agent forms a working hypothesis during a long turn — planner writes to the conversation tier.
  • Consolidation task rolls up a channel tier — writes consolidated items, forgets the originals.
  • Worldbuilder edits VOICE directly in the app — writes to the workspace-tier named entry.

When an external MCP client (Claude Desktop, Cursor) connects to the workspace (see MCP System), its access to memory depends on declared scopes:

  • Workspace tier: readable if the client holds workspace read scope; writable only if the worldbuilder has explicitly granted write access.
  • Account tier: not exposed to external clients by default; the worldbuilder can grant read access to their own clients.
  • Channel and conversation: reachable only if the external client is participating in that channel or thread.

External writes to memory, like any other external writes, go through the same handlers. There is no parallel memory path for external clients.
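
The access rules for external clients can be sketched as a single check. The scope strings (`workspace:read`, `workspace:write`, `account:read`) and the `ExternalClient` shape are hypothetical. Denying external writes to the account tier is also an assumption, since the text only mentions a read grant.

```python
from dataclasses import dataclass, field


@dataclass
class ExternalClient:
    scopes: set[str] = field(default_factory=set)    # e.g. {"workspace:read"}
    channels: set[str] = field(default_factory=set)  # channels it participates in
    threads: set[str] = field(default_factory=set)   # threads it participates in


def can_access(client: ExternalClient, tier: str, scope_id: str,
               write: bool = False) -> bool:
    """Decide whether an external client may touch a tier at a given scope."""
    if tier == "workspace":
        needed = "workspace:write" if write else "workspace:read"
        return needed in client.scopes
    if tier == "account":
        # Hidden by default; only an explicit read grant opens it.
        return not write and "account:read" in client.scopes
    if tier == "channel":
        return scope_id in client.channels
    if tier == "conversation":
        return scope_id in client.threads
    return False


client = ExternalClient(scopes={"workspace:read"}, threads={"thread-1"})
```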

Memory hygiene and the operator scrub surface

The four-tier model has clean write paths. It does not yet have clean reset paths. Experimentation, broken sessions, abandoned threads, and superseded decisions accumulate noise that the worldbuilder cannot remove without targeted tooling. The scrub surface fills this gap.

The scrub surface is operator-grade. It is discoverable from the World Agent control plane, not from primary worldbuilder surfaces. It does not appear in conversational flow. It is a power-user reset surface, not a routine feature.

Each tier has an independent “clear contents” path:

  • Conversation-tier scrub — takes a thread_id selector; clears memory entries for the specified thread.
  • Channel-tier scrub — takes a channel selector; clears memory entries for the specified channel.
  • Workspace-tier scrub — clears all memory entries for the workspace (subject to retention mode below).
  • Account-tier scrub — clears all memory entries for the account (subject to retention mode below).

No cross-workspace scrub. Each operation targets a single workspace (or, for the account tier, the current account context).

A companion operation: per-thread purge of LangGraph checkpointer state and event-log entries linked to the thread. Both the compact-form and expanded-trace records produced by checkpoint-rewind compaction are removed. This is destructive by definition; the dry-run gate (below) is the safety mechanism.

Every scrub operation runs in one of two modes:

  • full — clear everything in scope.
  • keep_named — preserve all named authorial memory entries (VOICE, SOUL, and worldbuilder-created companions); clear the rest.

A keep_canonical mode was considered and dropped. Agent-memory entries do not carry world-content lifecycle axes (origin, lifecycle, weight, standing are world-content axes per ADR-018, not memory metadata). Scrubbing workspace content by lifecycle is outside the scope of memory hygiene — the existing per-page lifecycle paths apply to that. A finer scrub, were one ever needed, would key on tier or curator — not on a lifecycle axis that memory entries do not have.

Every scrub operation defaults to dry-run. In dry-run mode, the operation returns a structured manifest of what would be removed — counts, scope summary, and a capped sample of entry identifiers. Destructive execution requires an explicit --execute flag (or the UI equivalent) plus a confirmation step.
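
The dry-run gate and the two retention modes can be sketched together. The `scrub` function, its parameters, and the manifest keys are hypothetical. The real manifest also carries a scope summary, and the real confirmation step is still an open design question.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    id: str
    named: bool  # True for VOICE, SOUL, and worldbuilder-created companions


def scrub(entries: list[Entry], mode: str = "full", execute: bool = False,
          confirmed: bool = False, sample_cap: int = 5):
    """Return (manifest, remaining). Nothing is removed unless execute and confirmed."""
    if mode not in ("full", "keep_named"):
        raise ValueError(f"unknown scrub mode: {mode}")
    doomed = [e for e in entries if mode == "full" or not e.named]
    manifest = {
        "mode": mode,
        "count": len(doomed),
        "sample_ids": [e.id for e in doomed[:sample_cap]],  # capped sample
        "executed": execute and confirmed,
    }
    if not manifest["executed"]:
        return manifest, list(entries)  # dry run: report only
    doomed_ids = {e.id for e in doomed}
    return manifest, [e for e in entries if e.id not in doomed_ids]


entries = [Entry("VOICE", True), Entry("m1", False), Entry("m2", False)]
dry_manifest, dry_remaining = scrub(entries, mode="keep_named")
run_manifest, run_remaining = scrub(entries, mode="keep_named",
                                    execute=True, confirmed=True)
```

The default call reports two removable entries and leaves all three in place; only the executed and confirmed call removes them, keeping VOICE.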

The confirmation step UX (two-step click, typed phrase, credential re-prompt) is a refinement-open question.

Every executed scrub produces an event-log entry containing: scope, mode, manifest summary, operator identity, timestamp. Scrub events are terminal log entries — they cannot themselves be scrubbed.

The scrub paths are exposed as:

  • Tauri commands — for the operator UI in the World Agent control plane.
  • MCP tools — capability-gated, for scripted operator workflows and integration testing.

The specific capability assignments for each scrub mode are a refinement-open question. Each mode likely warrants a distinct capability to allow fine-grained grants.

  • No filesystem-level destructive operations. Attachments, derivation-link source files, and workspace content outside the agent-state surface are unaffected. Workspace content deletion uses the existing per-page lifecycle path.
  • No automatic or scheduled scrub. Tier consolidation (the Scheduling System) is the automatic rewrite path — it rewrites, it does not destroy. Operator-initiated scrub is the only destructive path.
  • No undo. The dry-run + confirmation gate is the safety mechanism, not a post-hoc restore.

What memory is not:

  • Not workspace content. Memory is not pages, not blocks, not tags. It carries no zone membership. It does not appear in search (Search System) alongside pages.
  • Not a substitute for candidate writes. If the World Agent has learned something worth the worldbuilder seeing, it proposes a candidate write through the Submit Boundary. Hiding something in memory is not the same as writing it; the worldbuilder cannot find it in the corpus.
  • Not shared across workspaces. The workspace tier is strictly scoped. Two workspaces owned by the same account share only what the account tier carries.
  • Not a cache. Memory is intentional: it holds things the World Agent has remembered on purpose, not things that happened to pass through. Cross-provider prompt caching is described in LLM System.

This section documents the memory system through three external lenses: Bader’s 9-axis framework, a 10-failure-mode self-audit, and Wooders’ 7-question harness-integration checklist. The vocabulary — axes, failure modes, integration questions — comes from external frameworks. The answers are our own.

The audit is a durable diagnostic surface. A builder a year from now can use it to evaluate whether the current memory choices still hold and whether any of the explicit rejections should be revisited.

Axis 1 — What gets stored (raw / derived / mix)

We store derived memory only. If a fact is reproducible from current world state — code structure, page content, search results, derivation-graph membership — it is not persisted in memory. Only observations and inferences that are not derivable from the corpus at the moment a query is issued are candidates for memory storage.

The rule: no derivable storage. The rationale: the corpus is always live, always queryable through MCP content tools. Caching derivable facts in memory creates a stale-content risk without a recovery path. The agent reads the current state and reasons from it; memory holds only what the current state cannot provide.

Axis 2 — When derivation happens (write-time / read-time / background / scheduled)

Derivation from raw observations into memory entries happens at write time (the planner node decides what to write and produces the derived form) or at consolidation time (the scheduled task rewrites tier content into consolidated forms). There is no read-time derivation — the agent does not synthesize new memory entries during the read phase of a turn. There is no time-based background derivation thread.

Axis 3 — What triggers a write (per-tier)

See the per-tier write triggers listed in the consolidation section above. The principle: writes are caused by planner decisions and consolidation tasks, not by ambient observation. There is no implicit “learn from every turn” pass. A turn in which no planner node decides to write memory writes nothing to memory.

Axis 4 — Where it gets stored (storage backends)

All four tiers live in inklings.db under schema-separated tables. The LangGraph checkpointer also uses inklings.db. There is no per-tier database, no filesystem backing, no external store. Agent state and memory travel with the workspace via standard sync, migration, and export paths. This follows Chase’s “your harness, your memory” framing: memory ownership tracks harness ownership — LangGraph as substrate, our harness on top, memory in our SQLite.

Axis 5 — How it gets retrieved (retrieval strategies)

Reads are retrieval-ranked. The ranking combines vector similarity (via the embedding layer, see Embedding System) and keyword match (FTS5). Named authorial entries bypass ranking — they are always loaded as part of the standing memory slice.

Axis 6 — Post-retrieval processing (re-ranking, RRF)

Retrieved slices are re-ranked using RRF (Reciprocal Rank Fusion) merging vector and keyword rankings. Named entries are inserted ahead of the ranked set. The context node assembles the final memory window from the merged slice sized to the turn’s context budget.
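
For reference, Reciprocal Rank Fusion scores each entry by summing 1 / (k + rank) over the rankings it appears in. The sketch below is a generic implementation, not this system's code. The constant k = 60 is a commonly used default and is an assumption here, since the page does not state the value used.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked id lists; each id scores sum(1 / (k + rank)), rank from 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, entry_id in enumerate(ranking, start=1):
            scores[entry_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda entry_id: (-scores[entry_id], entry_id))


vector_ranking = ["a", "b", "c"]   # ids ordered by vector similarity
keyword_ranking = ["b", "d", "a"]  # ids ordered by keyword match
fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
```

Entries that appear in both rankings ("a" and "b") outrank entries that appear in only one, which is how the fusion softens the blind spots of either ranking alone.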

Axis 7 — When retrieval happens (per-tier)

Retrieval happens at the start of each turn, in the context node, before any planner or tool-execution step. Named entries load first. Ranked retrieval for each tier follows. Subagent subgraphs retrieve per-subagent scope; a narrow subagent may skip the workspace and account tiers.

Axis 8 — Who is doing the curating (per-write)

The curator field on every memory entry records who produced it: agent, author, or import. Agent-curated entries come from planner decisions and consolidation tasks. Author-curated entries come from the worldbuilder editing named entries directly in the app or through an authorized MCP client. Import-curated entries are seeded during workspace bootstrap or migration. Curator is surfaced at retrieval alongside the entry’s tier and importance.

Axis 9 — Forgetting policy (decay, consolidation, cascade)

Three levers:

  1. Decay — time-weighted signal that depresses importance as an entry ages without being read. Decay rates are tier-specific. Decay is a ranking lever, not a deletion trigger; entries are not automatically removed when they decay.
  2. Consolidation — the scheduled task that rewrites a tier. Consolidation replaces entries with consolidated forms; it does not append. Items processed by consolidation are marked forgotten; only the consolidated entry survives.
  3. Explicit forget — the memory.forget tool call from a planner node when the agent decides an entry is superseded.

There is no wall-clock deletion trigger. There is no hard index truncation. The agent’s forgetting policy uses importance, decay, and lifecycle state as levers, not hard size bounds. This is the explicit rejection of Claude Code’s file-backed index truncation pattern: truncation is a workaround for systems that lack decay and lifecycle; we have both.


Failure-mode self-audit (Bader’s 10 failure modes)

1. Session amnesia — the agent forgets what happened in a previous session and re-asks or re-derives the same things.

  • How it would manifest: the World Agent asks the worldbuilder to repeat context from a conversation held yesterday in the same channel.
  • What prevents it: the channel tier persists across sessions; the conversation-tier summary written at session close promotes the key decisions into the channel tier.
  • Residual risk: if the persist node fails to write a summary, context is not promoted. Consolidation at session close is the mitigation; a failed consolidation leaves the raw conversation-tier entries available until the next consolidation pass.

2. Entity confusion — the agent conflates two similar entities because their memory footprints have merged.

  • How it would manifest: VOICE or a workspace-tier entry for Character A bleeds into retrieval for Character B.
  • What prevents it: memory entries are short, curator-stamped, and retrieved by semantic similarity against a specific query; retrieval does not aggregate across unrelated entries. Named entries are per-workspace, not per-entity.
  • Residual risk: if the worldbuilder writes a workspace-tier entry that mentions two characters without distinguishing them, confusion is possible. The mitigation is authorial discipline in named-entry body text, not a systemic fix.

3. Over-inference — the agent adds conclusions to memory that the source material did not warrant.

  • How it would manifest: a planner node writes “Character X is a villain” to the workspace tier when the worldbuilder’s pages only imply moral ambiguity.
  • What prevents it: the no-derivable-storage rule means the agent should not write facts that can be read from current content; if the conclusion is inferrable from current pages, it should not be in memory at all, because the agent reads the pages.
  • Residual risk: inference that is not clearly derivable from current content may still be written at agent discretion. The curator field traces this to agent; the worldbuilder can inspect and delete.

4. Derivation drift — memory entries that once derived from world content are now inconsistent with the content because the content changed and the entry was not updated.

  • How it would manifest: a workspace-tier entry says “the story is set in the 14th century” but the worldbuilder has since retconned the setting.
  • What prevents it: the no-derivable-storage rule means facts that can be read from current content are not persisted; if the setting is in a canonical page, the agent reads the page, not the memory entry. For entries that derive from world content, the derivation link surfaces the source content’s current origin, lifecycle, weight, and standing, so the agent sees the derivation and knows to check the source.
  • Residual risk: entries that were authored without a derivation link can drift. Consolidation should replace them; the worldbuilder can manually edit named entries.

5. Retrieval misfire — the agent retrieves an entry that is superficially similar to the query but semantically wrong, and acts on it.

  • How it would manifest: a query about “Chapter 3 pacing” retrieves a workspace-tier entry about “pace of the magic system’s reveal” (different meaning, same surface words).
  • What prevents it: the skeptical retrieval posture (see below) frames all memory hits as candidates, not facts. The agent verifies before acting on a hit.
  • Residual risk: in a long turn with many tool calls, verification discipline may erode. The context node’s ranked slice is not the full tier: only the top-N items for the current query pass through, and entries that are semantically distant enough will not appear.

6. Stale context dominance — old workspace-tier entries dominate retrieval, crowding out more recent and relevant material.

  • How it would manifest: an account-tier entry written six months ago about “write in a spare, minimalist style” dominates every retrieval even though the worldbuilder has since shifted to more elaborate prose and written a new VOICE entry reflecting that.
  • What prevents it: decay reduces the importance of entries that are not refreshed. The worldbuilder editing VOICE is a re-confirmation that resets importance. Named entries are loaded as a standing slice separately from the ranked retrieval, but they are also the easiest surface for the worldbuilder to update.
  • Residual risk: entries the worldbuilder does not know to update continue to exert influence. Consolidation and decay are the automated mitigations.

7. Selective retrieval bias — the retrieval function systematically under-retrieves entries that are semantically distant from the query’s surface form, even when they are highly relevant.

  • How it would manifest: a complex cross-topic query retrieves narrow hits because the embedding space does not capture the query’s full intent.
  • What prevents it: RRF merging vector and keyword rankings mitigates embedding-only blind spots. Named entries bypass ranking and are always present.
  • Residual risk: genuine semantic distance between entry and query is a real retrieval limit. The agent is expected to issue follow-up reads when the initial slice is insufficient.

8. Compaction information loss — consolidation discards nuance that was present in the original entries.

  • How it would manifest: a consolidation run over ten conversation-tier entries about a subplot produces a single summary entry that loses the chronological arc the worldbuilder cared about.
  • What prevents it: the checkpoint-rewind compaction system produces compact-form records alongside expanded-trace records; the expanded trace is recoverable from the event log. Consolidation is an agent run: the consolidation prompt is tunable, and consolidation quality is a function of that prompt.
  • Residual risk: loss in the consolidated form is real. The expanded trace in the event log is the recovery path. The worldbuilder can inspect and manually correct named entries.

9. Confidence without provenance — the agent cites a memory hit as if it were a fact, without surfacing that the information came from memory and may be stale or inferred.

  • How it would manifest: “Your story is a tragedy” asserted as fact because a workspace-tier entry says so, without the agent indicating this is a prior-stated preference, not current canonical content.
  • What prevents it: skeptical retrieval is an enforced posture. Every memory hit presented to the worldbuilder is framed as “you have noted previously” or “a prior preference recorded as: …” rather than as a flat assertion. For entries that derive from world content, the retrieval response surfaces that content’s origin, lifecycle, weight, and standing via the derivation link, so the worldbuilder and the agent both see the epistemic status of the source. The entry references provenance; it does not carry it directly. Prompt assembly enforces this framing in the context node output.
  • Residual risk: prompt-level enforcement relies on the agent honoring the framing convention. The curator field and derivation link are structural signals; the skeptical framing is a prompt engineering choice.

10. Memory-induced bias — the existence of certain memory entries shifts the World Agent’s behavior in ways the worldbuilder did not intend.

  • How it would manifest: a stale account-tier entry (“the worldbuilder prefers short answers”) causes the agent to truncate detailed responses even when detail is clearly warranted by the current question.
  • What prevents it: the worldbuilder has direct visibility into named entries (VOICE, SOUL) and can edit them at any time. The curator field distinguishes agent-written entries from worldbuilder-authored ones, so the worldbuilder can audit which preferences the agent added without explicit instruction. The scrub surface allows bulk removal if needed.
  • Residual risk: entries the worldbuilder does not know exist continue to exert influence. The Memory Tier Overview in the World Agent control panel (see World Agent) surfaces a summary of each tier and is the worldbuilder’s primary inspection point.


Integration-question checklist (Wooders’ 7 questions)

1. How are named authorial memory entries (VOICE, SOUL) loaded into context?

Named entries are loaded by the context node as the standing memory slice on every run, before ranked retrieval. The context node calls the MCP memory tools with a filter for entry_type: named; all named entries in the Workspace and Account tiers are returned regardless of the query. They are inserted into the context window ahead of the ranked retrieval results. This guarantees VOICE and SOUL are present on every run without competing in the importance/decay ranking that governs ordinary entries.
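A minimal sketch of that ordering, assuming a hypothetical in-memory `MemoryEntry` record and a caller-supplied `limit` for the ranked portion. The real context node goes through MCP memory tools, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEntry:
    name: str
    tier: str           # "conversation" | "channel" | "workspace" | "account"
    entry_type: str     # "named" or "ordinary" ("ordinary" is an assumed label)
    score: float = 0.0  # decay-weighted relevance; not consulted for named entries


def assemble_memory_slice(entries, limit):
    """Standing slice first (all named workspace/account entries), then ranked hits."""
    standing = [
        e for e in entries
        if e.entry_type == "named" and e.tier in ("workspace", "account")
    ]
    ranked = sorted(
        (e for e in entries if e.entry_type != "named"),
        key=lambda e: e.score,
        reverse=True,
    )[:limit]
    return standing + ranked


entries = [
    MemoryEntry("pref-a", "workspace", "ordinary", 0.9),
    MemoryEntry("VOICE", "workspace", "named"),
    MemoryEntry("pref-b", "conversation", "ordinary", 0.2),
    MemoryEntry("SOUL", "account", "named"),
]
slice_ = assemble_memory_slice(entries, limit=1)
```

The named entries never pass through the sort, so a low score or a tight `limit` cannot push them out of the context.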

2. How is skill metadata shown to the World Agent? (system prompt? tool registry? per-call MCP discovery?)

Skill metadata is exposed via per-call MCP tool discovery. Skills are not baked into the system prompt; they are registered in the MCP tool registry and appear as available tools at the start of each turn. The World Agent issues tool-discovery calls to enumerate skills relevant to the current task. This keeps the system prompt narrow and lets the skill set evolve without prompt changes. See Agent Core System and Skill System.

3. Can the World Agent modify its own system instructions?

No. The World Agent cannot modify the system prompt or the standing context assembled by the context node. It submits candidate writes to workspace content through the submit boundary, which is the only path by which agent output becomes world content. Named entries (VOICE, SOUL) are the closest thing to agent-modifiable instructions, but they are worldbuilder-owned: the worldbuilder edits them through the in-app editor, and the World Agent reads them but does not overwrite them. Role overlays are call-scoped constructs that do not persist into memory.

4. What survives compaction?

The checkpoint-rewind compaction system produces compact-form records and expanded-trace records. Both survive in the event log. The compact form is what the context node loads on rehydration; the expanded trace is recoverable for audit or correction. Memory-tier entries that were consolidated survive as the consolidated form; the originals they replaced are marked forgotten. The full event log is the recovery surface for anything the compact form omitted.

5. Are interactions stored and queryable?

Yes. The event log records all workspace mutations. Bookmarks anchor event-log positions for structured recovery. The conversation tier indexes the active thread’s context, and the FTS5 and embedding layers on workspace content are queryable through MCP content tools. There is no separate raw-transcript tier — querying the event log plus the conversation-tier entries is the retrieval path for past interactions.

6. How is memory metadata presented to the World Agent?

At retrieval, each entry surfaces its own metadata in the retrieval response: curator, tier, importance, current decay-weighted relevance score. For entries that derive from world content, the derivation link is also surfaced — and through that link, the source content’s origin, lifecycle, weight, and standing from the Provenance Model are available to the agent. The entry references provenance; it does not carry it as copied fields. This separation is architectural: memory entries are agent-scratch on the memory tiers, not world-content nodes on the provenance graph.
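The reference-not-copy separation might look like the following sketch. The field types, the `derivation_link` identifier format, and the `lookup` callable are all assumptions; only the field names come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Provenance:
    origin: str
    lifecycle: str
    weight: str
    standing: str


@dataclass(frozen=True)
class RetrievedEntry:
    text: str
    curator: str
    tier: str
    importance: float
    relevance: float                       # current decay-weighted score
    derivation_link: Optional[str] = None  # id of the source content, if any


def resolve_provenance(
    entry: RetrievedEntry, lookup: Callable[[str], Provenance]
) -> Optional[Provenance]:
    """Follow the derivation link; the entry itself holds no provenance fields."""
    if entry.derivation_link is None:
        return None
    return lookup(entry.derivation_link)


# Hypothetical provenance store keyed by content id.
store = {"page-12": Provenance("authored", "draft", "canonical", "current")}
derived = RetrievedEntry("the story is a tragedy", "agent", "workspace", 0.6, 0.4, "page-12")
scratch = RetrievedEntry("prefers short answers", "agent", "account", 0.5, 0.3)
```

Because provenance is looked up at retrieval time, a change to the source content's lifecycle or standing is visible on the next read without rewriting the memory entry.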

7. How is current working state represented?

Working state for the current turn lives in LangGraph state channels — the scratchpad. Nothing in the working state is exposed as a filesystem path, a working directory, or a file the agent reads from or writes to. State that should survive the turn must be promoted explicitly into a memory tier by a planner node write. The LangGraph checkpointer persists the turn’s state in inklings.db; recovery from interruption resumes from the checkpointed state, not from a reconstructed filesystem.
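As a rough illustration of explicit promotion, the sketch below uses a plain `TypedDict` as a stand-in for the state channels. It does not use the LangGraph API, and the `scratchpad` and `memory_writes` keys and the `planner_promote` function are hypothetical names.

```python
from typing import TypedDict


class TurnState(TypedDict):
    scratchpad: list[str]      # turn-local notes; not kept unless promoted
    memory_writes: list[dict]  # explicit promotions queued by the planner


def planner_promote(state: TurnState, note: str, tier: str) -> TurnState:
    """Return a new state with `note` queued for a write to the given memory tier."""
    if note not in state["scratchpad"]:
        raise ValueError("can only promote a note that is in the scratchpad")
    write = {"tier": tier, "text": note}
    return {
        "scratchpad": state["scratchpad"],
        "memory_writes": state["memory_writes"] + [write],
    }


state: TurnState = {"scratchpad": ["user wants present tense"], "memory_writes": []}
promoted = planner_promote(state, "user wants present tense", "workspace")
```

Nothing leaves the scratchpad implicitly: a note reaches a tier only because the planner queued a write for it.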


Skeptical retrieval — every memory hit is a candidate, not a fact.

Memory hits are framed in the World Agent’s prompt as candidates: “you have noted,” “a prior preference recorded as,” “based on an earlier entry.” They carry the entry’s curator, tier, and importance. For entries derived from world content, the derivation link surfaces that content’s origin, lifecycle, weight, and standing — the agent and the worldbuilder both see the epistemic status of the underlying source. The agent must verify before acting on a hit. Prompt assembly enforces this in the context node output. This framing is the primary mitigation for failure mode 9 (confidence without provenance).
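A minimal sketch of the framing step, assuming a hypothetical `frame_memory_hit` helper. The exact wording and layout used by prompt assembly are not specified on this page.

```python
def frame_memory_hit(text: str, curator: str, tier: str, importance: float) -> str:
    """Wrap a memory hit so it reads as a recorded note, not a flat assertion."""
    return (
        f'A prior preference recorded as: "{text}" '
        f"[curator={curator}, tier={tier}, importance={importance}]. "
        "Candidate only; verify before acting."
    )


framed = frame_memory_hit("Your story is a tragedy", "worldbuilder", "workspace", 0.7)
```

The metadata travels inside the framed string, so the agent sees the curator and tier at the same point where it sees the claim.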

No derivable storage — if a fact is reproducible from current world state, it is not persisted in memory.

Code structure, page content, search results, derivation-graph membership — anything that can be freshly read through MCP content tools is not a memory write candidate. The write-policy rule: the planner writes to memory only when the observation or inference is not derivable from the corpus at the time of a query. The rationale: the corpus is live and queryable; caching derivable facts creates staleness risk without a recovery path. This framing is the primary guard on axis 1 (what gets stored) and failure mode 4 (derivation drift).
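The write-policy rule can be sketched as a filter over candidate observations. The `is_derivable` predicate is an assumption: in practice it would be answered by reading through MCP content tools, which the sketch replaces with a set lookup.

```python
from typing import Callable, Iterable


def select_memory_writes(
    candidates: Iterable[str], is_derivable: Callable[[str], bool]
) -> list[str]:
    """Keep only observations that cannot be re-read from current world state."""
    return [c for c in candidates if not is_derivable(c)]


# Stand-in corpus: facts that a fresh content read would return.
corpus = {"chapter 3 is titled 'Ash'", "the map page links to the river page"}
candidates = [
    "chapter 3 is titled 'Ash'",                   # derivable, so not stored
    "worldbuilder dislikes second-person prose",  # not derivable, so stored
]
writes = select_memory_writes(candidates, corpus.__contains__)
```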

Continuous-edit consolidation — memory consolidation rewrites; it does not append.

Consolidation replaces older entries with consolidated forms. When the consolidation task completes, the items it processed are marked forgotten and only the consolidated form persists for that body of knowledge. This is not an append of a summary alongside the originals; the originals are retired. This framing pairs with the checkpoint-rewind compaction system for conversation history (compact-form + expanded trace, both in the event log) and with the tier-consolidation task in the Scheduling System for memory tiers.
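The rewrite-not-append behavior might be sketched as follows. The `Entry` record, the `"active"` and `"forgotten"` state labels, and the `summarize` callable are illustrative assumptions, not the stored schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    entry_id: str
    text: str
    state: str = "active"  # assumed labels: "active" | "forgotten"


def consolidate(entries, ids, consolidated_id, summarize):
    """Retire the active entries in `ids` and add one consolidated entry."""
    processed = [e for e in entries if e.entry_id in ids and e.state == "active"]
    if not processed:
        return list(entries)
    retired = {e.entry_id for e in processed}
    merged = Entry(consolidated_id, summarize([e.text for e in processed]))
    kept = [
        replace(e, state="forgotten") if e.entry_id in retired else e
        for e in entries
    ]
    return kept + [merged]


entries = [Entry("a", "likes rain"), Entry("b", "likes fog"), Entry("c", "unrelated")]
result = consolidate(entries, {"a", "b"}, "ab", " / ".join)
```

After the call, `a` and `b` are marked forgotten and only `ab` is active for that body of knowledge, while `c` is untouched.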

The audit records four patterns that have been explicitly rejected. They appear here to prevent them from being re-proposed.

No raw-transcript grep-able tier. The system has FTS5, embeddings, RRF, an event log, bookmarks, and a conversation tier. A separate transcript surface for raw grep is a category mistake: the event log plus the conversation tier is the query path for past interactions; adding a raw-transcript tier would duplicate state, complicate compaction, and create a stale-read hazard.

No wall-clock consolidation triggers. Consolidation is event-driven: the TaskRunner dispatches it based on queue depth, retroactive-revision queue size, a user idle signal, or an explicit operator signal. Wall-clock triggers (“consolidate every 24 hours”) are the correct shape for systems that lack event-driven triggers. This system has event-driven triggers, so wall-clock triggering would add only unnecessary timer machinery.
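An event-driven dispatch check could look like the sketch below. The four signals are the ones named above; the thresholds, the priority order, and the rule that an idle signal counts only when work is queued are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConsolidationSignals:
    queue_depth: int
    revision_queue_size: int
    user_idle: bool
    operator_requested: bool


def dispatch_reason(
    s: ConsolidationSignals,
    max_queue_depth: int = 100,    # assumed threshold
    max_revision_queue: int = 20,  # assumed threshold
) -> Optional[str]:
    """Return why consolidation should run now, or None. No clock is consulted."""
    if s.operator_requested:
        return "operator_signal"
    if s.queue_depth >= max_queue_depth:
        return "queue_depth"
    if s.revision_queue_size >= max_revision_queue:
        return "revision_queue"
    if s.user_idle and s.queue_depth > 0:
        return "user_idle"
    return None
```

The function takes no timestamp argument, so the same signals always yield the same decision regardless of when it is called.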

No filesystem-backed memory files (MEMORY.md, AGENTS.md, .agents/ directories). Rejected in the four-tier memory design. The in-app named-entry surface is the replacement; it is tier-backed, syncs with the workspace, and does not depend on file-system conventions.

No in-line index truncation as the durability mechanism. Hard truncation is a workaround for systems that lack decay rates, importance fields, and lifecycle state. Decay and consolidation are the levers; truncation would discard information that decay and consolidation would have handled gracefully.
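To make the contrast with truncation concrete, here is one possible decay-weighted score. The exponential form and the per-day `decay_rate` unit are assumptions; the page names importance and decay rate as inputs but does not give the formula.

```python
import math


def decayed_relevance(importance: float, decay_rate: float, age_days: float) -> float:
    """Assumed scoring rule: importance attenuated exponentially with age."""
    return importance * math.exp(-decay_rate * age_days)


fresh = decayed_relevance(importance=0.8, decay_rate=0.1, age_days=0)
halved = decayed_relevance(importance=1.0, decay_rate=math.log(2), age_days=1)
pinned = decayed_relevance(importance=1.0, decay_rate=0.0, age_days=365)
```

Under this rule an old entry sinks in the ranking gradually and stays recoverable, whereas hard truncation would drop it outright once an index filled up.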


  • Agent Core System — the context node reads all four tiers at turn start; planner and persist nodes write through memory tools.
  • MCP System — memory tools are MCP tools with ordinary registration.
  • Scheduling System — consolidation tasks and the scrub-accessible event log run as scheduled work.
  • Conversation System — the conversation tier is keyed by thread_id, the same identity the conversation uses.
  • Submit Boundary — memory is not workspace content and does not cross the boundary; promoting a memory observation to workspace content is a separate candidate write.
  • Provenance — the world-content provenance axes (origin, lifecycle, weight, standing) live here, not on memory entries. Derivation links surface this provenance from world content to memory retrieval.
  • Embedding System — the vector component of retrieval ranking.
  • Search System — workspace content search; memory does not participate in this surface.

  • Does not describe how the context node composes its output or how the planner consumes it. See Agent Core System.
  • Does not describe the workspace content model. See Workspace System and Page System.
  • Does not describe embeddings or retrieval internals. See Embedding System.
  • Does not cover checkpoint-rewind compaction mechanics beyond what is relevant to memory hygiene. See Checkpoint Rewind and Compaction.
  • Does not enumerate capability variants for the scrub MCP tools — those are an open question.
