Agent Memory System
Crates: crates/domain/src/memory.rs, crates/application/src/memory/
Overview
Agent memory in Inklings follows two core principles from the Recursive Language Models (RLM) paradigm:
- Memory as explorable data, not consumed tokens — Knowledge lives in queryable external stores rather than in the LLM context window. The agent retrieves what it needs on demand.
- The workspace is the knowledge base — The PKM workspace itself (pages, blocks, tags, references, CRDT history) is the richest knowledge source. Agent memory supplements but never duplicates it.
Key decisions: 4-tier memory hierarchy, source provenance (replacing ownership), per-tier configurable decay rates, Orient/Work/Persist lifecycle, and best-effort background processing.
Memory Hierarchy (4 Tiers)
```
+-----------------------------------------+
| Account (cross-workspace)               |
|   + Workspace (project-scoped)          |
|     + Channel (topical)                 |
|       + Conversation (ephemeral)        |
+-----------------------------------------+
```
- Conversation — Ephemeral working memory. Cleared when the conversation ends. Contains scratchpad entries, intermediate results from Researchers, and task-specific context.
- Channel — Topical grouping that persists across conversations within a workspace. Groups related conversations by topic. Decays faster than workspace memories.
- Workspace — Project-scoped facts and knowledge. Contains facts about workspace content, entity knowledge, relationship observations, and content-derived embeddings.
- Account — Cross-workspace preferences and patterns. Contains only behavioral preferences and procedural learnings — never workspace content or embeddings.
Memory Entry Structure
Each memory entry contains:
| Field | Type | Description |
|---|---|---|
| `id` | UUID | Unique identifier |
| `scope` | MemoryScope | Conversation, Channel, Workspace, or Account |
| `lifetime` | MemoryLifetime | LongTerm, ShortTerm, or Conversation |
| `source` | String | Producer provenance (e.g., “system”, “orchestrator”, “researcher”, “user”, “consolidation”) |
| `content` | String | The memory content (non-empty) |
| `importance` | f64 | LLM-assigned at creation time, in [0.0, 1.0] |
| `access_count` | u64 | Number of times retrieved |
| `created_at` | DateTime | Creation timestamp |
| `accessed_at` | DateTime | Last access timestamp |
| `tags` | Vec<String> | Optional categorization tags |
| `embedding` | Option<Vec<f32>> | Embedding vector for semantic search |
| `source_conversation_id` | Option<UUID> | Conversation that produced this memory |
| `channel_id` | Option<UUID> | Channel this memory belongs to (for channel-scoped memories) |
Source provenance replaces the former ownership concept. The source field is a string recording which agent type
or actor produced the memory (system, orchestrator, researcher, user, consolidation, etc.). This is informational only —
not used for filtering or access control. Added in schema migration V004.
Validation: importance must be in [0.0, 1.0], content must be non-empty.
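These validation rules can be sketched as a small constructor guard. The type and error names here are illustrative, not the actual API in crates/domain/src/memory.rs:

```rust
// Illustrative types; the real entity lives in crates/domain/src/memory.rs.
struct MemoryDraft {
    content: String,
    importance: f64,
}

#[derive(Debug, PartialEq)]
enum MemoryValidationError {
    EmptyContent,
    ImportanceOutOfRange,
}

fn validate(draft: &MemoryDraft) -> Result<(), MemoryValidationError> {
    // Content must be non-empty (whitespace-only is treated as empty here).
    if draft.content.trim().is_empty() {
        return Err(MemoryValidationError::EmptyContent);
    }
    // Importance must fall within [0.0, 1.0].
    if !(0.0..=1.0).contains(&draft.importance) {
        return Err(MemoryValidationError::ImportanceOutOfRange);
    }
    Ok(())
}

fn main() {
    let draft = MemoryDraft {
        content: "User prefers concise answers".into(),
        importance: 0.6,
    };
    println!("{:?}", validate(&draft)); // Ok(())
}
```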
Decay Model
Each tier has its own configurable decay rate via DecayConfig, replacing the former hardcoded 0.995. The
DecayCalculator applies tier-aware decay using these rates.
| Tier | Behavior | Default Decay Rate | 7-day Retention |
|---|---|---|---|
| Conversation | No decay (ephemeral) | N/A | N/A |
| Channel | Fast decay | 0.990/hr | ~18% |
| Workspace | Moderate decay | 0.995/hr | ~43% |
| Account | Slow decay | 0.998/hr | ~71% |
DecayConfig defaults (built-in):
- Account: 0.999
- Workspace: 0.998
- Channel: 0.995
- Conversation: 0.99
Formula: relevance = importance * decay_rate^hours * (1 + ln(1 + access_count))
Three factors:
- Base importance — LLM-assigned at creation time (0.0-1.0).
- Time decay — Exponential decay with per-tier rate from DecayConfig. Memories accessed less recently decay faster.
- Frequency boost — Logarithmic boost from access count. Memories retrieved often resist decay.
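A minimal sketch of the formula above (the function name is illustrative, not the actual DecayCalculator API):

```rust
// relevance = importance * decay_rate^hours * (1 + ln(1 + access_count))
fn relevance(importance: f64, decay_rate: f64, hours: f64, access_count: u64) -> f64 {
    importance * decay_rate.powf(hours) * (1.0 + (1.0 + access_count as f64).ln())
}

fn main() {
    // A workspace-tier memory (rate 0.995/hr) created 168 hours (7 days) ago,
    // importance 0.8, retrieved 5 times.
    println!("{:.3}", relevance(0.8, 0.995, 168.0, 5));
}
```

Note how the frequency boost counteracts time decay: the 7-day-old memory above still scores close to its original importance because it has been retrieved several times.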
Channels
Channels are a topical scoping layer — virtual rooms that group conversations by topic within a workspace.
- Defined as the `Channel` entity in `crates/domain/src/channel/mod.rs` with workspace-scoped identity.
- Fields: `id`, `workspace_id`, `name`, `description`, `is_default`, `created_at`.
- A default workspace channel (`is_default = true`) exists for every workspace (analogous to `#general`).
- One channel has many conversations; a conversation belongs to one channel.
- Schema: `channels` table in workspace-level `agents.db` (V004). `workspace_id` and `is_default` columns added in V007.
- Channel-scoped memory retrieval: `search_text()` and `search_embedding()` accept a `channel_id: Option<Uuid>` parameter to filter channel-tier results to the active channel.
- System-managed initially; user-managed channel creation is available for organizing conversations by topic.
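The channel-tier filtering behavior can be illustrated with a simplified predicate; the types here are stand-ins for the domain entities, not the real repository API:

```rust
// Simplified stand-ins for the domain types, to illustrate channel-tier
// filtering during retrieval.
#[derive(Clone, Copy, PartialEq)]
enum Scope { Conversation, Channel, Workspace, Account }

struct Mem { scope: Scope, channel_id: Option<u64> }

// Only channel-tier memories are filtered; other tiers always pass.
fn visible(mem: &Mem, active_channel: Option<u64>) -> bool {
    match (mem.scope, active_channel) {
        (Scope::Channel, Some(id)) => mem.channel_id == Some(id),
        _ => true, // no channel filter requested, or a non-channel tier
    }
}

fn main() {
    let m = Mem { scope: Scope::Channel, channel_id: Some(7) };
    println!("{}", visible(&m, Some(7)));
}
```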
Conversation Entity
Conversation is a first-class domain entity (crates/domain/src/conversation/mod.rs) that tracks a single agent
interaction session within a channel.
Fields:
| Field | Type | Description |
|---|---|---|
| `id` | UUID | Unique identifier |
| `ref_code` | RefCode (11-char) | Stable, URL-safe base62 reference code |
| `channel_id` | UUID | The channel this conversation belongs to |
| `started_at` | DateTime | When the agent session started |
| `ended_at` | Option<DateTime> | Set when transitioning from Active to Idle |
| `status` | ConversationStatus | Current lifecycle status |
Lifecycle:
```
Active ──→ Idle ──→ Archived
  │                    ↑
  └────────────────────┘
      (force archive)
```
- Active — Agent is currently working. `ended_at` is null.
- Idle — Agent session ended normally. `ended_at` is set. Available for review.
- Archived — User manually archived. Excluded from default lists.
State transitions are validated by Conversation::idle() and Conversation::archive(). Both return
ConversationTransitionError for invalid transitions (e.g., idling an already-idle conversation, archiving an already-archived one).
Tauri commands: start_conversation, list_conversations, get_conversation, idle_conversation, archive_conversation.
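The transition rules can be sketched with simplified free functions; in the actual crate these are methods on the Conversation entity, so the shapes below are assumptions:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConversationStatus { Active, Idle, Archived }

#[derive(Debug, PartialEq)]
struct ConversationTransitionError { from: ConversationStatus }

// Conversation::idle(): only Active -> Idle is legal; ended_at would be set here.
fn idle(status: ConversationStatus) -> Result<ConversationStatus, ConversationTransitionError> {
    match status {
        ConversationStatus::Active => Ok(ConversationStatus::Idle),
        other => Err(ConversationTransitionError { from: other }),
    }
}

// Conversation::archive(): Idle -> Archived, or Active -> Archived (force archive).
fn archive(status: ConversationStatus) -> Result<ConversationStatus, ConversationTransitionError> {
    match status {
        ConversationStatus::Active | ConversationStatus::Idle => Ok(ConversationStatus::Archived),
        ConversationStatus::Archived => {
            Err(ConversationTransitionError { from: ConversationStatus::Archived })
        }
    }
}

fn main() {
    println!("{:?}", idle(ConversationStatus::Active));   // Ok(Idle)
    println!("{:?}", archive(ConversationStatus::Idle));  // Ok(Archived)
}
```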
Storage Architecture
Physical storage uses agents.db at both account and workspace level (dual-database architecture):
```
{tauri_data_dir}/
+-- agents.db             # Account-level: account-tier memories only
|
+-- workspaces/
    +-- {workspace}/
        +-- inklings.db   # Workspace data (existing)
        +-- agents.db     # Workspace-level: conversation, channel,
                          #   workspace memories + channel/conversation tables
```
Schema (V004)
The memories table includes:
- `id`, `scope`, `lifetime`, `source`, `content`, `importance`
- `access_count`, `created_at`, `accessed_at`
- `tags` (JSON array), `embedding` (f32 BLOB)
- `source_conversation_id` — conversation that produced this memory
- `channel_id` — channel scope for channel-tier memories
Supporting tables:
- `channels` — channel definitions with workspace scoping
- `conversations` — conversation records with channel assignment
Migration V004 renamed the ownership column to source and added channel_id and source_conversation_id
columns to the memories table, plus channels and conversations tables.
Physical isolation prevents cross-workspace memory leakage. Scope isolation is enforced by database separation, not row-level filtering.
Embedding storage: Raw f32 little-endian BLOB (768 dimensions = 3,072 bytes per embedding).
Workspace deletion cleanly removes all workspace-scoped agent memory without orphan cleanup.
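The BLOB layout can be sketched as a round-trip between an embedding vector and its raw little-endian bytes; the helper names are illustrative:

```rust
// Serialize an embedding as a raw little-endian f32 BLOB (4 bytes per dimension).
fn encode_embedding(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|f| f.to_le_bytes()).collect()
}

// Deserialize the BLOB back into a vector.
fn decode_embedding(blob: &[u8]) -> Vec<f32> {
    blob.chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

fn main() {
    // 768 dimensions * 4 bytes = 3,072 bytes, matching the figure above.
    let blob = encode_embedding(&vec![0.0f32; 768]);
    println!("{} bytes", blob.len()); // 3072 bytes
}
```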
Scratchpad
The scratchpad is agent-private working state — a per-session key-value store for structured temporary data, coupled to
a Conversation entity.
- NOT sent verbatim with LLM calls — it is working memory the agent reads selectively (personal notepad metaphor).
- NOT a context-window extension — it is structured key-value storage, not an overflow buffer.
- Use cases: tracking partial results, accumulating findings across tool calls, maintaining task state.
- Stores intermediate results from Researchers for the Orchestrator to query.
Lifecycle coupling with Conversation:
- Scratchpad entries are keyed by `conversation_id` (UUID).
- Entries are preserved on archive — archiving a conversation does not delete its scratchpad entries. This preserves the working trail from an interaction for later review.
- Entries are cleared by calling `clear_scratchpad` (the `ClearScratchpadUseCase`) explicitly, typically at session start or when the agent determines the slate should be clean.
Schema: scratchpads table in workspace-level agents.db (V001). Columns: conversation_id, key, value
(JSON), updated_at, created_at. Primary key is (conversation_id, key).
Tauri commands: write_scratchpad, read_scratchpad, list_scratchpad, clear_scratchpad.
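An in-memory stand-in illustrates the `(conversation_id, key)` primary key and the per-conversation clear semantics; the real store is the SQLite table above, and all names here are illustrative:

```rust
use std::collections::HashMap;

// In-memory stand-in for the scratchpads table; the primary key is
// (conversation_id, key), and values are JSON strings.
struct Scratchpad {
    entries: HashMap<(String, String), String>,
}

impl Scratchpad {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }
    // write_scratchpad: upsert one entry for a conversation.
    fn write(&mut self, conversation_id: &str, key: &str, value: &str) {
        self.entries
            .insert((conversation_id.to_string(), key.to_string()), value.to_string());
    }
    // read_scratchpad: fetch one entry by conversation and key.
    fn read(&self, conversation_id: &str, key: &str) -> Option<&String> {
        self.entries.get(&(conversation_id.to_string(), key.to_string()))
    }
    // clear_scratchpad: explicit wipe, scoped to one conversation only.
    fn clear(&mut self, conversation_id: &str) {
        self.entries.retain(|(c, _), _| c.as_str() != conversation_id);
    }
}

fn main() {
    let mut pad = Scratchpad::new();
    pad.write("conv-1", "findings", "{\"sources\": 3}");
    println!("{:?}", pad.read("conv-1", "findings"));
}
```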
Memory UI: Scratchpad entries appear in the Conversation scope of the Memory view alongside memory entries. They display with a “Working Note” badge to distinguish them from learned memories. Entries are read-only in the UI (the agent manages them).
Retrieval Pipeline
Memory retrieval follows a hierarchical widening scope pattern:
- Conversation memories (current session context)
- Channel memories (topical context)
- Workspace memories (project knowledge)
- Account memories (cross-workspace preferences)
Search: Hybrid FTS5 text + embedding similarity search, merged via Reciprocal Rank Fusion (RRF, k=60).
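The RRF merge can be sketched as follows: each result contributes 1/(k + rank) for every ranked list (text, embedding) it appears in, with k = 60. The function shape is an assumption, not the actual search router API:

```rust
use std::collections::HashMap;

// Reciprocal Rank Fusion over the FTS5 and embedding result lists.
// Each appearance adds 1/(k + rank), with 1-based ranks.
fn rrf_merge(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (i, id) in list.iter().enumerate() {
            *scores.entry((*id).to_string()).or_default() += 1.0 / (k + (i + 1) as f64);
        }
    }
    let mut merged: Vec<_> = scores.into_iter().collect();
    merged.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    merged
}

fn main() {
    let text_hits = vec!["mem-a", "mem-b"];
    let embedding_hits = vec!["mem-b", "mem-c"];
    // "mem-b" appears in both lists, so it ranks first after fusion.
    println!("{:?}", rrf_merge(&[text_hits, embedding_hits], 60.0));
}
```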
Channel-scoped retrieval: When a channel_id is provided, both search_text() and search_embedding() filter
channel-tier results to the active channel. This enables topical isolation — memories from unrelated channels do not
pollute retrieval results.
Per-tier decay scoring is applied during retrieval to rank results by current relevance.
Caution: Hierarchical weighting + per-tier decay can compound (double-discount effect). Implementation must balance these two scoring dimensions carefully.
Dedup: Two-layer deduplication strategy:
- Embedding similarity threshold — memories with embedding cosine similarity > 0.9 are treated as duplicates during the RRF merge.
- Text-similarity fallback — when similarity scores cannot reach the 0.9 threshold (e.g., when the embedding provider is unavailable; scores top out around ~0.122 in FTS-only mode), exact text matching catches duplicates that would otherwise be missed. This ensures dedup works in FTS-only mode.
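The two-layer decision can be sketched as an illustrative helper (not the actual implementation): compare embeddings when both exist, otherwise fall back to exact text match.

```rust
// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

// Layer 1: cosine similarity > 0.9 when both embeddings exist.
// Layer 2: exact text match when they do not (FTS-only mode).
fn is_duplicate(a: (&str, Option<&[f32]>), b: (&str, Option<&[f32]>)) -> bool {
    match (a.1, b.1) {
        (Some(x), Some(y)) => cosine(x, y) > 0.9,
        _ => a.0 == b.0,
    }
}

fn main() {
    let x = [1.0f32, 0.0];
    let y = [0.98f32, 0.2];
    println!("{}", is_duplicate(("note", Some(&x)), ("note v2", Some(&y))));
}
```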
Embedding Backfill
The EmbeddingBackfillTask is a ScheduledTask that backfills missing embedding vectors for memory entries.
- Schedule: Runs every 30 minutes with a 60-second initial delay.
- Batch size: Processes up to 50 memories per run.
- Graceful degradation: Skips execution when no embedding provider is available (e.g., embedding model not yet downloaded). No errors logged — the task simply returns early.
- Scope: Finds memories where `embedding IS NULL` and generates embeddings using the workspace embedding provider.
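One backfill pass under these rules can be sketched as follows; the signature is a simplification of the real ScheduledTask, which works through the repository and embedding provider traits:

```rust
// One pass of the backfill task: return early when no provider is
// available, otherwise embed up to `batch` pending memories.
fn backfill_pass(
    pending: &mut Vec<String>,                   // memories where embedding IS NULL
    provider: Option<&dyn Fn(&str) -> Vec<f32>>, // workspace embedding provider
    batch: usize,
) -> usize {
    let Some(embed) = provider else {
        return 0; // graceful degradation: no provider, no work, no errors logged
    };
    let take = pending.len().min(batch);
    for text in pending.drain(..take) {
        let _vector = embed(&text); // would be persisted to the memories table
    }
    take
}

fn main() {
    let mut pending: Vec<String> = (0..60).map(|i| format!("memory {i}")).collect();
    let embed = |_t: &str| vec![0.0f32; 768];
    let done = backfill_pass(&mut pending, Some(&embed), 50);
    println!("embedded {done}, remaining {}", pending.len());
}
```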
Session Lifecycle: Orient -> Work -> Persist
The memory system integrates with the agent session through a three-phase lifecycle:
Orient
At conversation start, the Context Pipeline (deterministic infrastructure, not an agent) loads relevant memories and
produces an OrientationDocument — structured markdown injected into the system prompt. This runs in parallel with
skill classification to minimize latency. The pipeline selects context (~100ms, 0 tokens); a Refinement Gate (single
Cheap LLM call, ~500ms) accepts or refines the selection.
Work
During the conversation, the Orchestrator extracts observations as structured secondary output and stores them via the memory use cases. Importance scoring and scope assignment happen at write time.
Persist
When the conversation ends, mechanical extraction reviews the conversation summary and extracts/refines final observations. This serves as a safety net to capture anything missed during the Work phase. Background consolidation runs as a best-effort scheduled task.
Consolidation
Consolidation is a scheduled background task with best-effort catch-up — missed schedules are deduped and run on next startup.
Pipeline:
- Score — Calculate relevance for all short-term memories using the decay formula
- Promote — Move short-term memories to long-term if relevance > 0.7 and access_count > 3
- Prune — Remove memories with relevance below 0.01
- Dedup — Merge memories with embedding cosine similarity > 0.95
- Cap enforcement — Enforce 10,000 memories per scope
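The promote/prune thresholds above can be sketched as a pure decision function; this is an illustration, not the actual consolidation task API:

```rust
#[derive(Debug, PartialEq)]
enum ConsolidationAction { Promote, Prune, Keep }

// Decision per short-term memory, using the already-computed relevance score.
fn consolidate(relevance: f64, access_count: u64) -> ConsolidationAction {
    if relevance < 0.01 {
        ConsolidationAction::Prune
    } else if relevance > 0.7 && access_count > 3 {
        ConsolidationAction::Promote
    } else {
        ConsolidationAction::Keep
    }
}

fn main() {
    println!("{:?}", consolidate(0.85, 5)); // Promote
}
```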
Memory Management UI
The memory management interface is a full-page view (not a sidebar panel) due to content density:
- Browse, search, edit, create, and delete memories
- Source filtering — filter by which agent type produced each memory
- Typed Specta bindings (not raw `invoke()` calls)
Import/Export
- JSON format with schema version for portability
- Supports backup and migration between installations
Implementation Status
| Component | Status | Notes |
|---|---|---|
| Domain entities | Implemented | Source provenance, 4-tier scope, DecayConfig, DecayCalculator, Conversation, Channel |
| Application use cases | Implemented | 8 memory use cases + search router + consolidation task + 5 conversation use cases |
| SQLite repository | Implemented | V007 schema — ref_code on conversations, workspace_id/is_default on channels |
| Scratchpad commands | Implemented | write_scratchpad, read_scratchpad, list_scratchpad, clear_scratchpad (typed Specta) |
| Conversation commands | Implemented | start_conversation, list_conversations, get_conversation, idle_conversation, archive_conversation |
| Embedding backfill | Implemented | EmbeddingBackfillTask as ScheduledTask (30-min, batch of 50) |
| ContextExternalizer | Exists | Orient + Persist logic for agent-core integration |
| React UI | Implemented | Memory view with edit support; scratchpad entries merged in Conversation scope with Working Note badge |
| Agent-core integration | Scaffolded | Integration surface defined via memory service traits |