# Skill System

Crate: `crates/infrastructure/agent-harness/`
## Overview

Skills are multi-artifact packages, not single-template prompts. A skill packages together multiple artifacts of
different kinds — descriptions, prompt templates, code templates, DSPy modules — that collectively define a reusable
agent capability. The system dispatches to different handlers based on artifact kind, replacing the previous
ExecutionMode enum (FreeForm/Templated/Blueprint).
## Design Principles

- Skills are prompt content, not compiled code. A new skill is a new set of prompt documents, not a new crate version. Agent capabilities evolve independently of binary releases.
- Artifact-kind dispatch replaces execution modes. Instead of a skill declaring itself as “FreeForm” or “Blueprint”, each artifact declares its kind. The handler is selected per-artifact.
- Two-phase activation. Phase 1 (Context Pipeline) matches skill metadata cheaply via deterministic search. Phase 2 (Orchestrator) selects and loads specific artifacts. Full artifact content is never loaded until needed.
- Dual-scope storage. Skills live in `agents.db` at both account and workspace levels. Workspace skills override account skills by name.
- Dual execution contexts. `code_template` artifacts run in the WASM sandbox (CPython-in-Wasmtime); `dspy_module` artifacts are proxied to the Python sidecar (`apps/python-sidecar/`).
- Multi-artifact graceful degradation. `dspy_module` skills must include a `prompt_template` fallback for environments where the sidecar is unavailable.
- Template-only DSPy execution. `dspy_module` artifacts reference known shipped templates by ID and store serialized state (JSON) in `state_blob` — no user-supplied Python code enters the sidecar.
## Skill Package Schema

### Database Schema

Skills are stored across two tables in `agents.db`:
```sql
-- Skill packages
CREATE TABLE skills (
    id TEXT PRIMARY KEY,                    -- UUID
    name TEXT NOT NULL UNIQUE,              -- Human-readable identifier
    description TEXT NOT NULL,              -- Short description for catalog display
    version TEXT NOT NULL DEFAULT '1.0.0',
    tags TEXT,                              -- JSON array of string tags
    source TEXT NOT NULL DEFAULT 'user',    -- 'system' | 'community' | 'user'
    author TEXT,                            -- Display name of author
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at TEXT NOT NULL DEFAULT (datetime('now')),

    -- Metadata for pipeline matching (Phase 1)
    intent_patterns TEXT,                   -- JSON array of intent pattern strings
    trigger_phrases TEXT,                   -- JSON array of trigger phrase strings

    -- Sync and distribution
    marketplace_id TEXT,                    -- NULL for local-only skills
    sync_etag TEXT                          -- For marketplace version checking
);

CREATE INDEX idx_skills_source ON skills(source);
CREATE INDEX idx_skills_name ON skills(name);

-- Skill artifacts (one-to-many with skills)
CREATE TABLE skill_artifacts (
    id TEXT PRIMARY KEY,                    -- UUID
    skill_id TEXT NOT NULL REFERENCES skills(id) ON DELETE CASCADE,
    kind TEXT NOT NULL,                     -- Artifact kind enum (see below)
    name TEXT NOT NULL,                     -- Human-readable artifact name
    ordinal INTEGER NOT NULL DEFAULT 0,     -- Display/execution ordering
    content TEXT NOT NULL,                  -- The artifact content (template, code, etc.)
    model_variant TEXT,                     -- NULL = default; else model family key
    metadata TEXT,                          -- JSON: kind-specific config
    state_blob BLOB,                        -- Serialized DSPy module state (dspy_module only)
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at TEXT NOT NULL DEFAULT (datetime('now')),

    UNIQUE(skill_id, name, model_variant)
);

CREATE INDEX idx_skill_artifacts_skill_id ON skill_artifacts(skill_id);
CREATE INDEX idx_skill_artifacts_kind ON skill_artifacts(kind);
```

### Artifact Kind Enum
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum ArtifactKind {
    /// High-level description of the skill's purpose and approach.
    Description,
    /// MiniJinja template rendered with context variables before LLM call.
    PromptTemplate,
    /// Python code template executed in the RLM sandbox (WASM).
    CodeTemplate,
    /// Structured DSPy module executed in the Python sidecar.
    DspyModule,

    // -- Deferred kinds (schema supports, handlers not yet implemented) --
    // /// Strategy or methodology guidance (injected into system prompt).
    // Approach,
    // /// Concrete input/output examples for few-shot prompting.
    // Example,
}
```

V1 artifact kinds: `description`, `prompt_template`, `code_template`, `dspy_module`. The `approach` and `example`
kinds are supported in the schema but their handlers are deferred.
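To make the snake_case round trip concrete, here is a serde-free sketch of the mapping between the enum and the `kind` column text. The `from_db`/`as_db_str` helpers are hypothetical, not part of the crate; the real type derives this mapping via serde's `rename_all`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArtifactKind {
    Description,
    PromptTemplate,
    CodeTemplate,
    DspyModule,
}

impl ArtifactKind {
    /// Parse the snake_case `kind` column value from skill_artifacts.
    pub fn from_db(s: &str) -> Option<Self> {
        match s {
            "description" => Some(Self::Description),
            "prompt_template" => Some(Self::PromptTemplate),
            "code_template" => Some(Self::CodeTemplate),
            "dspy_module" => Some(Self::DspyModule),
            // Deferred kinds (approach, example) have no handler yet.
            _ => None,
        }
    }

    /// The snake_case form stored in the database.
    pub fn as_db_str(&self) -> &'static str {
        match self {
            Self::Description => "description",
            Self::PromptTemplate => "prompt_template",
            Self::CodeTemplate => "code_template",
            Self::DspyModule => "dspy_module",
        }
    }
}

fn main() {
    let kind = ArtifactKind::from_db("prompt_template").unwrap();
    assert_eq!(kind.as_db_str(), "prompt_template");
    // Deferred kinds parse to None until their handlers ship.
    assert!(ArtifactKind::from_db("approach").is_none());
}
```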
### Example Skill Package

“Consistency Checker” — Analyzes workspace content for contradictions.
| Artifact | Kind | Purpose |
|---|---|---|
| Overview | description | Explains what the skill does and when to use it |
| Check Prompt | prompt_template | MiniJinja template for per-page-pair contradiction analysis |
| Cross-Reference Script | code_template | Python script that traverses workspace pages, extracts claims, and clusters them for LLM review |
| DSPy Checker | dspy_module | References shipped chain_of_thought template with optimized state for contradiction detection |
```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Consistency Checker",
  "description": "Analyzes workspace content for contradictions and inconsistencies across pages.",
  "version": "1.0.0",
  "tags": ["analysis", "worldbuilding", "quality"],
  "source": "system",
  "intent_patterns": ["check consistency", "find contradictions", "verify facts"],
  "trigger_phrases": ["are there any contradictions", "check for inconsistencies"],
  "artifacts": [
    {
      "kind": "description",
      "name": "Overview",
      "ordinal": 0,
      "content": "The Consistency Checker skill analyzes content across your workspace to identify contradictions, inconsistencies, and conflicting statements. It is most useful for worldbuilding projects with many interrelated entities (characters, locations, timelines)."
    },
    {
      "kind": "prompt_template",
      "name": "Check Prompt",
      "ordinal": 1,
      "content": "Given the following claims about \"{{ entity_name }}\":\n\n{% for claim in claims %}Claim {{ loop.index }} (from [[{{ claim.page_name }}]]): {{ claim.text }}\n{% endfor %}\n\nIdentify any contradictions between these claims. For each contradiction, cite the specific claims by number and explain the conflict."
    },
    {
      "kind": "code_template",
      "name": "Cross-Reference Script",
      "ordinal": 2,
      "content": "import inklings\n\npages = inklings.search('{{ entity_type }}')\nclaims = []\nfor page in pages:\n    content = inklings.get_page(page['slug'])\n    extracted = inklings.llm_query(f'Extract factual claims about {entity_type} entities from: {content}')\n    claims.extend(extracted)\ninklings.submit({'entity_type': '{{ entity_type }}', 'claims': claims})"
    },
    {
      "kind": "dspy_module",
      "name": "DSPy Checker",
      "ordinal": 3,
      "content": "{\"template_id\": \"chain_of_thought\", \"signature\": \"claims -> contradictions\"}",
      "state_blob": "<base64-encoded JSON from DSPy module.save(path, save_program=False)>",
      "metadata": { "fallback_artifact": "Check Prompt" }
    }
  ]
}
```

## Artifact Kinds
### description

High-level explanation of the skill’s purpose, target audience, and when it should be activated. Loaded during Phase 2 (Orchestrator review) to confirm skill relevance before loading heavier artifacts.
| Property | Value |
|---|---|
| Handler | None (informational; injected into Orchestrator context) |
| Context cost | Low (~100-300 tokens) |
| When used | Always loaded when skill is activated |
### prompt_template

A MiniJinja template that is rendered with context variables (entity names, page content, user query, etc.) and sent as a user message to the LLM. This is the primary mechanism for structured LLM interactions.
| Property | Value |
|---|---|
| Handler | PromptEngine::render_str() -> LLM call |
| Context cost | Variable (depends on template + rendered context) |
| When used | Per-step during skill execution |
Model variants: A prompt_template artifact may have multiple rows with different model_variant values (e.g.,
"anthropic", "openai", "default"). The PromptEngine selects the best match for the active model. See
PromptEngine Simplification.
### code_template

A Python script template executed in the RLM sandbox (CPython-in-Wasmtime). Template variables are rendered before
execution. The script has access to the inklings host module for workspace queries and llm_query for sub-LM calls.
| Property | Value |
|---|---|
| Handler | Template render -> RlmExecutor::execute() |
| Execution context | WASM sandbox (Wasmtime) |
| Context cost | Zero (runs outside LLM context window) |
| When used | For workspace-scale analysis requiring iteration |
Security: Code templates run inside the WASM sandbox with fuel metering, memory limits, and no filesystem/network
access. The inklings Python module is the only bridge to workspace data.
Host functions available: workspace_search, get_page, get_pages, get_pages_by_type, get_tags,
get_references, get_history, llm_query, llm_query_batched, checkpoint, submit. See
Agent Core System — Host Functions for the complete list.
### dspy_module

A structured DSPy module that defines an LLM program with declarative signatures, chain-of-thought reasoning, and
optimizable prompts. Executed in the Python sidecar (apps/python-sidecar/), not the WASM sandbox, because DSPy
requires full Python ecosystem access (native C extensions like pydantic-core, numpy).
| Property | Value |
|---|---|
| Handler | run_skill host function -> DSPy sidecar |
| Execution context | Python sidecar (PyInstaller binary) |
| Context cost | Zero (runs outside LLM context window) |
| When used | For structured LLM programs with DSPy optimization |
Template-only execution model: The dspy_module artifact does NOT contain Python source code. Instead, it
references a known execution template by template_id and stores serialized DSPy state (JSON) in state_blob. The
sidecar ships with a fixed set of DSPy module class definitions (e.g., Predict, ChainOfThought, ReAct, custom
pipelines). At execution time, the sidecar instantiates the known class, calls .load() with the state blob, and
executes.
This eliminates the untrusted code execution path entirely — no user-supplied Python code enters the sidecar runtime.
Artifact schema:
- `content`: JSON containing `template_id` (references a shipped template) and execution parameters
- `state_blob`: Serialized DSPy state (JSON from `module.save(path, save_program=False)`) — contains optimized few-shot demos, signature customizations, and LM settings
- `metadata.fallback_artifact`: Points to a `prompt_template` artifact for graceful degradation
Execution architecture: The code_template or Orchestrator calls run_skill(skill_id, params) as a host function.
This proxies execution to the Python sidecar process. The sidecar receives the template_id, state_blob, and
parameters, instantiates the known template class, loads the optimized state, executes, and returns structured output.
LLM access: The sidecar uses InklingsLM, a custom DSPy LM subclass that routes all LLM completions through
bidirectional IPC back to Rust, where the ProviderRegistry (Rig-based) handles the actual API call. Zero API keys
exist in the sidecar environment.
Graceful degradation: Every dspy_module artifact must specify a fallback_artifact in its metadata field,
pointing to a prompt_template artifact in the same skill package. If the sidecar is unavailable (not installed, failed
to start), the system falls back to the prompt_template artifact transparently.
State management: DSPy modules persist optimized state in the state_blob column of skill_artifacts. The state is
portable JSON produced by module.save(path, save_program=False) — containing optimized few-shot demos, signature
definitions, and LM settings. Re-optimization is triggered by version upgrades (DSPy version bump in sidecar) via the
Skill Composer.
Version upgrade: When the sidecar ships a new DSPy version, the app detects the version change and re-runs
optimization for all skills with existing state_blob data via the Skill Composer. This is a background task on first
launch after update.
DSPy sidecar process lifecycle:
```
App startup
    |
    v
[Sidecar not started - lazy]
    |
    v  (first dspy_module execution or Skill Composer invocation)
[Start DSPy sidecar]
    |-- PyInstaller binary ships with app as Tauri externalBin
    |-- No setup required — self-contained
    |
    v
[Sidecar running - long-lived]
    |-- Two service modes:
    |     |-- Execute: template_id + state_blob + params -> result
    |     |-- Optimize: template_id + training_data -> state_blob (JSON)
    |-- All LLM calls routed through IPC -> Rust -> ProviderRegistry
    |-- Template manifest queryable via IPC
    |
    v  (idle timeout or app shutdown)
[Sidecar stopped]
```

Python sidecar distribution:
The app ships a PyInstaller binary as a Tauri externalBin. This binary contains Python + DSPy + all dependencies
(pydantic-core, litellm, numpy) in a single executable. No Python installation, venv setup, or package management
required on the user’s machine.
| Property | Value |
|---|---|
| Distribution | PyInstaller single-file executable (~70 MB per platform) |
| Tauri config | externalBin in tauri.conf.json |
| Dependencies | Python 3.12+ runtime, DSPy, pydantic, litellm, numpy |
| Setup required | None — self-contained binary |
| Update mechanism | Ships with app updates; version bump triggers re-optimize |
run_skill host function:
The bridge between the WASM sandbox (or Orchestrator) and the Python sidecar:
```
Worker (in WASM or agent loop)
    |
    v
run_skill(skill_id, params)
    |
    v
[Load artifact: template_id + state_blob]
    |
    v
[IPC to DSPy sidecar process]
    |
    v
[Sidecar instantiates known template class by template_id]
[Sidecar loads optimized state from state_blob JSON]
[Sidecar executes module with params]
[LLM calls routed back through IPC -> Rust -> ProviderRegistry]
    |
    v
[Return structured result via IPC]
    |
    v
Worker receives result
```

Sidecar service modes:
- Execute: `{type: "execute", template_id, state_blob, params}` -> instantiate template, load state, run, return result
- Optimize: `{type: "optimize", template_id, training_data}` -> run DSPy `compile()`, return new `state_blob` JSON
- Manifest: `{type: "manifest"}` -> return list of available templates with their signatures
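On the Rust side, the three service modes can be modeled as a plain request enum. This is an illustrative std-only sketch; the real IPC layer would use serde for serialization, and field names beyond `type`, `template_id`, `state_blob`, `params`, and `training_data` are assumptions.

```rust
/// The three sidecar request shapes, modeled as a plain enum.
enum SidecarRequest {
    Execute { template_id: String, state_blob: String, params: String },
    Optimize { template_id: String, training_data: String },
    Manifest,
}

impl SidecarRequest {
    /// Serialize to the JSON wire form by hand (a real implementation
    /// would use serde; this sketch stays std-only and skips escaping).
    fn to_json(&self) -> String {
        match self {
            SidecarRequest::Execute { template_id, state_blob, params } => format!(
                r#"{{"type":"execute","template_id":"{}","state_blob":"{}","params":{}}}"#,
                template_id, state_blob, params
            ),
            SidecarRequest::Optimize { template_id, training_data } => format!(
                r#"{{"type":"optimize","template_id":"{}","training_data":{}}}"#,
                template_id, training_data
            ),
            SidecarRequest::Manifest => r#"{"type":"manifest"}"#.to_string(),
        }
    }
}

fn main() {
    let req = SidecarRequest::Execute {
        template_id: "chain_of_thought".into(),
        state_blob: "<base64>".into(),
        params: r#"{"claims":[]}"#.into(),
    };
    // The dspy_module handler would write this frame over IPC.
    assert!(req.to_json().contains(r#""type":"execute""#));
    assert_eq!(SidecarRequest::Manifest.to_json(), r#"{"type":"manifest"}"#);
}
```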
## Two-Phase Activation

Skill activation is split into two phases to minimize latency and token cost on every message.
### Phase 1: Deterministic Metadata Match (Context Pipeline)

The Context Pipeline performs embedding/keyword match against the user message and the skill catalog — a lightweight index of skill names, descriptions, tags, intent patterns, and trigger phrases. Full artifact content is NOT loaded.
The pipeline returns ranked skill recommendations as part of the ContextPackage. This phase is deterministic (no LLM
call) and completes in under 100ms.
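A keyword-only approximation of this matching step might look like the following. `SkillMeta`, the scoring weights, and the helper names are all hypothetical; the real pipeline also uses embeddings, which this sketch omits.

```rust
/// Minimal catalog entry: just the metadata Phase 1 needs.
struct SkillMeta {
    name: &'static str,
    trigger_phrases: Vec<&'static str>,
    intent_patterns: Vec<&'static str>,
}

/// Score a skill against the user message. Exact trigger-phrase hits
/// weigh more than intent-pattern hits (weights are illustrative).
fn score(meta: &SkillMeta, message: &str) -> u32 {
    let msg = message.to_lowercase();
    let triggers = meta.trigger_phrases.iter()
        .filter(|p| msg.contains(&p.to_lowercase())).count() as u32;
    let intents = meta.intent_patterns.iter()
        .filter(|p| msg.contains(&p.to_lowercase())).count() as u32;
    triggers * 10 + intents * 3
}

/// Rank all catalog entries with a nonzero score, highest first.
fn recommend(catalog: &[SkillMeta], message: &str) -> Vec<(&'static str, u32)> {
    let mut ranked: Vec<_> = catalog.iter()
        .map(|m| (m.name, score(m, message)))
        .filter(|(_, s)| *s > 0)
        .collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1));
    ranked
}

fn main() {
    let catalog = vec![SkillMeta {
        name: "Consistency Checker",
        trigger_phrases: vec!["check for inconsistencies"],
        intent_patterns: vec!["check consistency", "find contradictions"],
    }];
    let ranked = recommend(&catalog, "Please check for inconsistencies in my timeline");
    assert_eq!(ranked[0].0, "Consistency Checker");
}
```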
```
Input:  user_message + skill_catalog_metadata (embedding/keyword index)
Output: SkillRecommendation[] (ranked by relevance score)
```

### Phase 2: Artifact Selection (Orchestrator)

The Orchestrator receives the pipeline’s recommendations in its ContextPackage. It decides whether to activate a skill
and, if so, which artifacts to load.
Artifact loading is lazy: only the artifacts needed for the current execution step are loaded from agents.db. A
multi-artifact skill may load its description first, then load prompt_template or code_template or dspy_module
artifacts one at a time during execution.
```
Input:  ContextPackage (with skill_recommendations) + conversation history
Output: Decision (direct response | activate skill S with artifacts [A1, A2, ...] | research | clarify)
```

## Artifact-Kind Dispatch

There is no ExecutionMode enum. The skill system dispatches to the correct handler based on each artifact’s kind
field. A single skill execution may invoke multiple handlers in sequence (e.g., load description, render
prompt_template, execute code_template).
### Kind-to-Handler Mapping

| Artifact Kind | Handler | Executor | Output |
|---|---|---|---|
| description | Context injection | N/A | Injected into Orchestrator/Worker context |
| prompt_template | PromptEngine::render_str() | LLM provider | LLM response text |
| code_template | Template render + RlmExecutor::execute() | Wasmtime (CPython) | Structured result from inklings.submit() |
| dspy_module | run_skill host function | Python sidecar | Structured result from inklings.submit() |
### Graceful Degradation

When a skill contains a `dspy_module` artifact, the dispatch logic checks whether the DSPy daemon is available:
- Daemon available: Execute the `dspy_module` artifact directly.
- Daemon unavailable: Look up `fallback_artifact` from the artifact’s metadata and execute the referenced `prompt_template` artifact instead.
- No fallback specified: Return an error indicating the skill requires the Python sidecar.
This ensures skills work across all deployment environments, with DSPy-optimized execution where available and prompt-based fallback where not.
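The fallback chain can be sketched as a small selection function. `Artifact` and `select_executable` are illustrative names, not the crate's API, and metadata lookup is simplified to a map.

```rust
use std::collections::HashMap;

#[derive(Clone)]
struct Artifact {
    kind: &'static str,
    name: &'static str,
    metadata: HashMap<&'static str, &'static str>,
}

/// Pick the artifact to execute, honoring the fallback chain:
/// dspy_module runs directly when the sidecar is up; otherwise the
/// declared prompt_template fallback is used; otherwise it is an error.
fn select_executable<'a>(
    artifact: &'a Artifact,
    skill_artifacts: &'a [Artifact],
    sidecar_available: bool,
) -> Result<&'a Artifact, String> {
    if artifact.kind != "dspy_module" || sidecar_available {
        return Ok(artifact);
    }
    match artifact.metadata.get("fallback_artifact") {
        Some(name) => skill_artifacts.iter()
            .find(|a| a.name == *name && a.kind == "prompt_template")
            .ok_or_else(|| format!("fallback artifact '{name}' not found")),
        None => Err("skill requires the Python sidecar".to_string()),
    }
}

fn main() {
    let prompt = Artifact { kind: "prompt_template", name: "Check Prompt", metadata: HashMap::new() };
    let dspy = Artifact {
        kind: "dspy_module",
        name: "DSPy Checker",
        metadata: HashMap::from([("fallback_artifact", "Check Prompt")]),
    };
    let all = vec![prompt.clone(), dspy.clone()];
    // Sidecar down: transparently degrade to the prompt_template fallback.
    let chosen = select_executable(&dspy, &all, false).unwrap();
    assert_eq!(chosen.name, "Check Prompt");
}
```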
### Dispatch Trait

```rust
#[async_trait]
pub trait ArtifactHandler: Send + Sync {
    /// The artifact kind this handler processes.
    fn kind(&self) -> ArtifactKind;

    /// Execute the artifact with the given context.
    /// Returns the handler's output (LLM response, RLM result, etc.).
    async fn execute(
        &self,
        artifact: &SkillArtifact,
        context: &ExecutionContext,
    ) -> Result<ArtifactOutput, SkillError>;
}

pub struct ArtifactDispatcher {
    handlers: HashMap<ArtifactKind, Arc<dyn ArtifactHandler>>,
}

impl ArtifactDispatcher {
    /// Dispatch an artifact to its registered handler.
    /// For dspy_module, checks daemon availability and falls back if needed.
    pub async fn dispatch(
        &self,
        artifact: &SkillArtifact,
        context: &ExecutionContext,
    ) -> Result<ArtifactOutput, SkillError> {
        let handler = self.handlers.get(&artifact.kind)
            .ok_or(SkillError::NoHandler(artifact.kind))?;
        handler.execute(artifact, context).await
    }
}
```

## Storage Architecture
### Dual-Scope Storage

Skills are stored in `agents.db` at two levels, mirroring the memory architecture:
| Scope | Database | Contains |
|---|---|---|
| Account | ~/.inklings/agents.db | System skills, community skills, user account-level skills |
| Workspace | {workspace}/agents.db | Workspace-specific user skills, workspace overrides |
Both databases use the identical skills + skill_artifacts schema.
### Resolution Order

When the Context Pipeline queries the skill catalog or the Orchestrator loads artifacts, skills are resolved in this order:

1. Workspace `agents.db` — workspace-specific skills take highest priority.
2. Account `agents.db` — account-level skills (system, community, user).
If a skill with the same name exists at both levels, the workspace version wins. This allows users to fork a system
skill into their workspace for customization without affecting other workspaces.
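Resolution by name reduces to an ordered lookup. In this sketch, plain `HashMap`s stand in for the two `SkillStorageRepository` scopes; the `resolve` helper is illustrative.

```rust
use std::collections::HashMap;

/// Workspace-wins lookup by skill name: try the workspace scope first,
/// then fall through to the account scope.
fn resolve<'a>(
    workspace: &'a HashMap<String, String>, // name -> skill (content elided)
    account: &'a HashMap<String, String>,
    name: &str,
) -> Option<&'a String> {
    workspace.get(name).or_else(|| account.get(name))
}

fn main() {
    let mut account = HashMap::new();
    account.insert("Consistency Checker".to_string(), "system version".to_string());
    let mut workspace = HashMap::new();
    workspace.insert("Consistency Checker".to_string(), "forked version".to_string());

    // The workspace fork shadows the account-level system skill.
    assert_eq!(resolve(&workspace, &account, "Consistency Checker").unwrap(), "forked version");

    // With no workspace override, lookups fall through to the account scope.
    let empty = HashMap::new();
    assert_eq!(resolve(&empty, &account, "Consistency Checker").unwrap(), "system version");
}
```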
### SkillStorageRepository Trait

```rust
#[async_trait]
pub trait SkillStorageRepository: Send + Sync {
    /// List all skills visible at this scope (metadata only, no artifacts).
    async fn list_skills(&self) -> Result<Vec<SkillSummary>, SkillStorageError>;

    /// Get a skill by ID, including all its artifacts.
    async fn get_skill(&self, skill_id: &str) -> Result<Option<Skill>, SkillStorageError>;

    /// Get a skill by name (for resolution-order lookups).
    async fn get_skill_by_name(&self, name: &str) -> Result<Option<Skill>, SkillStorageError>;

    /// Get specific artifacts for a skill, filtered by kind.
    async fn get_artifacts(
        &self,
        skill_id: &str,
        kinds: &[ArtifactKind],
    ) -> Result<Vec<SkillArtifact>, SkillStorageError>;

    /// Get a specific artifact by ID.
    async fn get_artifact(&self, artifact_id: &str) -> Result<Option<SkillArtifact>, SkillStorageError>;

    /// Create a new skill (without artifacts).
    async fn create_skill(&self, skill: &Skill) -> Result<(), SkillStorageError>;

    /// Update skill metadata.
    async fn update_skill(&self, skill: &Skill) -> Result<(), SkillStorageError>;

    /// Delete a skill and all its artifacts (CASCADE).
    async fn delete_skill(&self, skill_id: &str) -> Result<(), SkillStorageError>;

    /// Add an artifact to a skill.
    async fn add_artifact(&self, artifact: &SkillArtifact) -> Result<(), SkillStorageError>;

    /// Update an artifact.
    async fn update_artifact(&self, artifact: &SkillArtifact) -> Result<(), SkillStorageError>;

    /// Delete an artifact.
    async fn delete_artifact(&self, artifact_id: &str) -> Result<(), SkillStorageError>;
}
```

### Resolved Skill Catalog
The `SkillCatalog` aggregates both scopes and presents a unified view:

```rust
pub struct SkillCatalog {
    workspace_repo: Arc<dyn SkillStorageRepository>,
    account_repo: Arc<dyn SkillStorageRepository>,
}

impl SkillCatalog {
    /// Build the merged catalog with workspace-wins resolution.
    pub async fn list_all(&self) -> Result<Vec<SkillSummary>, SkillStorageError> {
        let mut workspace_skills = self.workspace_repo.list_skills().await?;
        let account_skills = self.account_repo.list_skills().await?;

        let workspace_names: HashSet<_> = workspace_skills.iter()
            .map(|s| s.name.clone())
            .collect();

        // Add account skills that don't have a workspace override.
        for skill in account_skills {
            if !workspace_names.contains(&skill.name) {
                workspace_skills.push(skill);
            }
        }

        Ok(workspace_skills)
    }
}
```

## Distribution and Sync
Section titled “Distribution and Sync”No include_str!()
Section titled “No include_str!()”System skills do not ship as compiled-in strings. They ship as seed data in the initial agents.db schema migration
and are refreshed from the cloud when connected. This avoids binary size bloat and enables skill updates without app
releases.
### Distribution Channels

| Channel | Mechanism | Offline Behavior |
|---|---|---|
| System skills | Seeded in agents.db migration + cloud refresh | Available from seed data |
| Community skills | Marketplace catalog (cloud) | Cached locally after first download |
| User skills | Local only (created by Skill Composer or manual import) | Always available |
### Marketplace Sync

Community and system skills sync from the marketplace using ETags for efficient version checking:
1. On app launch (or a periodic background check), query the marketplace for updated skills.
2. Compare `sync_etag` with the server response.
3. If changed, download the updated skill package and upsert into the account-level `agents.db`.
4. Workspace-level overrides are never modified by marketplace sync.
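The `sync_etag` comparison reduces to a small predicate. This sketch treats both sides as plain values rather than HTTP headers and SQLite rows; the function name is illustrative.

```rust
/// Decide whether a skill package must be re-downloaded:
/// a never-synced skill (no local ETag) or a changed server ETag
/// both trigger an upsert into the account-level agents.db.
fn needs_update(local_etag: Option<&str>, server_etag: &str) -> bool {
    match local_etag {
        Some(etag) => etag != server_etag, // changed on the server
        None => true,                      // never synced locally
    }
}

fn main() {
    assert!(needs_update(None, "W/\"v2\""));              // first download
    assert!(needs_update(Some("W/\"v1\""), "W/\"v2\""));  // server has newer
    assert!(!needs_update(Some("W/\"v2\""), "W/\"v2\"")); // up to date; skip
}
```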
### Owner-Only Sync

Skill sync between devices uses the existing Supabase sync infrastructure. Only the workspace owner syncs skills to the cloud — this prevents skill duplication across collaborators. Non-owners receive system and community skills through the marketplace channel.
## PromptEngine Simplification

The PromptEngine is simplified from a struct-heavy rendering pipeline to a single `render_str` function that accepts
flexible context.
```rust
pub struct PromptEngine {
    engine: minijinja::Environment<'static>,
}

impl PromptEngine {
    /// Render a template string with the given context values.
    /// Context is a flat key-value map -- the caller assembles it from whatever
    /// sources are relevant (user query, page content, memory, etc.).
    pub fn render_str(
        &self,
        template: &str,
        context: &HashMap<String, serde_json::Value>,
    ) -> Result<String, PromptEngineError>;
}
```

Key change: There is no rigid PromptContext struct. The caller assembles a HashMap<String, Value> from whatever
context is relevant to the current execution step. This eliminates the need to define and maintain a struct that
anticipates every possible context shape.
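To illustrate the flat-context calling convention without pulling in MiniJinja, here is a std-only stand-in that only substitutes `{{ key }}` placeholders. The real engine is MiniJinja, with loops, filters, and error reporting that this sketch omits.

```rust
use std::collections::HashMap;

/// Minimal stand-in for render_str: substitute `{{ key }}` placeholders
/// from a flat map. Not the real engine; just the calling convention.
fn render_str(template: &str, context: &HashMap<&str, String>) -> String {
    let mut out = template.to_string();
    for (key, value) in context {
        out = out.replace(&format!("{{{{ {key} }}}}"), value);
    }
    out
}

fn main() {
    // The caller assembles whatever keys the current step needs;
    // no struct anticipates the context shape in advance.
    let mut context = HashMap::new();
    context.insert("entity_name", "Elara".to_string());
    context.insert("page_count", "12".to_string());

    let rendered = render_str(
        "Given the following claims about \"{{ entity_name }}\" across {{ page_count }} pages:",
        &context,
    );
    assert_eq!(rendered, "Given the following claims about \"Elara\" across 12 pages:");
}
```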
### Model-Variant Selection

When a `prompt_template` artifact has multiple rows with different `model_variant` values, the PromptEngine selects the
best match:
1. Exact match on model family (e.g., `"anthropic"` for Claude models).
2. Fall back to `model_variant = NULL` (the default variant).
3. If no default exists, use the first variant by ordinal.
This enables per-model prompt optimization without branching logic in the skill definition.
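The three-step fallback can be sketched directly. `VariantRow` and `select_variant` are illustrative stand-ins for `skill_artifacts` rows sharing a name.

```rust
/// Subset of a skill_artifacts row relevant to variant selection.
struct VariantRow {
    model_variant: Option<&'static str>,
    ordinal: i64,
}

/// Apply the fallback chain: exact family match, then the NULL default,
/// then the lowest-ordinal variant.
fn select_variant<'a>(rows: &'a [VariantRow], model_family: &str) -> Option<&'a VariantRow> {
    rows.iter().find(|r| r.model_variant == Some(model_family))
        .or_else(|| rows.iter().find(|r| r.model_variant.is_none()))
        .or_else(|| rows.iter().min_by_key(|r| r.ordinal))
}

fn main() {
    let rows = vec![
        VariantRow { model_variant: Some("openai"), ordinal: 1 },
        VariantRow { model_variant: Some("anthropic"), ordinal: 2 },
    ];
    // Exact family match takes priority.
    assert_eq!(select_variant(&rows, "anthropic").unwrap().model_variant, Some("anthropic"));
    // No exact match and no NULL default: lowest ordinal wins.
    assert_eq!(select_variant(&rows, "gemini").unwrap().model_variant, Some("openai"));
}
```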
## Execution Traces

Every artifact execution is recorded as a trace entry for debugging, cost tracking, and optimization feedback.
### Trace Schema

```sql
CREATE TABLE skill_execution_traces (
    id TEXT PRIMARY KEY,          -- UUID
    session_id TEXT NOT NULL,     -- Conversation session ID
    skill_id TEXT NOT NULL,       -- Which skill was activated
    artifact_id TEXT NOT NULL,    -- Which specific artifact was executed
    artifact_kind TEXT NOT NULL,  -- Artifact kind for fast filtering

    -- Execution details
    rendered_input TEXT,          -- Rendered template (after variable substitution)
    raw_output TEXT,              -- Raw LLM response or RLM result
    error TEXT,                   -- NULL on success; error message on failure

    -- Timing
    started_at TEXT NOT NULL,
    completed_at TEXT,
    duration_ms INTEGER,

    -- Cost
    input_tokens INTEGER,
    output_tokens INTEGER,
    model_id TEXT,                -- Which model was used
    estimated_cost REAL,          -- Estimated cost in USD

    -- Assertions
    assertions_run INTEGER DEFAULT 0,
    assertions_passed INTEGER DEFAULT 0,

    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE INDEX idx_traces_session_id ON skill_execution_traces(session_id);
CREATE INDEX idx_traces_skill_id ON skill_execution_traces(skill_id);
CREATE INDEX idx_traces_artifact_id ON skill_execution_traces(artifact_id);
```

### Per-Artifact Granularity
Each artifact execution produces its own trace row. A single skill activation that executes 3 artifacts (description +
prompt_template + code_template) produces 3 trace rows, all sharing the same session_id and skill_id but with
different artifact_id values.
This enables:
- Per-artifact cost tracking — which artifacts consume the most tokens?
- Per-artifact timing — which artifacts are bottlenecks?
- Per-artifact assertion results — which artifacts fail validation?
- DSPy optimization feedback — traces feed back into offline prompt optimization.
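As an example of the per-artifact analysis this granularity enables, a token rollup over trace rows might look like the following. `TraceRow` is a hypothetical struct mirroring a subset of `skill_execution_traces`.

```rust
use std::collections::HashMap;

/// Subset of a skill_execution_traces row needed for cost rollups.
struct TraceRow {
    artifact_id: &'static str,
    input_tokens: u64,
    output_tokens: u64,
}

/// Sum token usage per artifact across one session's trace rows,
/// answering "which artifacts consume the most tokens?".
fn tokens_by_artifact(rows: &[TraceRow]) -> HashMap<&'static str, u64> {
    let mut totals = HashMap::new();
    for row in rows {
        *totals.entry(row.artifact_id).or_insert(0) += row.input_tokens + row.output_tokens;
    }
    totals
}

fn main() {
    let rows = vec![
        TraceRow { artifact_id: "check-prompt", input_tokens: 800, output_tokens: 200 },
        TraceRow { artifact_id: "check-prompt", input_tokens: 900, output_tokens: 300 },
        TraceRow { artifact_id: "overview", input_tokens: 150, output_tokens: 0 },
    ];
    let totals = tokens_by_artifact(&rows);
    assert_eq!(totals["check-prompt"], 2200);
    assert_eq!(totals["overview"], 150);
}
```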
## Assertion Framework

Assertions validate artifact outputs at execution time. They are defined per-artifact and checked after each artifact execution.
### Assertion Types

| Type | Check | Example |
|---|---|---|
| Structural | JSON schema, required fields, format constraints | “Output must contain a contradictions array” |
| Semantic | LLM-evaluated quality checks | “Output must reference specific page names from the workspace” |
| Length | Token/character bounds | “Output must be between 100 and 2000 characters” |
| Pattern | Regex match/no-match | “Output must not contain markdown code fences” |
### Assertion Schema

Assertions are stored in the artifact’s `metadata` JSON field:

```json
{
  "assertions": [
    {
      "type": "structural",
      "check": "json_schema",
      "schema": {
        "type": "object",
        "required": ["contradictions"],
        "properties": { "contradictions": { "type": "array" } }
      }
    },
    { "type": "length", "check": "char_range", "min": 100, "max": 5000 },
    {
      "type": "semantic",
      "check": "llm_eval",
      "prompt": "Does this output reference specific page names from the workspace? Answer yes or no.",
      "expected": "yes"
    }
  ]
}
```

### Assertion Trait
Section titled “Assertion Trait”#[async_trait]pub trait Assertion: Send + Sync { /// Human-readable description of what this assertion checks. fn description(&self) -> &str;
/// Evaluate the assertion against an artifact output. /// Returns Ok(()) on pass, Err with explanation on failure. async fn evaluate( &self, output: &ArtifactOutput, context: &ExecutionContext, ) -> Result<(), AssertionFailure>;}
pub struct AssertionFailure { pub assertion_description: String, pub explanation: String, /// Whether this is a hard failure (abort) or soft failure (warn + continue). pub severity: AssertionSeverity,}
#[derive(Debug, Clone, Copy)]pub enum AssertionSeverity { /// Abort skill execution on failure. Hard, /// Log warning but continue execution. Soft,}DSPy Compatibility
The assertion framework produces feedback compatible with DSPy’s offline optimization pipeline:
- Each assertion evaluation produces a binary pass/fail signal.
- Trace rows record `assertions_run` and `assertions_passed` counts.
- Exported traces can be formatted as DSPy evaluation datasets for prompt optimization.
## Skill Authoring

The Skill Composer agent (see Process Model) is the primary interface for creating and modifying skills.
### Authoring Workflow

1. User initiates: “Create a skill that checks timeline consistency.”
2. Orchestrator dispatches Skill Composer with the creation request.
3. Skill Composer generates an initial skill package:
   - `description` artifact (what the skill does)
   - `prompt_template` artifact (the main analysis prompt)
   - Optional `code_template` artifact (batch analysis script)
4. Skill Composer validates the package (schema compliance, template syntax).
5. Orchestrator presents the proposed skill to the user.
6. User provides feedback (“add a dspy_module for optimized execution”).
7. Orchestrator routes back to Skill Composer for iterative refinement.
8. Skill Composer adds/modifies artifacts, re-validates.
9. Repeat until the user approves.
10. Skill is saved to the appropriate scope (workspace or account `agents.db`).
### DSPy Optimization (Skill Composer)

When the Python sidecar is available, the Skill Composer can optimize `prompt_template` artifacts using DSPy:

1. Skill Composer generates a `dspy_module` artifact wrapping the prompt logic.
2. DSPy compiles the module against assertion-based evaluation metrics.
3. Optimized state is stored in the `state_blob` column.
4. The `prompt_template` artifact remains as fallback.
### Manual Import

Users can also create skills by importing JSON packages directly (e.g., shared by another user or exported from a different workspace). The import path validates the package schema before persisting.
## Related Documents

- Process Model — Agent types, especially Skill Composer and Worker
- Agent Core System — Parent system document
- Agent Memory System — Memory matrix and storage architecture
- LLM System — Multi-provider abstraction (PromptEngine uses model routing)