Agent Skills & Execution Modes
Covers the full skill lifecycle in the agent harness: loading system skills from the embedded registry; FreeForm,
Templated, and Blueprint execution modes; skill parameter passing; result display; catalog listing; invalid skill name
handling; execution errors; capability-gated skills; and the distinction between system skills and user-authored skills.
This spec is P1 because skills are the primary surface through which the agent’s behavior is customized — a broken skill
dispatch silently falls back to general_assistant or errors without the user understanding why their specialized skill
is not active.
Skills are defined using TOML frontmatter and a MiniJinja template body, stored as .skill files. System skills are
embedded at compile time via include_str!() in SkillRegistry::new(). The registry holds 6 system skills:
general_assistant, workspace_researcher, content_editor, proactive_organization_audit,
proactive_consistency_check, and proactive_relationship_discovery. Each skill declares an execution_mode
(FreeForm, Templated, or Blueprint), required_capabilities, and optionally typed parameters. Skills are
selected via the skill parameter on send_agent_message (maps to HarnessConfig::skill_id). The PromptEngine
renders the MiniJinja template with workspace context before the AgentLoop runs.
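To make the file layout concrete, the following TypeScript sketch splits a .skill source into its frontmatter and template body on the --- delimiters. It is a test-side illustration only: the real parsing happens in Rust inside SkillRegistry::new(), and the helper name splitSkillFile, the sample field values, and the assumption that the file begins with a --- line are all hypothetical.

```typescript
// Hypothetical test-side helper; the real parser lives in the Rust SkillRegistry.
interface SkillFileParts {
  frontmatter: string; // raw TOML text between the --- delimiters
  body: string; // MiniJinja template text after the closing delimiter
}

function splitSkillFile(source: string): SkillFileParts {
  const lines = source.split(/\r?\n/);
  // Assumption: the file starts with a --- line that opens the frontmatter.
  if ((lines[0] ?? "").trim() !== "---") {
    throw new Error("skill file must start with a --- delimiter");
  }
  const closeIndex = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (closeIndex === -1) {
    throw new Error("skill file is missing the closing --- delimiter");
  }
  return {
    frontmatter: lines.slice(1, closeIndex).join("\n"),
    body: lines.slice(closeIndex + 1).join("\n").trim(),
  };
}

// Illustrative content only; the field values are made up for this example.
const sampleSkill = [
  "---",
  'id = "example_skill"',
  'execution_mode = "FreeForm"',
  'required_capabilities = ["PagesRead"]',
  "---",
  "You are working in {{ workspace_name }}.",
].join("\n");

const sampleParts = splitSkillFile(sampleSkill);
```

The frontmatter string would still need a TOML parser to become structured data; the sketch stops at the split so that it stays dependency-free.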
Preconditions
- HTTP bridge running on port 9990
- A workspace initialized via initialize_workspace before each scenario
- Agent harness started via start_agent
- Bridge shim injected via playwright.config.ts
Scenarios
Seed: seed.spec.ts
1. Skill loading from registry — all six system skills are available at startup
The SkillRegistry::new() constructor parses all embedded .skill files and caches them in memory. All six system
skills must be present immediately after harness construction, before any user interaction.
Steps:
- Start the agent harness via start_agent.
- Send the message: “What skills do you have available?” (or invoke a bridge endpoint that lists skills if available).
- Observe the response or skill list.
Expected: The agent or API response confirms at least 6 skills are available. The IDs general_assistant,
workspace_researcher, content_editor, proactive_organization_audit, proactive_consistency_check, and
proactive_relationship_discovery are all present. No skill parse errors appear in the logs. Skills with unknown
capability strings are still loadable (unknown capability strings are skipped with a warning, not a fatal error).
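The presence check in this scenario can be expressed as a small helper that reports which system skill IDs are absent from a listing. This is a sketch: the function name missingSystemSkills is hypothetical, and it assumes the test has already extracted the listed IDs as plain strings from the response or bridge endpoint.

```typescript
// The six system skill IDs named in this spec.
const SYSTEM_SKILL_IDS: readonly string[] = [
  "general_assistant",
  "workspace_researcher",
  "content_editor",
  "proactive_organization_audit",
  "proactive_consistency_check",
  "proactive_relationship_discovery",
];

// Returns the system skill IDs absent from a listing; an empty array means all are present.
// Extra IDs (for example user skills) are ignored, matching the "at least 6" expectation.
function missingSystemSkills(listedIds: readonly string[]): string[] {
  const listed = new Set(listedIds);
  return SYSTEM_SKILL_IDS.filter((id) => !listed.has(id));
}
```

Returning the missing IDs rather than a boolean gives a failing test a more useful message.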
2. FreeForm execution mode — agent uses system template as system prompt with full LLM latitude
The FreeFormExecutor passes the skill’s system_template directly to AgentLoop as the system prompt and lets the
LLM respond freely within the tool and turn budget. general_assistant is a FreeForm skill with
execution_mode = "FreeForm".
Steps:
- Send a message to the agent using the general_assistant skill (default, no explicit skill parameter): “Help me brainstorm names for a dragon character.”
- Observe the response.
Expected: The agent responds with creative dragon name suggestions — demonstrating full LLM latitude (FreeForm mode
applies no output template or structural constraints). The agent:session-started Tauri event fires with
skillId: "general_assistant". The response is conversational prose, not a structured template output. Token usage is
reported in the session completion event.
3. FreeForm execution mode — skill system_template is rendered with workspace context
The PromptEngine renders the skill’s system_template as a MiniJinja template with context variables including
workspace_name, available_tools, and unavailable_tools. The rendered prompt must reflect the actual workspace
name.
Steps:
- Create a workspace named “The Thornwood Chronicles”.
- Start the agent and send any message using general_assistant.
- Ask the agent: “What workspace are you working in?”
Expected: The agent responds with “The Thornwood Chronicles” (or references it in context). The MiniJinja
{{ workspace_name }} variable was correctly substituted during prompt rendering. The available tools list is populated
from the registry — no empty tool list.
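If the rendered system prompt can be captured from logs, one way to confirm that substitution happened is to scan it for leftover template delimiters. The sketch below is a plain regex scan, not a MiniJinja parser, and the helper name findUnrenderedPlaceholders is hypothetical.

```typescript
// Finds MiniJinja-style expression ({{ ... }}) or statement ({% ... %}) delimiters
// that survived rendering. A regex scan only; it does not parse templates.
function findUnrenderedPlaceholders(rendered: string): string[] {
  const matches = rendered.match(/\{\{.*?\}\}|\{%.*?%\}/g);
  return matches === null ? [] : Array.from(matches);
}
```

An empty result is necessary but not sufficient: the test should still check that the workspace name itself appears in the prompt or the response.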
4. Templated execution mode — skill requires RLM feature flag; graceful degradation when unavailable
The Templated execution mode requires the rlm Cargo feature to be enabled. When rlm is not compiled in, the
RlmUnavailableExecutor returns an error with a clear message rather than silently falling back to FreeForm.
Steps:
- Attempt to invoke a skill with execution_mode = "Templated" in a build where the rlm feature is not enabled.
- Observe the agent’s response.
Expected: The agent (or harness) surfaces the error: “skill requires Templated execution mode, which needs the ‘rlm’
feature to be enabled”. The stop_reason is RlmUnavailable. The conversation panel shows an error state or the agent
apologizes that the skill is currently unavailable. No silent fallback to FreeForm behavior occurs — the mode mismatch
is reported explicitly.
5. Blueprint execution mode — skill requires RLM feature flag; graceful degradation when unavailable
The Blueprint execution mode also requires the rlm Cargo feature. When unavailable, the same
RlmUnavailableExecutor path applies.
Steps:
- Attempt to invoke a skill with execution_mode = "Blueprint" in a build where the rlm feature is not enabled.
- Observe the agent’s response.
Expected: The harness returns a StopReason::RlmUnavailable error. The error message identifies the skill name and
states that Blueprint mode requires the RLM feature. The response is a clear error, not a garbled partial execution.
This mirrors the Templated scenario — both non-FreeForm modes require the same RLM infrastructure.
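Because the Templated and Blueprint scenarios share one rule, a test can encode it once. The sketch below assumes the test knows whether rlm is compiled in (how it learns that is left open) and that the stop reason reaches the test as the string "RlmUnavailable"; the actual serialized form may differ, and the function name is hypothetical.

```typescript
type ExecutionMode = "FreeForm" | "Templated" | "Blueprint";

// Assumption: the stop reason is exposed as this exact string; adjust to the real payload.
const RLM_UNAVAILABLE = "RlmUnavailable";

// Returns true when the observed stop reason is consistent with the build's rlm flag.
function stopReasonMatchesBuild(
  mode: ExecutionMode,
  rlmEnabled: boolean,
  stopReason: string,
): boolean {
  const needsRlm = mode === "Templated" || mode === "Blueprint";
  if (needsRlm && !rlmEnabled) {
    // Without the feature, only the explicit error is acceptable (no silent FreeForm fallback).
    return stopReason === RLM_UNAVAILABLE;
  }
  // FreeForm, or an rlm-enabled build: the unavailable error must not appear.
  return stopReason !== RLM_UNAVAILABLE;
}
```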
6. Skill parameter passing — parameters declared in frontmatter are substituted in the template
Skills can declare typed parameters in their TOML frontmatter. Required parameters without defaults must be provided
at invocation time. Parameters are injected into the MiniJinja template context.
Steps:
- Define a test skill (or use an existing skill that declares a parameter, such as a search query parameter).
- Invoke the skill via send_agent_message with the skill parameter set and a message that maps to the declared parameter.
- Observe the rendered system prompt (via log inspection or the agent’s response phrasing).
Expected: The agent’s behavior reflects the injected parameter. For example, if a skill declares
name = "query", type = "string", required = true, the template renders the query in the system prompt correctly.
Missing a required parameter without a default results in a template render error
(HarnessError::TemplateRenderFailed). Optional parameters with defaults fall back to their declared default values
when not provided.
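The three outcomes above (provided value, declared default, missing required parameter) can be modeled as follows. This is an illustration of the expected behavior, not the harness code: the SkillParameter shape and resolveParameters are hypothetical, all values are treated as strings, and the thrown Error stands in for HarnessError::TemplateRenderFailed.

```typescript
interface SkillParameter {
  name: string;
  required: boolean;
  default?: string;
}

// Merges provided values with declared defaults; throws when a required
// parameter has neither a provided value nor a default.
function resolveParameters(
  declared: readonly SkillParameter[],
  provided: Readonly<Record<string, string | undefined>>,
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const param of declared) {
    const value = provided[param.name] ?? param.default;
    if (value !== undefined) {
      resolved[param.name] = value;
    } else if (param.required) {
      // Stand-in for HarnessError::TemplateRenderFailed in the real harness.
      throw new Error(`TemplateRenderFailed: missing required parameter "${param.name}"`);
    }
  }
  return resolved;
}

// Example declaration: one required parameter and one optional parameter with a default.
const declaredParams: SkillParameter[] = [
  { name: "query", required: true },
  { name: "limit", required: false, default: "10" },
];
const resolvedParams = resolveParameters(declaredParams, { query: "dragons" });
```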
7. Skill result display — execution result is shown in the conversation panel
The ExecutionResult from a skill run includes output (final text), stop_reason, execution_time_ms, and usage
(token counts). These must be surfaced in the conversation panel in a readable form.
Steps:
- Send a message to the agent using the workspace_researcher skill: “Research all pages that mention kings and kingdoms.”
- Wait for the agent session to complete.
Expected: The conversation panel shows the agent’s research findings (based on ExecutionResult::output). The
agent:session-completed Tauri event fires with stopReason: "end_turn" (or "max_turns" if the turn budget was
exhausted). No raw JSON is shown to the user — the output is rendered as human-readable text. Token usage is logged
internally.
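The two checks in this expectation, an accepted stop reason and no raw JSON in the panel, could be combined in a helper like the one below. It is a heuristic sketch: it assumes the test has the stopReason string and the visible panel text in hand, and the function names are hypothetical.

```typescript
const ACCEPTED_STOP_REASONS: readonly string[] = ["end_turn", "max_turns"];

// Heuristic: text that starts like a JSON object or array and parses as JSON
// is treated as raw JSON leaking into the panel.
function looksLikeRawJson(text: string): boolean {
  const trimmed = text.trim();
  if (!(trimmed.startsWith("{") || trimmed.startsWith("["))) {
    return false;
  }
  try {
    JSON.parse(trimmed);
    return true;
  } catch {
    return false;
  }
}

function isAcceptableCompletion(stopReason: string, panelText: string): boolean {
  return ACCEPTED_STOP_REASONS.indexOf(stopReason) !== -1 && !looksLikeRawJson(panelText);
}
```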
8. Skill catalog listing — all available skills are enumerable
The SkillRegistry::list() method returns metadata for all cached skills. A UI that surfaces skill selection must be
able to enumerate all available skills.
Steps:
- Open the agent panel and navigate to the skill selector (if exposed in the UI) or query the available skills via the agent API.
- Observe the listed skills.
Expected: At least the 6 system skill names appear: General Assistant, Workspace Researcher, Content Editor,
Proactive Organization Audit, Proactive Consistency Check, Proactive Relationship Discovery. Each entry shows the
skill’s name, description, and version from the TOML frontmatter. Skills sourced from SkillSource::System are
visually distinguished from user-created (SkillSource::User) skills if both types are present.
9. Invalid skill name handling — requesting a non-existent skill fails gracefully
When send_agent_message is called with a skill ID that does not exist in the SkillRegistry, the harness must
return a meaningful error rather than proceeding with an empty system prompt.
Steps:
- Send a message to the agent with an invalid skill ID: send_agent_message("Do something", skill: "does_not_exist_skill_xyz").
- Observe the agent panel or API response.
Expected: The harness returns a HarnessError::SkillFetchFailed error (or equivalent). The error message identifies
the unknown skill ID. The conversation panel shows an error state or a message explaining the skill is unavailable. No
session starts with an empty system prompt — the failure is caught before AgentLoop is invoked.
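The difference between "no skill given" and "unknown skill given" is easy to blur in a test, so it helps to spell it out. The sketch below models the described behavior on the test side; resolveSkillId and the result shape are hypothetical, and the error string only imitates HarnessError::SkillFetchFailed.

```typescript
const DEFAULT_SKILL_ID = "general_assistant";

type SkillResolution =
  | { ok: true; skillId: string }
  | { ok: false; error: string };

// No skill given falls back to the default; an unknown explicit ID is an
// error and never falls back.
function resolveSkillId(
  requested: string | undefined,
  knownIds: ReadonlySet<string>,
): SkillResolution {
  const skillId = requested ?? DEFAULT_SKILL_ID;
  if (!knownIds.has(skillId)) {
    return { ok: false, error: `SkillFetchFailed: unknown skill "${skillId}"` };
  }
  return { ok: true, skillId };
}
```

This is why the invalid-name test must pass an explicit ID: omitting the parameter exercises the default path instead.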
10. Skill execution error handling — LLM error during FreeForm execution surfaces cleanly
When the underlying LLM call fails during FreeForm skill execution (e.g., provider not configured, API key expired), the
FreeFormExecutor surfaces the error as a HarnessError::AgentError(AgentError::LlmError(...)).
Steps:
- Configure the agent with a stub LLM provider that always fails (StubLlmProvider with no API key configured).
- Send any message using any skill.
- Observe the conversation panel response.
Expected: The agent panel shows an error state (not a silent hang). The error message indicates the LLM provider is
not configured or the API call failed. The agent:status-changed event fires with status "error". No partial or
garbled response is displayed. The harness remains operable — a subsequent start_agent call (after configuring a valid
provider) can recover the session.
11. Skill with required capabilities — skill is gated by the agent’s permission guard
Skills declare required_capabilities in their TOML frontmatter (e.g., ["PagesRead", "SearchUse"]). When the active
permission guard does not include a required capability, the skill’s tools are marked unavailable and the skill may
decline to execute certain operations.
Steps:
- Start the agent with a restricted permission guard that excludes PagesWrite.
- Invoke the content_editor skill (which requires write capability) with the message: “Edit the introduction of the ‘Dragon Lore’ page to add a new paragraph.”
- Observe the agent’s response.
Expected: The skill loads and runs (the harness does not block skill activation based on capabilities — only
individual tool calls are gated). However, tools that require PagesWrite are marked
[UNAVAILABLE: Requires PagesWrite capability] in the LLM schema. The agent responds that it cannot make the edit due
to permission restrictions. The page content is not modified.
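Tool-level gating, as opposed to skill-level blocking, can be illustrated with a small model of how tool descriptions might be annotated. The ToolSpec shape, the tool names read_page and update_page, and the choice to prefix the description are assumptions made for this sketch; only the marker text comes from the expectation above.

```typescript
interface ToolSpec {
  name: string;
  description: string;
  requires: string[]; // capability names, e.g. "PagesWrite"
}

// Prefixes the description of every tool whose required capabilities are not all granted.
// Tools stay in the list; they are marked, not removed.
function annotateTools(tools: readonly ToolSpec[], granted: ReadonlySet<string>): ToolSpec[] {
  return tools.map((tool) => {
    const missing = tool.requires.filter((cap) => !granted.has(cap));
    if (missing.length === 0) {
      return tool;
    }
    const marker = `[UNAVAILABLE: Requires ${missing.join(", ")} capability]`;
    return { ...tool, description: `${marker} ${tool.description}` };
  });
}

// Hypothetical tools for illustration.
const exampleTools: ToolSpec[] = [
  { name: "read_page", description: "Read a page", requires: ["PagesRead"] },
  { name: "update_page", description: "Edit a page", requires: ["PagesWrite"] },
];
const annotatedTools = annotateTools(exampleTools, new Set(["PagesRead"]));
```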
12. System skills vs user skills — distinction is maintained in storage and listing
SkillSource::System skills are embedded in the binary and never written to the SkillStorageRepository.
SkillSource::User skills are created by users and persisted in agents.db. The listing must correctly report the
source for each skill.
Steps:
- Check that general_assistant is listed as a system skill (source: system).
- Create a user skill via the UI or API with a custom ID (e.g., my_custom_skill) and a FreeForm template.
- List all skills and filter by source.
Expected: general_assistant (and the other 5 system skills) appear with source: "system". The newly created
skill appears with source: "user". System skills and user skills are independently filterable via
list_skills(source: Some(SkillSource::User)). Deleting the user skill via delete_user_skill removes it from the user
list without affecting system skills. System skills cannot be deleted (they are binary-embedded, not stored in the DB).
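The filtering and deletion rules in this scenario can be captured in an in-memory model that a test might use to compute expected listings. The function names filterBySource and deleteSkillModel are hypothetical and are not the harness's list_skills or delete_user_skill; the sketch only encodes the rule that system skills are never deletable.

```typescript
type SkillSource = "system" | "user" | "marketplace";

interface SkillEntry {
  id: string;
  source: SkillSource;
}

// Returns every skill when no source is given, otherwise only the matching source.
function filterBySource(all: readonly SkillEntry[], source?: SkillSource): SkillEntry[] {
  return all.filter((skill) => source === undefined || skill.source === source);
}

// Returns a new list without the given skill; system skills are rejected.
function deleteSkillModel(all: readonly SkillEntry[], id: string): SkillEntry[] {
  const target = all.find((skill) => skill.id === id);
  if (target === undefined) {
    throw new Error(`unknown skill "${id}"`);
  }
  if (target.source === "system") {
    throw new Error(`system skill "${id}" cannot be deleted`);
  }
  return all.filter((skill) => skill.id !== id);
}
```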
Test Data
| Key | Value | Notes |
|---|---|---|
| system_skill_general_assistant | general_assistant | Default skill; FreeForm; no required_capabilities |
| system_skill_workspace_researcher | workspace_researcher | FreeForm; search/read-focused |
| system_skill_content_editor | content_editor | FreeForm; requires PagesRead + PagesWrite for full operation |
| system_skill_proactive_org | proactive_organization_audit | Used by proactive suggestion engine |
| system_skill_proactive_consist | proactive_consistency_check | Used by proactive suggestion engine |
| system_skill_proactive_rel | proactive_relationship_discovery | Used by proactive suggestion engine |
| total_system_skills | 6 | Minimum count enforced by registry tests |
| freeform_executor_default_model | claude-sonnet-4-6 | Default in FreeFormConfig::default() |
| freeform_executor_default_turns | 20 | Default max_turns in FreeFormConfig::default() |
| harness_default_turns | 20 | HarnessConfig::default() max_turns |
| skill_file_format | TOML frontmatter between --- delimiters | id, name, description, version, execution_mode, required_capabilities |
| rlm_feature_flag | rlm (Cargo feature) | Required for Templated and Blueprint modes; absent = RlmUnavailableExecutor |
| stop_reason_rlm_unavailable | RlmUnavailable | Returned when Templated/Blueprint skill runs without rlm feature |
| skill_source_system | system | Binary-embedded; not stored in DB; not deletable |
| skill_source_user | user | Created by user; persisted in agents.db via SqliteSkillRepository |
| skill_source_marketplace | marketplace | Downloaded from marketplace; cached in agents.db |
| invalid_skill_error | HarnessError::SkillFetchFailed | Returned when skill_id not found in registry or DB |
Notes
- Skills are selected via the optional skill parameter of send_agent_message. When None is provided, the harness defaults to "general_assistant" (see send_message() in apps/desktop/src-tauri/src/agent.rs). The test for invalid skill names must pass a non-None skill ID.
- The PromptEngine renders the skill’s system_template using MiniJinja. Template variables include workspace_name, available_tools (list of available tools), and unavailable_tools (list of (tool, reason) pairs). The {% block %} syntax in system skills allows sub-skills or overrides.
- FreeForm skills use FreeFormExecutor, which wraps AgentLoop from agent-core. The FreeFormConfig::default() sets model = "claude-sonnet-4-6" and max_turns = 20. These can be overridden by the HarnessConfig passed to send_message.
- Templated and Blueprint modes are gated behind the rlm Cargo feature. In the current production build of the desktop app, whether rlm is enabled determines whether these modes are functional. E2e tests for Templated/Blueprint scenarios should detect the feature flag state and adjust expectations accordingly.
- The SkillRegistry uses a cache-first lookup: system skills loaded at construction time are never replaced by cloud fetches for the same ID (cloud fetcher is only called on cache miss). This prevents cloud skills from shadowing system skills.
- required_capabilities in the skill frontmatter are parsed as domain::Capability enums (e.g., "PagesRead" → Capability::PagesRead). Unknown capability strings are silently skipped with a warn! log; they do not fail the skill parse. Skills with empty required_capabilities run against any permission guard.
- The HTTP bridge exposes skill management routes including skill catalog listing. Skill listing is also accessible via the agent conversation itself (“what skills do you have?”). Direct DB queries against agents.db’s skills table can verify SkillSource for user/marketplace skills.
- ExecutionTrace (from skill/traces.rs) records the full execution trace for Blueprint mode and is used by DSPy-style assertion checking. FreeForm mode does not populate fuel_consumed (that field is RLM-only). E2e tests for FreeForm can assert fuel_consumed: null in completion events if the event payload exposes this field.
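The cache-first lookup mentioned in the notes can be sketched as a small class in which the fetcher is consulted only on a cache miss. This is a simplified model: the class name CacheFirstRegistry is hypothetical, skills are reduced to template strings, and the fetcher is synchronous here, whereas a real cloud fetch would presumably be asynchronous.

```typescript
type SkillFetcher = (id: string) => string | undefined;

class CacheFirstRegistry {
  private readonly cache = new Map<string, string>();
  private readonly fetcher: SkillFetcher;
  private fetches = 0;

  constructor(systemSkills: ReadonlyMap<string, string>, fetcher: SkillFetcher) {
    this.fetcher = fetcher;
    // System skills are cached at construction, before any lookup happens.
    systemSkills.forEach((template, id) => this.cache.set(id, template));
  }

  get(id: string): string | undefined {
    const cached = this.cache.get(id);
    if (cached !== undefined) {
      return cached; // cache hit: the fetcher is never consulted
    }
    this.fetches += 1;
    const fetched = this.fetcher(id);
    if (fetched !== undefined) {
      this.cache.set(id, fetched);
    }
    return fetched;
  }

  // Number of times the fetcher has been called so far.
  fetchCount(): number {
    return this.fetches;
  }
}
```

Because a system skill ID always hits the cache, a fetcher that would return a different template for the same ID is never given the chance to shadow it.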