Agent Lifecycle

Covers the full agent harness lifecycle: starting the harness, sending messages, interrupt signaling, graceful stop, status polling, session state, single-session constraint, and error paths when the LLM provider is absent or the harness is not running. This spec is P0 because the agent harness is the primary intelligent assistant surface in the product — a regression in start/stop, interrupt, or error handling leaves users unable to use the agent at all, and a double-start or unchecked stop could leave the harness in a corrupt state that requires an app restart.

The harness lifecycle is driven by six Tauri commands: start_agent, stop_agent, send_agent_message, interrupt_agent, get_agent_status, and clear_conversation. All events from the agent loop arrive via the Tauri event bus (agent:status-changed, agent:message-chunk, agent:session-started, agent:session-completed, agent:session-interrupted, agent:tool-call, agent:permission-required). Status is also readable synchronously via get_agent_status, which returns { lifecycle, is_streaming }.
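
The scenarios below drive these commands through the HTTP bridge. As a shared point of reference, here is a minimal TypeScript sketch of a bridge helper and the status shape the assertions assume. The invoke function, the /invoke/<command> route, and the agent-bridge.ts module name are hypothetical and should be adapted to the bridge's real API.

```ts
// agent-bridge.ts (hypothetical helper module; the route shape is an assumption)
export interface AgentStatus {
  // Unit variants serialize as bare strings; the Error variant as { "Error": "<message>" }.
  lifecycle: 'Stopped' | 'Starting' | 'Running' | 'Suspended' | { Error: string };
  is_streaming: boolean;
}

export async function invoke<T = unknown>(
  command: string,
  args: Record<string, unknown> = {},
): Promise<T> {
  const res = await fetch(`http://127.0.0.1:9990/invoke/${command}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(args),
  });
  if (!res.ok) {
    // Propagate the bridge's error text so tests can assert on substrings
    // such as "already running" or "not initialized".
    throw new Error(await res.text());
  }
  return (await res.json()) as T;
}

// Normalizes the lifecycle field so assertions can handle both serialized forms.
export function lifecycleName(status: AgentStatus): string {
  return typeof status.lifecycle === 'string' ? status.lifecycle : 'Error';
}
```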

Preconditions

  • HTTP bridge running on port 9990
  • A workspace initialized via initialize_workspace before each scenario
  • Bridge shim injected via playwright.config.ts
  • LLM provider constraint: scenarios that exercise real agent responses require a pre-configured mock or stub LLM provider (API key or Ollama URL). Scenarios that only exercise lifecycle state (start, stop, status) can rely on the StubLlmProvider path, which activates automatically when no API key is configured and returns a structured LlmError::Provider without crashing the harness. The HTTP bridge exposes the full agent lifecycle API.

Scenarios

Seed: seed.spec.ts

1. Start agent when no workspace is open returns an error

Attempting to start the agent before a workspace is open must be cleanly rejected.

Steps:

  1. Do not initialize a workspace.
  2. Call start_agent via the bridge.
  3. Observe the response.

Expected: The command returns an error with a message indicating the agent manager is not initialized (workspace not open). The error is a structured CommandError::Internal with the message “Agent manager not initialized for current workspace”. No crash occurs. get_agent_status subsequently returns { lifecycle: "Stopped", is_streaming: false }.
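
A minimal Playwright sketch of this scenario, assuming the hypothetical invoke helper sketched above; only an error-message substring is asserted.

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('start_agent without a workspace is rejected', async () => {
  // No initialize_workspace call in this test.
  const err = await invoke('start_agent').then(() => null, (e: Error) => e);
  expect(err, 'start_agent should fail without a workspace').not.toBeNull();
  expect(String(err)).toContain('not initialized');

  const status = await invoke<AgentStatus>('get_agent_status');
  expect(status).toEqual({ lifecycle: 'Stopped', is_streaming: false });
});
```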

2. Start agent with no LLM provider configured — stub fallback

When the user has not configured an API key, start_agent succeeds but installs a StubLlmProvider. The harness transitions to the Running state.

Steps:

  1. Initialize a workspace.
  2. Ensure no agent API key is configured (default fresh workspace state).
  3. Call start_agent via the bridge.
  4. Observe the response and then call get_agent_status.

Expected: start_agent returns Ok(()) (not an error). get_agent_status returns { lifecycle: "Running", is_streaming: false }. The agent panel’s status dot would show green. No error message is shown yet — the stub failure surfaces only when a message is sent.
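
A possible Playwright sketch, again assuming the hypothetical agent-bridge helper; the initialize_workspace arguments are left to the suite's seed fixture.

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('start_agent falls back to the stub provider', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture
  // Fresh workspace: no API key configured, so the StubLlmProvider is installed.
  await invoke('start_agent');

  const status = await invoke<AgentStatus>('get_agent_status');
  expect(status).toEqual({ lifecycle: 'Running', is_streaming: false });
});
```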

3. Start agent when already running returns AlreadyRunning error

The harness enforces a single-session constraint — only one harness instance may be active per workspace.

Steps:

  1. Initialize a workspace.
  2. Call start_agent to start the harness (succeeds).
  3. Call start_agent a second time immediately.
  4. Observe the second response.

Expected: The second start_agent call returns an error containing “already running”. The harness state remains Running. get_agent_status returns { lifecycle: "Running", is_streaming: false }. The harness is not in a corrupted or double-initialized state.
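
A sketch of the double-start check, under the same hypothetical helper assumptions:

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('second start_agent is rejected as already running', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture
  await invoke('start_agent');

  const err = await invoke('start_agent').then(() => null, (e: Error) => e);
  expect(String(err)).toContain('already running');

  // The first harness instance is untouched.
  const status = await invoke<AgentStatus>('get_agent_status');
  expect(status).toEqual({ lifecycle: 'Running', is_streaming: false });
});
```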

4. Get agent status — stopped state

Before the agent is started, get_agent_status reports Stopped.

Steps:

  1. Initialize a workspace.
  2. Do not call start_agent.
  3. Call get_agent_status via the bridge.

Expected: The response is { lifecycle: "Stopped", is_streaming: false }. This is the default state immediately after workspace initialization.

5. Get agent status — running state

After a successful start_agent, get_agent_status reports Running.

Steps:

  1. Initialize a workspace.
  2. Call start_agent.
  3. Call get_agent_status.

Expected: The response is { lifecycle: "Running", is_streaming: false }. The is_streaming field is false because no message has been sent yet.
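
Scenarios 4 and 5 can share one sketch that polls the status before and after start_agent, under the same hypothetical helper assumptions:

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('get_agent_status reports Stopped before start and Running after', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture

  const before = await invoke<AgentStatus>('get_agent_status');
  expect(before).toEqual({ lifecycle: 'Stopped', is_streaming: false });

  await invoke('start_agent');

  const after = await invoke<AgentStatus>('get_agent_status');
  expect(after).toEqual({ lifecycle: 'Running', is_streaming: false });
});
```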

6. Send message to agent that is not running returns an error

Sending a message when the harness is not running must be rejected, not silently dropped.

Steps:

  1. Initialize a workspace.
  2. Do not call start_agent.
  3. Call send_agent_message with message: "Hello".
  4. Observe the response.

Expected: The command returns an error with a message indicating “Agent harness is not running” or “not initialized”. No crash occurs. The agent store’s message list is unaffected (no user message should be added on a command error at the bridge level). get_agent_status remains { lifecycle: "Stopped", is_streaming: false }.
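
A sketch of the rejection path, same hypothetical helper assumptions:

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('send_agent_message without a running harness is rejected', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture

  const err = await invoke('send_agent_message', { message: 'Hello' })
    .then(() => null, (e: Error) => e);
  expect(String(err)).toMatch(/not running|not initialized/);

  const status = await invoke<AgentStatus>('get_agent_status');
  expect(status).toEqual({ lifecycle: 'Stopped', is_streaming: false });
});
```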

7. Send message to running agent with stub provider — provider error surfaced

When the agent is running with a StubLlmProvider (no API key configured), sending a message triggers the stub, which returns a structured LlmError::Provider("not configured"). The agent loop surfaces this as a session error event.

Steps:

  1. Initialize a workspace with no LLM provider configured.
  2. Call start_agent (succeeds, installs stub).
  3. Call send_agent_message with message: "What is in my workspace?".
  4. Observe the events emitted on the Tauri event bus (captured via the bridge’s SSE or polling endpoint, if available) and the final get_agent_status response.

Expected: The send_agent_message command returns Ok(()) immediately (it spawns the loop in a background task). The agent loop then encounters the stub provider error and emits an agent:status-changed event with an Error(...) variant or an agent:error event. get_agent_status may subsequently report { lifecycle: { "Error": "<message>" }, is_streaming: false } or { lifecycle: "Running", is_streaming: false }, depending on whether the harness resets after a single-turn error. No crash occurs. The conversation panel would show a system message like “Error: LLM provider not configured.”
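
Event capture depends on what the bridge exposes, so this sketch asserts only on status polling and tolerates both post-error outcomes; invoke and lifecycleName are the hypothetical helpers sketched earlier.

```ts
import { test, expect } from '@playwright/test';
import { invoke, lifecycleName, type AgentStatus } from './agent-bridge'; // hypothetical helpers

test('stub provider error does not crash the harness', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture
  await invoke('start_agent'); // stub provider installed

  // Fire-and-forget: the command itself succeeds even though the stub will fail.
  await invoke('send_agent_message', { message: 'What is in my workspace?' });

  // Give the background loop a moment to hit the provider error.
  await new Promise((r) => setTimeout(r, 500));

  const status = await invoke<AgentStatus>('get_agent_status');
  // Either the harness stays Running or it reports the Error variant; both are acceptable here.
  expect(['Running', 'Error']).toContain(lifecycleName(status));
  expect(status.is_streaming).toBe(false);
});
```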

8. Stop agent while idle — graceful stop

Stopping a running but idle agent transitions it to Stopped.

Steps:

  1. Initialize a workspace.
  2. Call start_agent.
  3. Verify status is Running via get_agent_status.
  4. Call stop_agent.
  5. Call get_agent_status again.

Expected: stop_agent returns Ok(()). get_agent_status returns { lifecycle: "Stopped", is_streaming: false }. The harness is cleanly removed from the manager. A subsequent start_agent call succeeds (not rejected as “already running”).
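
A sketch of the idle-stop flow, under the same hypothetical helper assumptions:

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('stop_agent cleanly stops an idle harness', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture
  await invoke('start_agent');
  await invoke('stop_agent');

  const status = await invoke<AgentStatus>('get_agent_status');
  expect(status).toEqual({ lifecycle: 'Stopped', is_streaming: false });

  // The manager slot was cleared, so a restart is accepted.
  await invoke('start_agent');
});
```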

9. Stop agent when not running is a no-op

Calling stop_agent when no harness is active must not produce an error.

Steps:

  1. Initialize a workspace.
  2. Do not call start_agent.
  3. Call stop_agent.
  4. Observe the response and then call get_agent_status.

Expected: stop_agent returns Ok(()). get_agent_status returns { lifecycle: "Stopped", is_streaming: false }. No error is returned for stopping an already-stopped harness.
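
A sketch of the no-op stop, same assumptions:

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('stop_agent on a stopped harness is a no-op', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture

  // Must resolve rather than reject even though nothing is running.
  await invoke('stop_agent');

  const status = await invoke<AgentStatus>('get_agent_status');
  expect(status).toEqual({ lifecycle: 'Stopped', is_streaming: false });
});
```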

10. Interrupt agent while idle is a no-op

Calling interrupt_agent when no session is actively processing must not error.

Steps:

  1. Initialize a workspace.
  2. Call start_agent.
  3. Call interrupt_agent (no message has been sent — nothing is streaming).
  4. Observe the response and then call get_agent_status.

Expected: interrupt_agent returns Ok(()). get_agent_status returns { lifecycle: "Running", is_streaming: false }. The interrupt was silently ignored because no session was running. The harness remains usable for a subsequent send_agent_message.
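
A sketch of the idle interrupt, same assumptions:

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('interrupt_agent while idle is silently ignored', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture
  await invoke('start_agent');

  // Nothing is streaming, so the interrupt flag is set and never observed.
  await invoke('interrupt_agent');

  const status = await invoke<AgentStatus>('get_agent_status');
  expect(status).toEqual({ lifecycle: 'Running', is_streaming: false });
});
```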

11. Interrupt agent while streaming — session stopped gracefully

Sending an interrupt signal while the agent loop is actively processing a message causes the loop to stop at its next checkpoint.

Steps:

  1. Initialize a workspace with a mock LLM provider configured to stream a slow multi-chunk response (or use pre-recorded fixtures).
  2. Call start_agent.
  3. Call send_agent_message with a message that triggers a multi-turn response.
  4. Immediately call interrupt_agent before the response completes.
  5. Observe the Tauri events emitted and the final status.

Expected: After interrupt_agent, the agent loop checks the InterruptSignal at its next checkpoint and emits agent:session-interrupted. The agentStore finalizes the last streaming message (removing the blinking cursor) and adds a system message “Session interrupted.” get_agent_status returns { lifecycle: "Running", is_streaming: false } — the harness itself remains running; only the session was interrupted. A subsequent send_agent_message can start a new session.

Note: This scenario requires either a mock LLM provider with configurable latency or a pre-recorded fixture SSE stream. Without a real or mocked streaming provider, the agent loop may complete before interrupt_agent can be dispatched.
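
Assuming such a slow mock provider is wired in, the timing-sensitive part can be sketched with Playwright's expect.poll; the event-level assertion is left as a comment because it depends on the bridge's event capture.

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('interrupt_agent stops an in-flight session', async () => {
  // Assumes a mock LLM provider that streams slowly enough to be interrupted.
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture
  await invoke('start_agent');

  await invoke('send_agent_message', { message: 'Summarize every file in the workspace' });
  await invoke('interrupt_agent');

  // Poll until the loop has observed the interrupt at its next checkpoint.
  await expect
    .poll(async () => (await invoke<AgentStatus>('get_agent_status')).is_streaming)
    .toBe(false);

  // The harness itself survives the interrupt; only the session ended.
  const status = await invoke<AgentStatus>('get_agent_status');
  expect(status.lifecycle).toBe('Running');
  // Asserting agent:session-interrupted requires the bridge's event capture, if available.
});
```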

12. Start, stop, restart cycle — harness is reusable

The harness can be started, stopped, and started again cleanly.

Steps:

  1. Initialize a workspace.
  2. Call start_agent. Verify status is Running.
  3. Call stop_agent. Verify status is Stopped.
  4. Call start_agent again. Verify status is Running.
  5. Call get_agent_status.

Expected: All three calls succeed. After the second start_agent, get_agent_status returns { lifecycle: "Running", is_streaming: false }. No “already running” error occurs because stop_agent cleared the previous harness instance.
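
A sketch of the restart cycle, same hypothetical helper assumptions:

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('start / stop / start cycle keeps the harness reusable', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture

  await invoke('start_agent');
  expect((await invoke<AgentStatus>('get_agent_status')).lifecycle).toBe('Running');

  await invoke('stop_agent');
  expect((await invoke<AgentStatus>('get_agent_status')).lifecycle).toBe('Stopped');

  await invoke('start_agent'); // must not fail with "already running"
  expect(await invoke<AgentStatus>('get_agent_status'))
    .toEqual({ lifecycle: 'Running', is_streaming: false });
});
```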

13. Get agent status during streaming — is_streaming flag set

While the agent loop is actively processing a message, get_agent_status reports is_streaming: true.

Steps:

  1. Initialize a workspace with a mock LLM provider that streams a response with deliberate pauses.
  2. Call start_agent.
  3. Call send_agent_message.
  4. Immediately (within the same tick or via a polling loop) call get_agent_status.
  5. Wait for agent:session-completed and call get_agent_status again.

Expected: During active streaming, get_agent_status returns { lifecycle: "Running", is_streaming: true } (the lifecycle may instead read "Suspended" while the session is in progress). After the session completes, it returns { lifecycle: "Running", is_streaming: false }. The is_streaming atomic flag correctly reflects the agent loop’s activity.

Note: Requires mock streaming provider.
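
With a slow mock provider in place, the flag can be sampled with expect.poll; a sketch under the same hypothetical helper assumptions:

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('is_streaming is set while a session is in flight', async () => {
  // Assumes a mock LLM provider with deliberate pauses between chunks.
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture
  await invoke('start_agent');
  await invoke('send_agent_message', { message: 'Stream something slowly' });

  // Catch the flag while the loop is still processing.
  await expect
    .poll(async () => (await invoke<AgentStatus>('get_agent_status')).is_streaming)
    .toBe(true);

  // And confirm it clears once the session completes.
  await expect
    .poll(async () => (await invoke<AgentStatus>('get_agent_status')).is_streaming)
    .toBe(false);
});
```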

14. Clear conversation — always succeeds

clear_conversation is a no-op until persistent history is implemented, but it must succeed and not error.

Steps:

  1. Initialize a workspace.
  2. Call start_agent.
  3. Call clear_conversation.
  4. Observe the response.

Expected: clear_conversation returns Ok(()). No error occurs. The agent store’s message list is cleared by the frontend caller (not by the backend command). get_agent_status is unaffected.
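
A sketch under the same assumptions; it can only assert that the command resolves and the status is untouched, since the backend side is currently a no-op:

```ts
import { test, expect } from '@playwright/test';
import { invoke, type AgentStatus } from './agent-bridge'; // hypothetical helper module

test('clear_conversation always succeeds', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture
  await invoke('start_agent');

  const before = await invoke<AgentStatus>('get_agent_status');
  await invoke('clear_conversation'); // backend no-op today; must still resolve
  const after = await invoke<AgentStatus>('get_agent_status');

  expect(after).toEqual(before); // status is unaffected
});
```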

15. Get conversation history — returns empty list (no persistence yet)

get_conversation_history currently returns an empty list because the agent loop maintains in-memory state only.

Steps:

  1. Initialize a workspace.
  2. Call start_agent.
  3. Call get_conversation_history.

Expected: The response is Ok([]) — an empty array. No error occurs. The command is safe to call even with no prior conversation. This is the documented behavior pending persistence implementation.
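
A sketch under the same assumptions:

```ts
import { test, expect } from '@playwright/test';
import { invoke } from './agent-bridge'; // hypothetical helper module

test('get_conversation_history returns an empty list', async () => {
  await invoke('initialize_workspace'); // workspace args omitted; use the suite's seed fixture
  await invoke('start_agent');

  const history = await invoke<unknown[]>('get_conversation_history');
  expect(history).toEqual([]); // documented behavior until persistence lands
});
```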

Test Data

Key | Value | Notes
lifecycle_stopped | "Stopped" | Default status before start_agent
lifecycle_starting | "Starting" | Transient state during harness initialization
lifecycle_running | "Running" | Ready and idle after successful start
lifecycle_suspended | "Suspended" | Harness has a session in progress (streamed from harness internals)
lifecycle_error | { "Error": "<message>" } | Harness encountered a fatal error; message is the inner string
already_running_error | "already running" | Substring in error message from second start_agent call
not_running_error | "not running" or "not initialized" | Substring in error message from send_agent_message without prior start
stub_provider_signal | no API key, api_key_configured: false | Triggers StubLlmProvider path in build_llm_provider
default_model | "claude-sonnet-4-6" | Default model string in AgentConfig::default()
interrupt_event | "agent:session-interrupted" | Tauri event emitted when InterruptSignal fires
session_completed_event | "agent:session-completed" | Tauri event emitted on normal loop termination

Notes

  • HTTP bridge agent routes: The HTTP bridge exposes the agent lifecycle commands (start_agent, stop_agent, send_agent_message, interrupt_agent, get_agent_status, clear_conversation, get_conversation_history). All scenarios in this spec are exercisable via the bridge.
  • Single-session constraint: The AgentHarnessManager enforces one harness per workspace session. The RwLock<Option<AgentHarness>> is Some when running and None when stopped. start_agent returns AlreadyRunning if Some. There is no concept of concurrent agent sessions or session IDs at this layer.
  • send_agent_message is fire-and-forget: The Tauri command spawns the agent loop in a background task and returns Ok(()) immediately. All results arrive as Tauri events (agent:message-chunk, agent:session-completed, agent:session-interrupted, etc.). Tests of the full message-send flow that depend on real event streaming may still require Tauri integration or a mock event injection mechanism.
  • AgentLifecycleStatus serialization: The enum is #[derive(Serialize, Deserialize)]. Unit variants (Stopped, Starting, Running, Suspended) serialize as bare JSON strings. The Error(String) tuple variant serializes as { "Error": "<message>" }. Test assertions must handle both forms.
  • Interrupt is non-blocking: interrupt_agent calls manager.interrupt(), which uses try_read() on the harness lock and stores true into an AtomicBool. The command never blocks. The agent loop checks the flag at the top of each turn. If the loop completes between the flag set and the next checkpoint, the interrupt is silently ignored (no event emitted).
  • get_conversation_history returns [] until persistence is implemented: The command body is Ok(vec![]). Any test that asserts non-empty history after a conversation will fail. This is a known limitation tracked in the codebase comments.
  • clear_conversation is currently a no-op on the backend: The frontend clearMessages() action clears the in-memory Zustand store. The backend command logs a debug message and returns Ok(()). Tests should verify the command succeeds but cannot assert backend state changes until persistence is implemented.
