LLM System


The LLM system provides a multi-provider abstraction for interacting with large language models. It handles API key management, provider registration, request construction, streaming responses, and rate limit tracking. Built on the rig-core crate for underlying provider implementations.

The system supports cloud providers via BYOK (Bring Your Own Key), where users configure their own API keys and store them securely in the OS keychain; Ollama for local model inference, which needs no API key; and OpenRouter as an aggregator gateway exposing 100+ models from multiple providers behind a single API key or an OAuth PKCE flow.

Five providers are supported: Anthropic (Claude), OpenAI (GPT), xAI (Grok), Ollama (local models), and OpenRouter (aggregator). The cloud providers (Anthropic, OpenAI, xAI, OpenRouter) go through a ProviderRegistry built from keys in the OS keychain; Ollama uses OllamaProvider directly, with no keychain involvement.

OpenRouter aggregates models from many underlying providers behind a single API key. It uses Rig’s native rig::providers::openrouter module, not the OpenAI-compatible wrapper.
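
For orientation, constructing a completion model through the native module looks roughly like this; the exact types and method names follow rig-core’s usual provider pattern and should be checked against the pinned rig version:

```rust
// Hedged sketch: rig providers conventionally expose Client::new(api_key)
// and completion_model(name); verify against the pinned rig-core version.
use rig::providers::openrouter;

fn openrouter_model(api_key: &str) -> openrouter::CompletionModel {
    let client = openrouter::Client::new(api_key);
    client.completion_model("openai/gpt-4o")
}
```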

API keys are stored in the OS keychain (macOS: Security.framework), never in the settings JSON file. The KeyStore trait abstracts key retrieval:

```rust
#[async_trait]
pub trait KeyStore: Send + Sync {
    async fn get_key(&self, provider: ProviderKind) -> Option<String>;
}
```
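
As a concrete example, an in-memory implementation suitable for tests fits in a few lines. This is a minimal sketch, not the crate’s exact code; it assumes ProviderKind derives Eq and Hash:

```rust
use std::collections::HashMap;

use async_trait::async_trait;

/// Minimal sketch of a test-only key store; the real InMemoryKeyStore
/// in key_store.rs may differ in detail.
#[derive(Default)]
pub struct InMemoryKeyStore {
    keys: HashMap<ProviderKind, String>, // assumes ProviderKind: Eq + Hash
}

#[async_trait]
impl KeyStore for InMemoryKeyStore {
    async fn get_key(&self, provider: ProviderKind) -> Option<String> {
        self.keys.get(&provider).cloned()
    }
}
```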

KeychainKeyStore in the Tauri framework layer implements both:

  • application::settings::KeychainStore (sync, for Tauri command operations)
  • infrastructure_llm::KeyStore (async, for provider registry construction)

This dual-trait pattern lets a single concrete adapter serve two distinct layer boundaries.
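
A sketch of that shape, with method bodies elided; KeychainError and the sync trait’s exact signature are assumptions for illustration:

```rust
// One concrete adapter, two layer-boundary traits.
pub struct KeychainKeyStore;

impl application::settings::KeychainStore for KeychainKeyStore {
    // Assumed signature: synchronous keychain write, called from Tauri commands.
    fn set_key(&self, provider: ProviderKind, key: &str) -> Result<(), KeychainError> {
        todo!()
    }
}

#[async_trait]
impl infrastructure_llm::KeyStore for KeychainKeyStore {
    // Async keychain read, used during provider registry construction.
    async fn get_key(&self, provider: ProviderKind) -> Option<String> {
        todo!()
    }
}
```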

The rig-core crate provides provider-specific API adapters. We use it strictly as a thin LLM abstraction (CompletionModel + provider routing) — the agent loop, tool system, and session management are all custom.

When no API key is configured, build_llm_provider() returns a StubLlmProvider that fails every call with LlmError::Provider. This avoids threading Option<Arc<dyn LlmProvider>> through the codebase: the harness always holds a provider, even if it is non-functional.

The StubLlmProvider is feature-gated behind test-support to keep it out of the library’s default surface.
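
Reduced to a single method for illustration, the stub looks like this; the payload of LlmError::Provider is an assumption:

```rust
// Condensed sketch: the real LlmProvider trait also covers streaming.
#[async_trait]
impl LlmProvider for StubLlmProvider {
    async fn complete(&self, _request: LlmRequest) -> Result<LlmResponse, LlmError> {
        // Every call fails; the harness still holds a provider, just a
        // non-functional one.
        Err(LlmError::Provider("no API key configured".into()))
    }
}
```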

Ollama runs locally and requires no API key. The OllamaProvider wraps rig-core’s Ollama adapter and is registered directly without consulting the KeyStore. Key design elements:

  • No keychain involvement: OllamaHttpClient talks directly to the Ollama HTTP API (http://localhost:11434 by default, overridable via AgentSettings.ollama_url)
  • Hardware-aware recommendations: SystemCapabilities::detect() reads system RAM via the sysinfo crate and classifies the machine into a HardwareTier (Low/Medium/High); the static RecommendedModel catalog (6 models) maps tiers to appropriately sized models (see the sketch after this list)
  • Static curated catalog: Model recommendations are compiled into the binary — no network call required to display suggestions
  • Streaming model pulls: Model download uses the Ollama /api/pull endpoint with NDJSON streaming, surfaced through pull_ollama_model Tauri command with progress events
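
The tier classification can be sketched as follows; the RAM thresholds here are illustrative, not the shipped values:

```rust
// Sketch of RAM-based tier detection; thresholds are illustrative.
// Assumes sysinfo >= 0.30, where total_memory() reports bytes.
use sysinfo::System;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HardwareTier {
    Low,
    Medium,
    High,
}

pub struct SystemCapabilities {
    pub total_ram_gb: u64,
    pub tier: HardwareTier,
}

impl SystemCapabilities {
    pub fn detect() -> Self {
        let mut sys = System::new();
        sys.refresh_memory();
        let total_ram_gb = sys.total_memory() / (1024 * 1024 * 1024);
        let tier = match total_ram_gb {
            0..=8 => HardwareTier::Low,
            9..=16 => HardwareTier::Medium,
            _ => HardwareTier::High,
        };
        Self { total_ram_gb, tier }
    }
}
```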

6. OpenRouter — OAuth PKCE or Manual Key


OpenRouter is a model aggregator supporting 100+ models from providers including Anthropic, OpenAI, Google, Meta, and others — all accessible via a single API key. Two authentication paths are supported:

  • OAuth PKCE flow (recommended): The user clicks “Connect with OpenRouter” in the agent settings UI. The app generates a PKCE code verifier/challenge pair (sketched just after this list), opens the OpenRouter authorization URL in the default browser, receives the callback via the inklings:// deep link scheme, and completes the token exchange. The resulting access token is stored in the OS keychain.
  • Manual key entry: Standard set_api_key flow used by all other cloud providers.
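
The verifier/challenge generation in the first path follows RFC 7636’s S256 method. A minimal sketch using the rand, sha2, and base64 crates; the helper name is hypothetical, and the app’s actual implementation lives in openrouter_auth.rs:

```rust
use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine};
use rand::RngCore; // rand 0.8 API assumed
use sha2::{Digest, Sha256};

/// Returns (code_verifier, code_challenge) per RFC 7636 (S256 method).
fn generate_pkce_pair() -> (String, String) {
    // 32 random bytes -> 43-char base64url verifier (within the 43..128 spec range).
    let mut bytes = [0u8; 32];
    rand::thread_rng().fill_bytes(&mut bytes);
    let verifier = URL_SAFE_NO_PAD.encode(bytes);
    // The challenge is the base64url-encoded SHA-256 of the ASCII verifier.
    let challenge = URL_SAFE_NO_PAD.encode(Sha256::digest(verifier.as_bytes()));
    (verifier, challenge)
}
```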

OpenRouter defaults to "openai/gpt-4o" as the initial model. Users can change the model string to any OpenRouter-supported model identifier.

The OAuth PKCE flow is implemented in apps/desktop/src-tauri/src/openrouter_auth.rs (framework layer), not in the infrastructure-llm crate.

1. User saves API key via UI
   → validate_api_key (async, hits provider API)
   → set_api_key (stores in keychain, sets api_key_configured=true)
   OR (OpenRouter only):
   → start_openrouter_auth (PKCE initiation, opens browser)
   → complete_openrouter_auth (token exchange, stores in keychain, sets api_key_configured=true)
2. Agent starts (start_agent command)
   → build_llm_provider() checks the api_key_configured flag
   → If true: ProviderRegistry::from_keys(&key_store)
     → iterates Anthropic/OpenAI/xAI/OpenRouter, querying the keychain for each
     → builds an LlmModel per found key
     → wraps the registry in a RegistryProvider adapter → Arc<dyn LlmProvider>
   → If false: StubLlmProvider → Arc<dyn LlmProvider>
3. Agent loop calls provider.complete() / provider.stream()
   → RegistryProvider delegates to the ProviderRegistry
   → the registry selects a model by provider kind
   → the model calls the rig-core provider API
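
Step 3’s delegation can be pictured like this; the model_for() lookup and a provider field on the request are assumptions:

```rust
// Sketch only: the lookup method and request fields are assumed.
#[async_trait]
impl LlmProvider for RegistryProvider {
    async fn complete(&self, request: LlmRequest) -> Result<LlmResponse, LlmError> {
        let model = self
            .registry
            .model_for(request.provider) // select by provider kind
            .ok_or(LlmError::UnsupportedProvider)?;
        model.complete(request).await
    }
}
```
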
| Module | Responsibility |
| --- | --- |
| provider.rs | ProviderKind enum (type alias for domain::AgentProvider) |
| registry.rs | ProviderRegistry — maps provider kinds to configured LlmModel instances |
| model.rs | LlmModel — wraps rig-core completion model with provider metadata |
| request.rs | LlmRequest, LlmMessage, ToolDefinition, CacheHint |
| response.rs | LlmResponse, ContentBlock, StopReason, TokenUsage |
| streaming.rs | LlmStream and StreamChunk for streaming completions |
| key_store.rs | KeyStore trait + InMemoryKeyStore for testing |
| rate_limit.rs | RateLimitTracker — per-provider rate limit state |
| validation.rs | validate_api_key() — lightweight API call to verify key validity |
| providers/ | Provider-specific adapters (anthropic, openai, xai) + shared common.rs |
| providers/ollama.rs | OllamaProvider — rig-core Ollama adapter for completions |
| providers/openrouter.rs | OpenRouterProvider — wraps rig::providers::openrouter::Client (native Rig module, not OpenAI-compat wrapper) |
| ollama_client.rs | OllamaHttpClient — health checks, model listing, streaming model pulls |
| hardware.rs | SystemCapabilities detection and HardwareTier classification |
| stub.rs | StubLlmProvider — feature-gated no-op provider |

The request model supports CacheHint annotations on messages for providers that support prompt caching (Anthropic). System prompts and tool definitions can be marked as cacheable, reducing token costs for repeated interactions.
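
Illustratively, the shape might look like the following; the actual definitions live in request.rs and may differ:

```rust
// Illustrative only; real field names may differ from request.rs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CacheHint {
    None,
    Cacheable,
}

pub struct LlmMessage {
    pub role: Role,
    pub content: String,
    pub cache_hint: CacheHint,
}

/// Mark a system prompt cacheable so providers with prompt caching
/// (Anthropic) can reuse it across turns.
pub fn cached_system_prompt(prompt: impl Into<String>) -> LlmMessage {
    LlmMessage {
        role: Role::System,
        content: prompt.into(),
        cache_hint: CacheHint::Cacheable,
    }
}
```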

RateLimitTracker maintains per-provider rate limit state extracted from API response headers. When a provider returns rate limit headers, the tracker records:

  • Remaining requests/tokens
  • Reset timestamps

The agent loop can query the tracker before sending requests to avoid hitting limits.
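
A simplified, self-contained version of that bookkeeping; the real tracker in rate_limit.rs records more:

```rust
use std::collections::HashMap;

/// Simplified per-provider rate-limit state; a sketch, not the crate's
/// actual struct.
#[derive(Debug, Clone, Copy)]
pub struct RateLimitState {
    pub remaining_requests: u32,
    pub reset_unix_secs: u64,
}

#[derive(Default)]
pub struct RateLimitTracker {
    state: HashMap<String, RateLimitState>, // keyed by provider kind
}

impl RateLimitTracker {
    /// Record values parsed from response headers (e.g. Anthropic's
    /// anthropic-ratelimit-requests-remaining).
    pub fn record(&mut self, provider: &str, s: RateLimitState) {
        self.state.insert(provider.to_owned(), s);
    }

    /// Queried by the agent loop before sending a request.
    pub fn should_throttle(&self, provider: &str) -> bool {
        self.state
            .get(provider)
            .map_or(false, |s| s.remaining_requests == 0)
    }
}
```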

| Error | Variant | Handling |
| --- | --- | --- |
| Invalid API key | LlmError::AuthFailure | Returned during validation; UI shows error |
| Rate limited | LlmError::RateLimited | Returned during validation or runtime |
| Provider error | LlmError::Provider | Generic provider-specific error |
| Network error | LlmError::Network | Connection failure |
| Unsupported provider | LlmError::UnsupportedProvider | No keys found for any provider |
| System | Relationship |
| --- | --- |
| User Settings | KeychainStore trait for key CRUD; api_key_configured flag; ollama_url for custom endpoint |
| Agent Core | RegistryProvider adapter bridges ProviderRegistry to the LlmProvider trait |
| Agent Harness | build_llm_provider() constructs the provider at harness startup |
| Domain | AgentProvider enum (Anthropic, OpenAi, Xai, Ollama, OpenRouter); OllamaStatus, OllamaModelInfo, PullProgress, RecommendedModel, HardwareTier, ModelCategory, SystemCapabilities in crates/domain/src/ollama.rs |
| OpenRouter OAuth | openrouter_auth.rs in src-tauri handles the PKCE flow; the inklings:// deep link scheme receives the OAuth callback |
| Ollama | OllamaHttpClient for health checks (/api/tags), model listing, and streaming pulls (/api/pull) |
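
For example, the /api/tags health check reduces to a single GET. This sketch calls reqwest directly, whereas the real OllamaHttpClient wraps more state:

```rust
use std::time::Duration;

/// Hedged sketch of an Ollama health check; in the app, base_url defaults
/// to http://localhost:11434 and is overridable via AgentSettings.ollama_url.
pub async fn ollama_is_healthy(base_url: &str) -> bool {
    let client = reqwest::Client::builder()
        .timeout(Duration::from_secs(2))
        .build()
        .expect("reqwest client");
    client
        .get(format!("{base_url}/api/tags"))
        .send()
        .await
        .map_or(false, |resp| resp.status().is_success())
}
```
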
  • Unit tests: InMemoryKeyStore for testing key retrieval without OS keychain
  • Integration tests: Provider construction with mock key stores
  • No live API tests: Validation and provider calls are not tested against real endpoints in CI
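
A typical unit-test shape, assuming InMemoryKeyStore implements Default:

```rust
#[tokio::test]
async fn missing_key_returns_none() {
    // Constructor assumed; see key_store.rs for the real test helper.
    let store = InMemoryKeyStore::default();
    assert!(store.get_key(ProviderKind::Anthropic).await.is_none());
}
```
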
```sh
cargo test -p infrastructure-llm
```

The LLM system is required by both Agent Core and Agent Harness.
