QA Testing System

Linear Epic: INK-358 (Sentinel Conversion)
Status: Playwright Test Agents with Sentinel wrapper
Depends On: HTTP Bridge (apps/http-bridge/), Vite dev server, Playwright

Overview

The QA testing system uses Playwright Test Agents wrapped by @qa-sentinel skills for browser-based testing of the full Inklings UI against the real Rust backend, without requiring a Tauri runtime.

[Figure: Test Pyramid]

A standalone HTTP bridge replaces Tauri’s IPC, and a browser shim (bridge-inject.js) intercepts the frontend’s invoke() calls, routing them to the bridge via HTTP.

Two-Layer Architecture

@qa-sentinel (skill layer): Manages the testing lifecycle — discovery, scaffolding, drift detection
Playwright Test Agents (execution layer): Test planning, code generation, execution, and self-healing

Key principle: AI is used at authoring and maintenance time only. Test execution is npx playwright test — deterministic, fast, no LLM costs.

Architecture

Playwright (headless browser)
└── Vite dev server (localhost:1420)
    └── React frontend + bridge-inject.js shim
        └── HTTP bridge (localhost:9990–9993)
            └── Axum → BridgeState → Use Cases → SQLite

How It Works

Vite serves the React frontend at localhost:1420 (no Tauri)
bridge-inject.js is injected via Playwright’s addInitScript before any page scripts load
The shim replaces window.__TAURI_INTERNALS__.invoke with HTTP fetch() calls to the bridge
The HTTP bridge receives commands at POST /invoke/{command}, executes them against the real SQLite database, and returns JSON
Playwright drives the UI via standard locators and assertions
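
The routing described in the steps above can be sketched in TypeScript. This is a minimal illustration, not the contents of bridge-inject.js: `buildInvokeRequest`, `createInvoke`, and the `FetchLike` type are hypothetical names, and sending the arguments as a JSON request body is an assumption. Only the POST /invoke/{command} route and the JSON response come from the description above.

```typescript
// Hypothetical sketch of the shim's routing; not the real bridge-inject.js.
type InvokeArgs = Record<string, unknown>;

interface BridgeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

interface BridgeResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Minimal stand-in for fetch(), so the sketch does not depend on DOM typings.
type FetchLike = (url: string, init: Omit<BridgeRequest, "url">) => Promise<BridgeResponse>;

// Map a Tauri-style invoke(command, args) call onto POST /invoke/{command}.
function buildInvokeRequest(baseUrl: string, command: string, args: InvokeArgs = {}): BridgeRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/invoke/${encodeURIComponent(command)}`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Assumption: args travel as a JSON object in the request body.
    body: JSON.stringify(args),
  };
}

// Build an invoke() replacement that forwards every call to the bridge.
function createInvoke(baseUrl: string, fetchImpl: FetchLike) {
  return async (command: string, args?: InvokeArgs): Promise<unknown> => {
    const { url, ...init } = buildInvokeRequest(baseUrl, command, args);
    const res = await fetchImpl(url, init);
    if (!res.ok) {
      // For example, a Tier 3 command answering 501.
      throw new Error(`bridge command ${command} failed with HTTP ${res.status}`);
    }
    return res.json();
  };
}
```

In a page context, the function returned by `createInvoke` would be assigned to window.__TAURI_INTERNALS__.invoke, with the browser's own fetch supplied as `fetchImpl`.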

Layer Mapping

Layer	Component	Location
Browser shim	`bridge-inject.js`	`apps/http-bridge/bridge-inject.js`
HTTP server	Axum router + route handlers	`apps/http-bridge/src/`
Application state	`BridgeState` (mirrors Tauri `AppState`)	`apps/http-bridge/src/state.rs`
Route handlers	83 commands across 7 modules	`apps/http-bridge/src/routes/`
NL test specs	Natural language scenario definitions	`tests/e2e/specs/`
Generated tests	Playwright `.spec.ts` files	`tests/e2e/tests/`
Sentinel config	Project + partition config	`tests/e2e/sentinel.config.yaml`

HTTP Bridge

Building and Starting

# Build
cargo build -p http-bridge

# Start with a workspace
cargo run -p http-bridge -- --port 9990 --workspace .data/workspaces/qa-pilot

# Verify
curl -s http://localhost:9990/health  # → "ok"

CLI Flags

Flag	Default	Description
`--port`	9990	HTTP listen port
`--workspace`	`.data/workspaces/default`	Workspace directory (auto-initialized)

Command Tiers

Tier	Count	Description
Tier 1 (Full)	~70	Fully implemented with real backend logic
Tier 2 (Stub)	~10	Return sensible defaults (e.g., `get_settings`)
Tier 3 (501)	~2	Not applicable outside Tauri (`get_sync_status`, `refresh_recent_workspaces_menu`)

Serialization Convention

The bridge uses #[serde(rename_all = "camelCase")] on all Deserialize structs with multi-word fields, matching Tauri’s automatic camelCase ↔ snake_case translation. This is critical — the React frontend sends camelCase JSON.
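
To make the naming rule concrete, the sketch below shows the snake_case to camelCase mapping from the frontend's side. It is an illustration only: `snakeToCamel` and `camelCaseKeys` are hypothetical helpers, and the field names used in the usage note are invented examples, not actual bridge fields.

```typescript
// Convert one snake_case identifier to camelCase, mirroring the rename that
// #[serde(rename_all = "camelCase")] applies to Rust field names.
function snakeToCamel(name: string): string {
  return name.replace(/_([a-z0-9])/g, (_match, ch: string) => ch.toUpperCase());
}

// Rename the top-level keys of an argument object (hypothetical helper).
function camelCaseKeys(args: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(args)) {
    out[snakeToCamel(key)] = args[key];
  }
  return out;
}
```

For a Rust struct with a field named `parent_page_id` (a made-up example), the frontend payload key is `parentPageId`. Without the serde attribute the bridge would look for `parent_page_id` and fail to deserialize the request.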

Running Tests

Prerequisites

Rust toolchain (for bridge binary)
Node.js + pnpm (for Vite dev server)
Playwright installed in tests/e2e/

Quick Start

pnpm test:e2e starts the bridge automatically through Playwright’s webServer config.

# Run all tests (from project root)
pnpm test:e2e

# Or from tests/e2e/
cd tests/e2e && pnpm test

Partitioned Execution

Tests run against 4 separate bridge instances, one per partition, so that partitions do not share data:

Partition	Bridge Port	Test Scope
workspace-pages	9990	`workspace-*.spec.ts`
editor-content	9991	`editor-*.spec.ts`
navigation-search	9992	`navigation-*.spec.ts`
advanced-features	9993	`advanced-*.spec.ts`

Configuration in tests/e2e/playwright.config.ts maps each project to its bridge port and test directory.
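
The partition table can be expressed as data from which Playwright project entries are derived. The following is a sketch under assumptions, not the contents of playwright.config.ts: `Partition`, `PARTITIONS`, and `toProjects` are hypothetical names, the `**/` glob prefix is added so patterns match full file paths, and carrying the bridge URL in each project's `metadata` field is one possible wiring rather than the confirmed one.

```typescript
// Partition table from the section above.
interface Partition {
  name: string;
  bridgePort: number;
  testMatch: string;
}

const PARTITIONS: Partition[] = [
  { name: "workspace-pages", bridgePort: 9990, testMatch: "**/workspace-*.spec.ts" },
  { name: "editor-content", bridgePort: 9991, testMatch: "**/editor-*.spec.ts" },
  { name: "navigation-search", bridgePort: 9992, testMatch: "**/navigation-*.spec.ts" },
  { name: "advanced-features", bridgePort: 9993, testMatch: "**/advanced-*.spec.ts" },
];

// Shape one Playwright project entry per partition. Passing the bridge URL
// through `metadata` is an assumption; the real config may wire it differently.
function toProjects(partitions: Partition[]) {
  return partitions.map((p) => ({
    name: p.name,
    testMatch: p.testMatch,
    metadata: { bridgeUrl: `http://localhost:${p.bridgePort}` },
  }));
}

// In playwright.config.ts this array would be passed as
// defineConfig({ projects: toProjects(PARTITIONS) }).
const projects = toProjects(PARTITIONS);
```

Keeping the table as data means a fifth partition only needs one new entry, and the port assignment stays in a single place.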

Bridge Shim Injection

The shim is injected via Playwright’s addInitScript in the test fixtures. It must run before the React app loads:

// Prepend to bridge-inject.js contents:
window.__BRIDGE_BASE_URL = "http://localhost:9990";

// Then the full bridge-inject.js contents
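
One way to assemble that script text in a fixture is sketched below. `buildInitScript` is a hypothetical helper, and the fixture wiring in the trailing comment is an assumption about how the project calls Playwright; `page.addInitScript` itself does accept an object with a `content` string.

```typescript
// Build the script text handed to Playwright's addInitScript: the base-URL
// assignment first, then the unmodified shim source.
function buildInitScript(bridgeBaseUrl: string, shimSource: string): string {
  // JSON.stringify yields a safely quoted JavaScript string literal.
  const prelude = `window.__BRIDGE_BASE_URL = ${JSON.stringify(bridgeBaseUrl)};`;
  return `${prelude}\n${shimSource}`;
}

// In a fixture (assumed wiring; `shimSource` would be read from
// apps/http-bridge/bridge-inject.js):
//   await page.addInitScript({ content: buildInitScript(url, shimSource) });
```

Because init scripts run before any page script, the base URL is already set when the shim body executes, which is what lets each partition point the same shim at a different bridge port.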

Supabase Stub Environment

The frontend requires SUPABASE_URL and SUPABASE_PUBLISHABLE_KEY at build time. Use stub values:

SUPABASE_URL="http://localhost:54321"
SUPABASE_PUBLISHABLE_KEY="stub-anon-key-for-qa"

The bridge itself falls back to a stub SupabaseConfig when env vars aren’t set and uses in-memory auth.

Sentinel Skills

Three skills manage the testing lifecycle:

Skill	Purpose
`qa-sentinel:setup`	Scaffold project: Playwright agents, NL spec dirs, config
`qa-sentinel:discover`	Critical path + persona analysis → NL test specs
`qa-sentinel:audit`	Drift detection (NL specs ↔ app ↔ tests) → corrections

NL Spec Format

Natural language specs in tests/e2e/specs/ are the durable contract between discovery and test generation. They use YAML frontmatter + markdown with numbered H3 scenarios. See any spec file for the canonical format.

Test Generation Pipeline

NL Specs → Playwright Planner → Test Plans → Playwright Generator → .spec.ts

Generated .spec.ts files are regenerable from specs. NL specs survive Playwright API changes and UI restructures.

Known Bridge Gaps

Gap	Reason	Impact
Tauri events (`emit`/`listen`)	Stubbed — no event bus in bridge	Event-driven UI features won’t fire
Sync lifecycle	No real sync engine in bridge	Sync status always “not connected”
Native dialogs	No Tauri window API	File browse dialogs don’t open
Window menu refresh	No native menu	Recent workspaces menu not updated
React strict mode	Double-invokes effects in dev	First `initialize_workspace` may hit “database is locked” — second succeeds

Adding New Commands

When new Tauri commands are added to apps/desktop/src-tauri/src/commands/:

Add the route handler in the appropriate apps/http-bridge/src/routes/*.rs module
Register the route in apps/http-bridge/src/router.rs
Add #[serde(rename_all = "camelCase")] to any Deserialize structs with multi-word fields
Test via curl -s -X POST localhost:9990/invoke/{command} before browser testing

This manual sync is tracked as a maintenance concern (see INK-234 for macro-based generation).

Tauri-Native Smoke Tests

These scenarios run in the real Tauri app (not the HTTP bridge) to verify behaviors the bridge cannot replicate: event propagation, background workers, and native dialogs.

Prerequisites

Real Tauri app launched via tools/qa/start-tauri-smoke.sh
Safari Web Inspector connected to the Tauri webview

Connecting to the Tauri Webview

On macOS, Tauri uses WKWebView — Chrome DevTools Protocol (CDP) cannot connect to it. Use Safari Web Inspector:

Open Safari
Safari menu → Settings → Advanced → check “Show features for web developers”
Safari menu → Develop → {machine name} → localhost
Use the Web Inspector console for debugging and verification

Smoke Test Scenarios

#	Scenario	Validates	Automation
1	Open workspace → sidebar populates	Event propagation → React state	Partial — verify page count in DB
2	Create page → embedding status	EmbeddingManager signal	Partial — check embedding queue
3	Export pages → native dialog	Native dialog integration	Manual — visual confirmation

INK-358 Sentinel Conversion — Complete

Status: Done (2026-02-24)

All 7 child issues resolved:

INK-405: Foundation specs and HTTP bridge scaffolding ✓
INK-406: Phase 1 tests (6 spec files) ✓
INK-407: Phase 2 tests (4 spec files) ✓
INK-408: Phase 3 tests (4 spec files) ✓
INK-409: Phase 4 cleanup and finalization ✓
INK-410: Sentinel-driven test maintenance ✓
INK-411: Playwright agents and healing ✓

The final test suite contains 301 tests across 17 spec files. Of these, 271 are active and passing, which is 90% of the total and 100% of the runnable tests. The 25 fixme tests are tracked by category, and the 5 skipped tests correspond to unimplemented UI. The suite is production-ready for pre-release validation.

Pre-Release Gate

A quality gate that must pass before any release. Run as part of the release checklist.

Gate Process

# 1. Run full test suite (Playwright handles bridge startup)
pnpm test:e2e

# 2. Review results
#    - All tests must PASS or be in a known fixme/skip state
#    - Zero unexpected failures
#    - Check: test-results/.last-run.json should show "status": "passed"

# 3. Generate HTML report for review (optional)
cd tests/e2e && npx playwright test --reporter=html
# Open playwright-report/index.html
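
The status check in step 2 can be automated with a few lines of TypeScript. This is a sketch: `gatePassed` and the `LastRun` interface are hypothetical, and the file shape is assumed to be a JSON object with a `status` field, as the comment in step 2 implies.

```typescript
// Assumed shape of test-results/.last-run.json; only `status` is relied on.
interface LastRun {
  status?: string;
  failedTests?: string[];
}

// Return true when the recorded run status is "passed".
function gatePassed(lastRunJson: string): boolean {
  const run = JSON.parse(lastRunJson) as LastRun;
  return run.status === "passed";
}
```

A release script would read the file contents (for example with Node's `fs.readFileSync`) and abort the release when `gatePassed` returns false.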

Gate Criteria

Criterion	Threshold	Notes
Pass rate (runnable)	100%	All non-fixme, non-skip tests must pass
No new failures	Zero	Compare against previous run
Fixme tests reviewed	Yes	Ensure no fixme tests should be re-enabled
Bridge health	All 4 UP	All partition bridge instances responding
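
The "All 4 UP" criterion amounts to probing /health on each partition port. The sketch below is not tools/qa/health-check.sh; `healthUrls`, `bridgesDown`, and the `Probe` type are hypothetical, and treating any 2xx response as healthy is an assumption.

```typescript
// Partition bridge ports from the Partitioned Execution table.
const BRIDGE_PORTS = [9990, 9991, 9992, 9993];

// Build the health endpoint URL for each bridge port.
function healthUrls(ports: number[] = BRIDGE_PORTS): string[] {
  return ports.map((port) => `http://localhost:${port}/health`);
}

// Minimal probe signature; the global fetch satisfies it on Node 18+.
type Probe = (url: string) => Promise<{ ok: boolean }>;

// Resolve to the URLs that did not answer with a 2xx status.
// Network errors count as "down" rather than rejecting the whole check.
async function bridgesDown(probe: Probe, ports: number[] = BRIDGE_PORTS): Promise<string[]> {
  const urls = healthUrls(ports);
  const results = await Promise.all(
    urls.map((url) => probe(url).then((r) => r.ok, () => false)),
  );
  return urls.filter((_url, i) => !results[i]);
}
```

The criterion is met when `bridgesDown((url) => fetch(url))` resolves to an empty array.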

Suite Inventory (INK-358 Complete)

Metric	Count
Total tests	301
Active (passing)	271
Fixme (known limitation)	25
Skip (no UI built)	5
Test files	17 spec files
Suite duration	~2.5 min

Fixme Breakdown (25 tests):

Loro CRDT: 12 tests
Import edge cases: 7 tests
Multi-workspace lifecycle: 3 tests
Cloud-only: 2 tests
FTS snippet: 1 test

Skip Breakdown (5 tests):

Page history sub-view unimplemented: 5 tests
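
The inventory figures can be cross-checked with a little arithmetic, which also shows how the 90% figure relates to the 100% runnable threshold in the gate criteria. The numbers are taken from the tables above; the variable names are illustrative.

```typescript
// Counts from the Suite Inventory table.
const inventory = { total: 301, active: 271, fixme: 25, skip: 5 };
// Fixme breakdown: Loro CRDT, import, multi-workspace, cloud-only, FTS snippet.
const fixmeBreakdown = [12, 7, 3, 2, 1];

const fixmeSum = fixmeBreakdown.reduce((a, b) => a + b, 0); // 25
const accounted = inventory.active + inventory.fixme + inventory.skip; // 301

// Share of all tests that pass, rounded to a whole percent.
const overallPassRate = Math.round((inventory.active / inventory.total) * 100); // 90
// Share of runnable (non-fixme, non-skip) tests that pass.
const runnable = inventory.total - inventory.fixme - inventory.skip; // 271
const runnablePassRate = (inventory.active / runnable) * 100; // 100
```

The 90% figure is measured against all 301 tests, while the gate criterion is measured against the 271 runnable ones.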

Test File Distribution (17 spec files):

Spec File	Active	Fixme	Skip	Notes
smoke-test.spec.ts	8	—	—
navigation-command-palette.spec.ts	14	—	—
workspace-trash.spec.ts	13	—	—
navigation-first-launch.spec.ts	12	—	—
navigation-sidebar.spec.ts	11	—	—
advanced-status-displays.spec.ts	31	2	—
workspace-operations.spec.ts	38	3	—
navigation-first-launch-tour.spec.ts	9	—	—	serial
workspace-page-hierarchy.spec.ts	20	—	—
advanced-import.spec.ts	10	7	—	serial
navigation-search.spec.ts	14	1	—	serial
editor-persistence.spec.ts	12	2	—
editor-formatting.spec.ts	20	1	—
editor-wiki-links.spec.ts	16	7	—	serial
workspace-page-crud.spec.ts	25	1	—
editor-page-attributes.spec.ts	17	2	—
TOTAL (16 files listed)	270	26	0	Suite Inventory counts 271 / 25 / 5 across 17 files

When to Run

Before every release build
After significant UI or backend changes
After dependency upgrades (Playwright, Tauri, Loro)
Weekly as a regression check (recommended)

Failure Response

Failure Type	Action
Locator/selector mismatch	Run Playwright Healer or fix manually
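Timing/race condition	Increase the timeout or add an explicit wait (`locator.waitFor()` or a web-first assertion such as `expect(locator).toBeVisible()`; Playwright discourages `page.waitForSelector`)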
App regression	File a Linear issue (Team: Inklings, Label: Type:Bug)
Bridge gap (new)	Mark as `test.fixme()` with explanation
Infrastructure failure	Check bridge health, restart infrastructure

Audit Integration

Run /qa-sentinel:audit periodically to detect:

Spec → Test drift: NL spec scenarios without matching tests
Test → Spec drift: Orphaned tests without NL spec backing
Behavioral drift: Test failures indicating app changes vs stale specs
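
The first two drift categories are set differences between spec scenarios and tests. The sketch below illustrates the idea only: `findDrift` and `DriftReport` are hypothetical, and matching by exact title is an assumption, not a description of how /qa-sentinel:audit works.

```typescript
interface DriftReport {
  missingTests: string[];  // spec scenarios with no matching test
  orphanedTests: string[]; // tests with no backing scenario
}

// Compare scenario titles from NL specs with test titles from .spec.ts files.
function findDrift(specScenarios: string[], testTitles: string[]): DriftReport {
  const specs = new Set(specScenarios);
  const tests = new Set(testTitles);
  return {
    missingTests: specScenarios.filter((s) => !tests.has(s)),
    orphanedTests: testTitles.filter((t) => !specs.has(t)),
  };
}
```

Behavioral drift cannot be found this way, because it depends on test results rather than on names.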

Infrastructure Scripts

Infrastructure startup is handled automatically by Playwright’s webServer config. The precheck.sh script is the canonical backend quality gate.

Script	Purpose
`tools/precheck.sh`	Backend quality gate (clippy, fmt, lint, test)
`tools/qa/stop-infrastructure.sh`	Emergency cleanup of orphaned bridge processes
`tools/qa/health-check.sh`	Verify bridge health (`/health` endpoint)
`tools/qa/start-tauri-smoke.sh`	Start real Tauri app for native smoke tests
`tools/qa/bridge-query.sh`	Test bridge endpoint (curl wrapper)
