Patterns

Production Readiness Review Checklist

Problem

Code review of the semantic search feature (INK-171 through INK-178) surfaced 8 recurring anti-patterns across 13 files. These patterns recur because they represent reasonable-looking code that breaks under production conditions (load, large data, malicious input, concurrent access).

Anti-Patterns and Fixes

1. Unbounded Channels

Anti-pattern: mpsc::unbounded_channel() or mpsc::channel() with .send().await

Why it breaks: Producer outpaces consumer → unbounded memory growth. .send().await blocks the producer (Tauri command) waiting for queue space → UI freeze.

Fix: Bounded channel + try_send() with explicit error handling:

const CHANNEL_CAPACITY: usize = 256;
match tx.try_send(cmd) {
    Ok(()) => {}
    Err(TrySendError::Full(_)) => tracing::warn!("queue full, dropping"),
    Err(TrySendError::Closed(_)) => tracing::warn!("worker dead"),
}

2. Per-Row Transactions

Anti-pattern: Loop calling repo.upsert(id, data) N times (N implicit transactions)

Why it breaks: Each implicit transaction triggers fsync. 1000 embeddings = 1000 fsyncs = 10-60 seconds. One transaction = one fsync = <1 second.

Fix: Provide upsert_batch() that wraps all writes in a single explicit transaction:

fn upsert_batch(&self, entries: &[(Uuid, Vec<f32>)]) -> Result<()> {
    self.db.with_transaction(|tx| {
        for (id, embedding) in entries {
            tx.execute("INSERT OR REPLACE INTO ...", params![id, embedding])?;
        }
        Ok(())
    })
}

Rule: Any repository method called in a loop needs a batch variant.

3. Unbounded Result Sets

Anti-pattern: SELECT * FROM pages WHERE stale = true returning all rows at once

Why it breaks: 10,000 stale pages → 10,000 rows in memory → allocation spike, potential OOM on constrained devices.

Fix: Add LIMIT parameter, have callers drive pagination:

fn get_stale_pages(&self, limit: usize) -> Vec<Uuid>;

// Caller loops until empty
loop {
    let batch = repo.get_stale_pages(100);
    if batch.is_empty() { break; }
    process(batch);
}

4. Blocking Drop

Anti-pattern: Drop impl that calls async shutdown and blocks on it

Why it breaks: Drop runs synchronously. Awaiting a future in Drop requires block_on(), which panics inside a Tokio runtime.

Fix: Two shutdown paths:

Drop → fire-and-forget try_send(Shutdown) (no await)
Explicit shutdown_and_wait() → sends command + awaits JoinHandle

5. Per-Request Service Construction

Anti-pattern: Building SearchRouter on every search command (clone repos, check provider, allocate)

Why it breaks: Unnecessary allocation per request. If provider check is expensive (file I/O, lock acquisition), adds latency to every search.

Fix: Cache in AppState, rebuild only on state change (workspace switch):

pub struct AppState {
    search_router: Mutex<Option<Arc<SearchRouter>>>,
}
// Built once at workspace open, cloned per request

6. Per-Call Configuration

Anti-pattern: Re-configuring tokenizer truncation/padding on every embed_batch() call

Why it breaks: Redundant allocation and configuration. If configuration fails intermittently, produces inconsistent behavior between calls.

Fix: Configure once at construction, store in Mutex:

impl OnnxProvider {
    fn new(model_path: &Path) -> Self {
        let mut tokenizer = Tokenizer::from_file(path)?;
        tokenizer.with_truncation(Some(TruncationParams { max_length: 512, .. }));
        tokenizer.with_padding(Some(PaddingParams { .. }));
        Self { tokenizer: Mutex::new(tokenizer), session }
    }
}

7. Missing Input Validation

Anti-pattern: Passing user query strings directly to embedding provider and database without length checks

Why it breaks: 1MB query string → expensive tokenization → large embedding computation → potential DoS.

Fix: Named constant + early return (not error):

const MAX_QUERY_LENGTH: usize = 10_000;
if query.len() > MAX_QUERY_LENGTH {
    return Ok(Vec::new());  // Silent empty results, not an error
}

Why not an error: Errors expose limits to adversaries. Empty results are indistinguishable from “no matches.”

8. Duplicate Test Mocks

Anti-pattern: Each test module defines its own MockPageRepository with 150+ lines of unimplemented!() methods.

Why it breaks: When PageRepository trait changes (new method), every duplicate mock must be updated. Easy to miss one, causing confusing compile errors.

Fix: Single test_helpers.rs at the crate root with flexible construction:

#[cfg(test)]
pub struct MockPageRepository {
    pub pages: Mutex<Vec<Page>>,
    pub search_results: Vec<SearchResult>,
}

impl MockPageRepository {
    pub fn new() -> Self { /* empty */ }
    pub fn with_pages(pages: Vec<Page>) -> Self { /* for embedding tests */ }
    pub fn with_search_results(results: Vec<SearchResult>) -> Self { /* for search tests */ }
}

Rule: If a mock is used in 2+ test modules, promote it to shared test_helpers.rs.

Quick Review Checklist

Use during code review for any new background system or repository:

#	Check	Severity
1	Channels bounded with `try_send()`	P1
2	Batch writes wrapped in single transaction	P1
3	Large result sets paginated with LIMIT	P2
4	Drop impl is non-blocking	P2
5	Expensive services cached, not rebuilt per request	P2
6	One-time config at init, not per call	P3
7	User input validated with named constants	P2
8	Mocks shared across test modules	P3

References

Commits: d23858b (wave 1), 7c95b0e (wave 2)
Linear issues: INK-205 through INK-222
Files: 13 files across src-tauri/, application/, infrastructure/onnx/, infrastructure/sqlite/

Previous
Partitioned Test Execution with Backend Data Isolation Next
TipTap Extension Option Mutation + No-Op Dispatch for NodeView Re-Render

Was this page helpful?