
Model Downloader System

Status: Implemented
File: apps/desktop/src-tauri/src/model_downloader.rs


The model downloader automatically fetches the snowflake-arctic-embed-m-v2.0 ONNX embedding model (~585 MB) from HuggingFace on first workspace open. This is a first-run UX concern: the model is required for semantic search but too large to bundle with the application binary.

The download is gated by the semantic_search_enabled setting (default true). When disabled, the search system falls back to FTS5-only search with no download attempt.

| Property | Value |
| --- | --- |
| Model | snowflake-arctic-embed-m-v2.0 |
| Format | ONNX (FP16) |
| Size | ~585 MB (model.onnx) + ~2 MB (tokenizer.json) |
| Embedding dim | 768 |
| Token context | 512 tokens |
| Files | model.onnx, tokenizer.json |
| Model SHA-256 | f27ab40ab6e230265ba49a202a37f1ad031556256cbbc105d0ca9c0bdc7ec42e |
| Tokenizer SHA-256 | f1cc44ad7faaeec47241864835473fd5403f2da94673f3f764a77ebcb0a803ec |
| Model URL | https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v2.0/resolve/main/onnx/model_fp16.onnx |
| Tokenizer URL | https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v2.0/resolve/main/tokenizer.json |

The download is initiated in two scenarios:

  1. Automatic (workspace open): When the app opens a workspace, the embedding pipeline initialization checks for model files. If missing and semantic_search_enabled is true, the download begins in the background.
  2. Manual (settings toggle): When the user enables semantic search via the SemanticSearchToggle component, the frontend calls triggerModelDownload. The backend no-ops if the model is already present.
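Both scenarios reduce to the same three-way decision. The sketch below uses hypothetical names (`decide`, `DownloadDecision`); the real checks live in the embedding pipeline initialization and the Tauri command handler rather than in one function.

```rust
/// Hypothetical sketch of the download gating logic; illustrative only.
#[derive(Debug, PartialEq)]
enum DownloadDecision {
    Skip,     // semantic_search_enabled is false: FTS5-only, no download
    NoOp,     // model files already on disk
    Download, // start the background download
}

fn decide(semantic_search_enabled: bool, model_files_present: bool) -> DownloadDecision {
    if !semantic_search_enabled {
        DownloadDecision::Skip
    } else if model_files_present {
        DownloadDecision::NoOp
    } else {
        DownloadDecision::Download
    }
}
```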

The semantic_search_enabled field in Settings (schema v11) controls whether semantic search is active:

  • Default: true (opt-out design)
  • When true: Model is downloaded if missing; SearchRouter uses both FTS5 and semantic search with RRF merge
  • When false: No download attempt; search falls back to FTS5-only

Users toggle this via the SemanticSearchToggle in workspace settings, which displays the model size (~585 MB) as informed consent before download.

The tokenizer is downloaded first because it is small (~2 MB), giving the user immediate visual feedback that the download has started before the large model transfer begins.

Download progress is emitted on the embedding-model:download-progress Tauri event channel. Events are throttled to at most once per 256 KB to avoid flooding the event bus.

| Field | Type | Description |
| --- | --- | --- |
| status | string | "downloading", "verifying", "complete", "error", or "cancelled" |
| file | string | Current file: "model.onnx" or "tokenizer.json" |
| bytes_downloaded | u64 | Bytes downloaded so far for the current file |
| total_bytes | u64 | Total bytes from Content-Length (0 if unknown) |
| percent | f64 | Percentage complete (0.0–100.0) |
| bytes_per_second | f64 | Current download speed in bytes per second |
| eta_seconds | f64 | Estimated seconds remaining |
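The 256 KB throttle and the derived fields above can be sketched with plain arithmetic. `ProgressEmitter` and `derive_fields` are illustrative names, not the actual API; the real emitter also attaches `status` and `file` before sending the event.

```rust
/// Emit at most one progress event per 256 KB of new data.
const EMIT_EVERY_BYTES: u64 = 256 * 1024;

struct ProgressEmitter {
    last_emit_at: u64, // bytes_downloaded at the previous emit
}

impl ProgressEmitter {
    fn should_emit(&mut self, bytes_downloaded: u64) -> bool {
        if bytes_downloaded - self.last_emit_at >= EMIT_EVERY_BYTES {
            self.last_emit_at = bytes_downloaded;
            true
        } else {
            false
        }
    }
}

/// Returns (percent, bytes_per_second, eta_seconds).
fn derive_fields(bytes_downloaded: u64, total_bytes: u64, elapsed_secs: f64) -> (f64, f64, f64) {
    let percent = if total_bytes > 0 {
        bytes_downloaded as f64 / total_bytes as f64 * 100.0
    } else {
        0.0 // Content-Length unknown
    };
    let speed = if elapsed_secs > 0.0 { bytes_downloaded as f64 / elapsed_secs } else { 0.0 };
    let eta = if speed > 0.0 && total_bytes > bytes_downloaded {
        (total_bytes - bytes_downloaded) as f64 / speed
    } else {
        0.0
    };
    (percent, speed, eta)
}
```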

The download accepts a CancellationToken (from tokio_util). Cancellation is checked inside the download loop via biased tokio::select! — the cancellation branch has priority over new data chunks.

On cancellation:

  1. A status: "cancelled" event is emitted
  2. The .tmp file is left on disk (not cleaned up by the downloader itself)
  3. ModelDownloadError::Cancelled is returned
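The real loop is async and uses `tokio::select!` with the `biased;` keyword so the token is polled before the stream. As a dependency-free illustration of the contract (a Cancelled error is returned, the .tmp file is left alone), here is a synchronous analogue in which an `AtomicBool` stands in for the `CancellationToken`; the error enum is cut down to the one variant involved.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Debug, PartialEq)]
enum ModelDownloadError {
    Cancelled, // simplified: the real enum has more variants
}

/// Synchronous analogue of the cancellation check inside the download
/// loop. The async original is shaped like:
///   tokio::select! { biased;
///       _ = token.cancelled() => return Err(Cancelled),
///       chunk = stream.next() => { /* write chunk */ } }
fn consume_chunks(
    chunks: impl IntoIterator<Item = Vec<u8>>,
    cancel: &AtomicBool,
) -> Result<u64, ModelDownloadError> {
    let mut written = 0u64;
    for chunk in chunks {
        // Checked before each chunk, mirroring the biased branch priority.
        if cancel.load(Ordering::SeqCst) {
            // Real code emits a status "cancelled" event here; the .tmp
            // file stays on disk for the next attempt to overwrite.
            return Err(ModelDownloadError::Cancelled);
        }
        written += chunk.len() as u64;
    }
    Ok(written)
}
```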

The frontend exposes cancellation via the persistent toast’s cancel button, which calls cancelModelDownload.

SHA-256 verification is streaming — the hash is computed incrementally as bytes are written to the .tmp file. There is no separate verification pass after download completes. This means:

  1. Each chunk is written to disk AND fed to the Sha256 hasher
  2. After the stream completes, hasher.finalize() produces the digest
  3. The digest is compared against the embedded constant
  4. On match: .tmp is atomically renamed to the final filename
  5. On mismatch: .tmp is deleted, ChecksumMismatch error is returned
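Step 1 can be sketched as an `io::Write` adapter that tees every chunk into both sinks. This is an illustrative shape, not the actual implementation; the test uses `Vec<u8>` stand-ins for the file and the hasher (sha2's `Sha256` also implements `io::Write` with its default std feature, so the same adapter pattern applies there).

```rust
use std::io::{self, Write};

/// Write adapter that feeds every chunk to the destination and a hasher,
/// so the digest is ready the moment the stream ends (no re-read pass).
struct TeeWriter<W: Write, H: Write> {
    dest: W,
    hasher: H,
}

impl<W: Write, H: Write> Write for TeeWriter<W, H> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.dest.write(buf)?;
        // Hash exactly the bytes that were persisted.
        self.hasher.write_all(&buf[..n])?;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.dest.flush()
    }
}
```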

If a file already exists on disk, its SHA-256 is verified before skipping. If the existing file’s checksum does not match, it is deleted and re-downloaded.

The ModelDownloadError enum covers all failure modes:

| Variant | Trigger |
| --- | --- |
| Http | Network failure, non-2xx response, stream read error |
| Io | Filesystem errors (create dir, write file, rename) |
| ChecksumMismatch | SHA-256 of downloaded file does not match expected |
| Cancelled | CancellationToken was triggered during download |
| OversizedResponse | Content-Length exceeds 800 MB safety limit |

All error variants emit a status: "error" event before returning, so the frontend can clean up the progress toast and show an error message.

Model files are stored in {storage_dir}/models/, resolved by AppPaths::embedding_model_dir():

| Environment | Model Directory |
| --- | --- |
| Dev | {project}/.data/models/ |
| Production | ~/Library/Application Support/Inklings/models/ (macOS) |

The directory is created on first download via create_dir_all. Two files are expected:

```
models/
├── model.onnx       # ~585 MB, ONNX FP16 embedding model
└── tokenizer.json   # ~2 MB, tokenizer vocabulary
```

model_files_present() checks for both files by existence only (not checksum). The download flow performs full SHA-256 verification separately.
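A minimal sketch of that existence-only check; the function name matches the one above, but the body is an assumption about its shape.

```rust
use std::path::Path;

/// Presence check: existence only, no checksum. Full SHA-256
/// verification happens separately in the download flow.
fn model_files_present(model_dir: &Path) -> bool {
    model_dir.join("model.onnx").is_file() && model_dir.join("tokenizer.json").is_file()
}
```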

The useModelDownload hook lives at apps/desktop/src-react/hooks/useModelDownload.ts and is mounted in App.tsx for the lifetime of the application. It:

  • Listens to embedding-model:download-progress events
  • Manages a persistent toast with progress bar, speed, and ETA
  • Provides a cancel button that calls cancelModelDownload
  • On completion, shows a transient “Semantic search ready” toast
  • On embedding-model:ready event, calls reloadEmbeddingPipeline to hot-swap the search pipeline

The SemanticSearchToggle component lives at apps/desktop/src-react/components/settings/SemanticSearchToggle.tsx and is rendered in workspace settings. It:

  • Reads initial state from getSettings
  • Toggle ON: calls setSemanticSearchEnabled(true) then triggerModelDownload
  • Toggle OFF: calls setSemanticSearchEnabled(false) — no further action
  • Shows model size (~585 MB) in the description as informed consent

Downloads are all-or-nothing: if interrupted, the .tmp file is abandoned and the download restarts from scratch on the next trigger. Resume support (HTTP Range headers) was deliberately omitted; it would add complexity for a one-time download that typically completes in minutes.

The tokenizer (~2 MB) downloads first, giving immediate visual feedback in the progress toast before the long model download begins. This avoids a perceived hang at the start.

A hardcoded 800 MB maximum (MAX_MODEL_SIZE_BYTES) rejects any response larger than expected. This guards against serving a corrupted or unexpected file from the CDN. The current model is ~585 MB, leaving headroom for minor size changes across model versions.
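A sketch of that guard, assuming the check runs against the response's Content-Length before any bytes are streamed; `SizeCheck` is an illustrative stand-in for returning ModelDownloadError::OversizedResponse.

```rust
const MAX_MODEL_SIZE_BYTES: u64 = 800 * 1024 * 1024;

#[derive(Debug, PartialEq)]
enum SizeCheck {
    Ok,
    Oversized, // would map to ModelDownloadError::OversizedResponse
}

/// Reject a response up front when Content-Length exceeds the cap.
/// A missing Content-Length (None) passes this pre-check; the table
/// above notes total_bytes is then reported as 0.
fn check_content_length(content_length: Option<u64>) -> SizeCheck {
    match content_length {
        Some(len) if len > MAX_MODEL_SIZE_BYTES => SizeCheck::Oversized,
        _ => SizeCheck::Ok,
    }
}
```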

Computing the hash during the download (rather than in a separate pass) avoids a second full read of the ~585 MB model file, roughly halving the I/O. The hash is finalized in memory immediately after the stream completes.

tools/dev/download-embedding-model.sh pre-downloads the model for CI and development environments. This avoids the auto-download path when running tests or building in environments without interactive UX.

| File | Role |
| --- | --- |
| apps/desktop/src-tauri/src/model_downloader.rs | Download logic, SHA-256 verification, progress events |
| apps/desktop/src-tauri/src/embedding.rs | EmbeddingTask background worker that uses the downloaded model |
| apps/desktop/src-tauri/src/paths.rs | embedding_model_dir() path resolution (dev vs production) |
| crates/domain/src/settings.rs | semantic_search_enabled setting (schema v11) |
| apps/desktop/src-react/hooks/useModelDownload.ts | Progress toast and cancellation UI |
| apps/desktop/src-react/components/settings/SemanticSearchToggle.tsx | Settings toggle component |
| tools/dev/download-embedding-model.sh | Dev/CI pre-download script |
