Attachments — File Upload, Storage & Lifecycle
Covers the complete attachment lifecycle: uploading files by path and by raw bytes, retrieving metadata and file content, listing attachments scoped to a page, deleting attachments, content-addressable deduplication via SHA-256, extension allowlist enforcement, storage quota limits, and cleanup of on-disk files and image variant caches. This spec is P2 because attachments are the primary mechanism for embedding non-text content into a workspace — breakage here prevents users from attaching images, PDFs, and documents to their pages.
The attachment system enforces an allowlist of file extensions at the domain layer (ALLOWED_EXTENSIONS), a 10 MB
per-file quota and 100 MB workspace quota at the use-case layer, and content-addressable deduplication so identical
files share one on-disk copy. Files are stored at {workspace}/attachments/{uuid}.{ext}. Metadata is persisted in
SQLite. Sync state begins as local_only and is updated asynchronously.
Preconditions
- HTTP bridge running on port 9990
- A workspace initialized via initialize_workspace before each scenario
- Bridge shim injected via playwright.config.ts
- The HTTP bridge exposes all attachment routes: upload_attachment, upload_attachment_bytes, get_attachment, get_attachment_file, list_attachments, and delete_attachment. All scenarios in this spec are exercisable via the bridge.
Scenarios
Seed: seed.spec.ts
1. Upload an attachment by file path
upload_attachment reads a file from a given path, computes its SHA-256 hash, writes it to the workspace attachments
directory, and returns the attachment metadata.
Steps:
- Prepare a small PNG file on disk at a known path (e.g., a 1 KB test image).
- Create a page titled “Image Owner”.
- Call upload_attachment with the file path pointing to the PNG.
Expected: The response is an Attachment object with original_filename: "test.png", file_extension: "png",
content_type matching PNG, size_bytes > 0, a non-nil id (UUID), and sync_status: "local_only". A file exists at
{workspace}/attachments/{id}.png on disk.
2. Upload an attachment via raw bytes
upload_attachment_bytes accepts raw file content directly, bypassing the filesystem path. Used for clipboard paste and
drag-drop.
Steps:
- Create a page titled “Bytes Upload”.
- Call upload_attachment_bytes with file_name: "pasted.png" and a non-empty byte array representing a minimal PNG.
Expected: The response is an Attachment object with original_filename: "pasted.png", file_extension: "png",
and size_bytes equal to the byte array length. The attachment file exists on disk.
3. Upload with empty file name is rejected
Providing an empty file_name is caught as a validation error before any disk I/O.
Steps:
- Call upload_attachment_bytes with file_name: "" and non-empty bytes.
Expected: A validation error is returned (no attachment is created). The error message indicates the file name cannot be empty.
4. Upload with empty bytes is rejected
Providing a zero-length byte array is caught as a validation error.
Steps:
- Call upload_attachment_bytes with file_name: "empty.png" and an empty byte array ([]).
Expected: A validation error is returned. The error message indicates the file must be non-empty (or “greater than zero” for size). No attachment metadata is saved.
5. Upload with a disallowed file extension is rejected
The domain allowlist (ALLOWED_EXTENSIONS) rejects unknown file types at validation time.
Steps:
- Call upload_attachment_bytes with file_name: "virus.exe" and non-empty bytes.
Expected: A validation error is returned. The error message indicates the .exe extension is not allowed. No file
is written to disk and no metadata is saved.
6. Upload with an allowed extension succeeds
All extensions in the domain allowlist are accepted.
Steps:
- For each of the following extensions: png, jpg, pdf, docx, csv, upload a minimal byte array with file_name matching the extension (e.g., "doc.pdf", "data.csv").
Expected: Each upload returns an Attachment object with the correct file_extension. No validation errors occur.
At least png, pdf, and csv succeed.
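The rejection rules in scenarios 3 through 6 can be modeled with a small pure function. This is a sketch only: validateUpload, the error strings, and the order of the checks are assumptions for illustration, and the ALLOWED_EXTENSIONS set below lists only the extensions that appear in this spec, not the real domain allowlist.

```typescript
// Hypothetical model of the upload validation exercised by scenarios 3-6.
// Only extensions named in this spec are listed; the real allowlist may differ.
const ALLOWED_EXTENSIONS: ReadonlySet<string> = new Set([
  "png", "jpg", "pdf", "docx", "csv", "txt",
]);

type ValidationResult =
  | { ok: true; extension: string }
  | { ok: false; error: string };

function validateUpload(fileName: string, bytes: Uint8Array): ValidationResult {
  // Scenario 3: an empty file name is rejected before any disk I/O.
  if (fileName.trim().length === 0) {
    return { ok: false, error: "file name cannot be empty" };
  }
  // Scenario 4: a zero-length payload is rejected.
  if (bytes.length === 0) {
    return { ok: false, error: "file size must be greater than zero" };
  }
  // Scenarios 5 and 6: the extension must be on the allowlist.
  const dot = fileName.lastIndexOf(".");
  const extension = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (!ALLOWED_EXTENSIONS.has(extension)) {
    return { ok: false, error: `extension .${extension} is not allowed` };
  }
  return { ok: true, extension };
}
```

A test can assert on the shape of the result rather than on exact wording, since the real error messages may be phrased differently.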
7. SHA-256 deduplication — same content returns the same attachment
Uploading the same file content twice does not create a second on-disk copy. The existing attachment is returned.
Steps:
- Call upload_attachment_bytes with file_name: "first.png" and a fixed byte array (e.g., [0x89, 0x50, 0x4E, 0x47, ...]).
- Record the returned id from the first upload.
- Call upload_attachment_bytes again with file_name: "second.png" and the same byte array.
- Record the returned id from the second upload.
Expected: Both calls succeed and return the same id. Only one file exists at {workspace}/attachments/{id}.png.
The original_filename field reflects whichever upload was first (not “second.png”). This confirms content-addressable
deduplication is working.
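The expected behavior can be illustrated with an in-memory model of a hash-keyed store. This is a minimal sketch, not the application's storage code: InMemoryAttachmentStore and its field names are hypothetical, and it uses Node's built-in crypto module to compute the digest.

```typescript
import { createHash, randomUUID } from "node:crypto";

// Hypothetical in-memory stand-in for the attachment store.
interface StoredAttachment {
  id: string;
  originalFilename: string;
  contentHash: string;
}

class InMemoryAttachmentStore {
  // One entry per distinct content hash models "one on-disk copy".
  private byHash = new Map<string, StoredAttachment>();

  upload(fileName: string, bytes: Uint8Array): StoredAttachment {
    const contentHash = createHash("sha256").update(bytes).digest("hex");
    const existing = this.byHash.get(contentHash);
    if (existing !== undefined) {
      // Same content: return the first attachment unchanged.
      return existing;
    }
    const created: StoredAttachment = {
      id: randomUUID(),
      originalFilename: fileName,
      contentHash,
    };
    this.byHash.set(contentHash, created);
    return created;
  }

  get fileCount(): number {
    return this.byHash.size;
  }
}
```

Uploading the same bytes as "first.png" and then "second.png" against this model yields one stored entry whose originalFilename stays "first.png", which mirrors the assertions in this scenario.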
8. Get attachment metadata by ID
get_attachment returns the stored metadata for an uploaded attachment.
Steps:
- Upload a file and record its id.
- Call get_attachment with the id.
Expected: The response matches the metadata returned by the upload: same id, original_filename,
file_extension, content_type, size_bytes, and content_hash. The created_at and updated_at timestamps are
present and non-empty.
9. Get attachment metadata for non-existent ID returns not-found
Requesting metadata for an ID that was never uploaded returns an appropriate error.
Steps:
- Generate a random UUID (do not upload any file with this ID).
- Call get_attachment with the random UUID.
Expected: A “not found” error is returned. No crash or internal server error occurs.
10. Get attachment file content
get_attachment_file returns the raw bytes of an uploaded file.
Steps:
- Upload a file with known content (e.g., a byte array containing "test content").
- Record the id.
- Call get_attachment_file with the id.
Expected: The response body contains the original file bytes exactly. The Content-Type header matches the
attachment’s MIME type (e.g., image/png). The byte count equals the uploaded size_bytes.
11. List all attachments in the workspace
list_attachments without a page_id filter returns all attachments in the workspace.
Steps:
- Upload three files: "alpha.png", "beta.pdf", "gamma.txt".
- Call list_attachments with no page_id argument.
Expected: The response is an array of at least 3 Attachment objects. Each uploaded file appears in the list. The
list includes correct metadata for each entry.
12. List attachments scoped to a page
list_attachments filtered by page_id returns only attachments referenced by that page.
Steps:
- Create two pages: “Page Alpha” and “Page Beta”.
- Upload "alpha.png" and associate it with “Page Alpha” (via upsert_references or by embedding in the page content).
- Upload "beta.png" and associate it with “Page Beta”.
- Call list_attachments with the page_id of “Page Alpha”.
Expected: The response contains only "alpha.png". "beta.png" does not appear. The list is correctly scoped to
the specified page.
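The difference between the unfiltered listing (scenario 11) and the page-scoped listing can be sketched as a filter over a page-to-attachment reference map. The data shapes here are assumptions made for illustration; the real reference storage and the upsert_references signature are not described in this spec.

```typescript
// Hypothetical row shape; only the fields needed for the example.
interface AttachmentRow {
  id: string;
  original_filename: string;
}

// referencesByPage maps a page id to the attachment ids that page references.
function listAttachments(
  all: AttachmentRow[],
  referencesByPage: Map<string, Set<string>>,
  pageId?: string,
): AttachmentRow[] {
  if (pageId === undefined) {
    // No filter: every attachment in the workspace.
    return [...all];
  }
  const referenced = referencesByPage.get(pageId) ?? new Set<string>();
  return all.filter((row) => referenced.has(row.id));
}
```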
13. Delete an attachment — metadata and file removed
delete_attachment removes both the SQLite metadata row and the on-disk file.
Steps:
- Upload a file and record its id and file_extension.
- Verify the file exists at {workspace}/attachments/{id}.{ext}.
- Call delete_attachment with the id.
- Call get_attachment with the same id.
Expected: The delete succeeds with no error. The subsequent get_attachment call returns a “not found” error. The
on-disk file no longer exists at its original path.
14. Delete an attachment with active page references is blocked
delete_attachment (single delete) is blocked when a page still references the attachment, to prevent dangling
references.
Steps:
- Upload a file and record its id.
- Associate the attachment with a page via upsert_references.
- Call delete_attachment with the id.
Expected: The delete returns a validation error indicating the attachment is “referenced by” one or more pages. The
attachment metadata and file on disk are unchanged. get_attachment still succeeds.
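The reference guard can be modeled as follows. This is a hypothetical sketch: AttachmentIndex and its methods are invented for illustration and only capture the rule that a single delete fails while any page still references the attachment.

```typescript
type DeleteResult = { deleted: true } | { deleted: false; error: string };

// Hypothetical in-memory model of attachments and their page references.
class AttachmentIndex {
  private attachments = new Set<string>();
  private referencedBy = new Map<string, Set<string>>(); // attachment id -> page ids

  add(attachmentId: string): void {
    this.attachments.add(attachmentId);
  }

  reference(attachmentId: string, pageId: string): void {
    const pages = this.referencedBy.get(attachmentId) ?? new Set<string>();
    pages.add(pageId);
    this.referencedBy.set(attachmentId, pages);
  }

  clearReferences(attachmentId: string): void {
    this.referencedBy.delete(attachmentId);
  }

  has(attachmentId: string): boolean {
    return this.attachments.has(attachmentId);
  }

  delete(attachmentId: string): DeleteResult {
    const pages = this.referencedBy.get(attachmentId);
    if (pages !== undefined && pages.size > 0) {
      // Blocked: nothing is removed, so a later lookup still succeeds.
      return {
        deleted: false,
        error: `attachment is referenced by ${pages.size} page(s)`,
      };
    }
    this.attachments.delete(attachmentId);
    return { deleted: true };
  }
}
```

Clearing the references and deleting again succeeds in this model, which is the sequence a single-delete test needs to follow before expecting a successful delete.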
15. Per-file size limit enforcement
Files exceeding the 10 MB per-file limit (use-case layer) or 16 MB limit (Tauri command layer for bytes upload) are rejected.
Steps:
- Construct a byte array larger than 10 MB (e.g., 11 * 1024 * 1024 bytes of zeros) with file_name: "huge.pdf".
- Call upload_attachment_bytes with this oversized byte array.
Expected: A quota or validation error is returned. The error message mentions the file size limit (either “per-file limit” or “exceeds maximum”). No file is written to disk.
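Because two limits apply to a bytes upload, the layer that rejects a payload depends on its size. The sketch below assumes the Tauri command layer check runs before the use-case check and that a payload exactly at a limit is accepted; neither detail is stated in this spec, and checkUploadSize and its error strings are hypothetical.

```typescript
const PER_FILE_LIMIT_BYTES = 10 * 1024 * 1024; // use-case layer
const TAURI_BYTES_LIMIT = 16 * 1024 * 1024; // Tauri command layer (bytes upload)

type SizeCheck =
  | { accepted: true }
  | { accepted: false; layer: "tauri-command" | "use-case"; error: string };

function checkUploadSize(sizeBytes: number): SizeCheck {
  // Assumed order: the outer command layer sees the payload first.
  if (sizeBytes > TAURI_BYTES_LIMIT) {
    return {
      accepted: false,
      layer: "tauri-command",
      error: `payload of ${sizeBytes} bytes exceeds maximum of ${TAURI_BYTES_LIMIT} bytes`,
    };
  }
  if (sizeBytes > PER_FILE_LIMIT_BYTES) {
    return {
      accepted: false,
      layer: "use-case",
      error: `file of ${sizeBytes} bytes is over the per-file limit of ${PER_FILE_LIMIT_BYTES} bytes`,
    };
  }
  return { accepted: true };
}
```

Under these assumptions the 11 MB payload from the steps above is rejected at the use-case layer, while a payload over 16 MB would not reach it. This is why the expected error text allows either "per-file limit" or "exceeds maximum".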
16. Attachment metadata includes all expected fields
The Attachment struct returned from any upload command includes all required metadata fields.
Steps:
- Upload a file "metadata-check.png" with a minimal PNG byte array.
- Inspect all fields of the returned Attachment.
Expected: The returned object contains: id (non-nil UUID), original_filename ("metadata-check.png"),
file_extension ("png"), content_type (Png variant), size_bytes (> 0), content_hash (64-character hex SHA-256
string), created_at (ISO timestamp), updated_at (ISO timestamp), and sync_status ("local_only").
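A field-by-field check like the one below can keep this assertion readable. It is a sketch under assumptions: metadataProblems is a hypothetical helper, the serialized form of content_type is not specified here so only its presence is checked, and Date.parse is used as a lenient stand-in for strict ISO timestamp validation.

```typescript
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const NIL_UUID = "00000000-0000-0000-0000-000000000000";
const SHA256_HEX_RE = /^[0-9a-f]{64}$/;

// Returns the names of fields that are missing or malformed.
function metadataProblems(a: Record<string, unknown>): string[] {
  const problems: string[] = [];
  // Read a field as a string, treating anything else as empty.
  const str = (key: string): string => {
    const value = a[key];
    return typeof value === "string" ? value : "";
  };

  if (!UUID_RE.test(str("id")) || str("id") === NIL_UUID) problems.push("id");
  if (str("original_filename") === "") problems.push("original_filename");
  if (str("file_extension") === "") problems.push("file_extension");
  if (a["content_type"] === undefined || a["content_type"] === null) {
    problems.push("content_type");
  }
  const size = a["size_bytes"];
  if (typeof size !== "number" || size <= 0) problems.push("size_bytes");
  if (!SHA256_HEX_RE.test(str("content_hash"))) problems.push("content_hash");
  for (const key of ["created_at", "updated_at"]) {
    if (Number.isNaN(Date.parse(str(key)))) problems.push(key);
  }
  if (str("sync_status") !== "local_only") problems.push("sync_status");
  return problems;
}
```

An empty result means every expected field is present and well-formed; a non-empty result names the offending fields, which makes a failing test easier to read.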
17. Attachment persists after page navigation
Uploaded attachments are not lost when the user navigates away from the page and returns.
Steps:
- Upload a file and record its id.
- Navigate to a different page.
- Navigate back to the original page.
- Call get_attachment with the recorded id.
Expected: The attachment is still retrievable with all original metadata intact. No data was lost during navigation.
Test Data
| Key | Value | Notes |
|---|---|---|
| test_png_name | test.png | Filename for basic upload scenarios |
| test_png_bytes | minimal PNG magic bytes | Smallest valid PNG (26 bytes minimum) |
| disallowed_extension | exe | Must be rejected by allowlist |
| allowed_extensions | png, jpg, pdf, docx, csv | Sampling of allowed extensions for coverage |
| dedup_content | fixed byte array (same both uploads) | Identical bytes to trigger SHA-256 dedup |
| per_file_limit_bytes | 10 * 1024 * 1024 (10 MB) | Use-case level limit |
| tauri_bytes_limit | 16 * 1024 * 1024 (16 MB) | Tauri command layer limit for upload_attachment_bytes |
| storage_path_pattern | {workspace}/attachments/{uuid}.{ext} | On-disk file naming convention |
| cache_thumb_pattern | {workspace}/attachments/.cache/{uuid}_thumb.webp | Thumbnail cache path cleaned on delete |
| cache_display_pattern | {workspace}/attachments/.cache/{uuid}_display.webp | Display cache path cleaned on delete |
| default_sync_status | local_only | Initial sync status for all uploaded attachments |
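The three path patterns in the table can be derived from a single helper so that tests check the primary file and both cache variants consistently. attachmentPaths is a hypothetical name, and the sketch assumes forward-slash separators and a workspace path without a trailing slash.

```typescript
interface AttachmentPaths {
  file: string;
  thumb: string;
  display: string;
}

// Builds the on-disk locations described by storage_path_pattern,
// cache_thumb_pattern, and cache_display_pattern.
function attachmentPaths(workspace: string, id: string, ext: string): AttachmentPaths {
  const dir = `${workspace}/attachments`;
  return {
    file: `${dir}/${id}.${ext}`,
    thumb: `${dir}/.cache/${id}_thumb.webp`,
    display: `${dir}/.cache/${id}_display.webp`,
  };
}
```

After a delete, a test can assert that none of the three returned paths exist on disk.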
Notes
- The HTTP bridge router does not currently expose attachment commands (upload_attachment, get_attachment, get_attachment_file, list_attachments, delete_attachment). These are Tauri IPC commands invoked from the React frontend. E2E tests for these scenarios must run against the full desktop app rather than the bridge.
- upload_attachment validates that the file path exists and is a regular file before reading. Path traversal characters are rejected by validate_ipc_path at the Tauri command layer.
- SHA-256 deduplication is content-addressable: the get_by_hash lookup runs before file write and quota checks. If a hash match is found, the existing Attachment is returned immediately without writing a new file.
- delete_attachment (single) is blocked by active page references. BulkDeleteAttachmentsUseCase with force: true overrides this guard. Single-delete tests should verify that references are cleared before expecting a successful delete.
- When an attachment is deleted, both the primary file ({uuid}.{ext}) and any cached image variants ({uuid}_thumb.webp, {uuid}_display.webp in the .cache/ subdirectory) are removed.
- The domain validation runs at creation time via Attachment::new. Infrastructure reads use Attachment::from_parts (bypasses validation). This means a file stored with an extension that later becomes disallowed will still be readable.
- Magic-byte validation applies to image content types only. If the file extension says PNG but the bytes are JPEG magic, the upload is rejected. Non-image files (PDFs, office docs) only log a warning on mismatch.
- The content_hash field is a lowercase SHA-256 hex digest (64 characters). Tests checking this field should verify the length and character class, not an exact value.
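The magic-byte note above distinguishes a hard rejection for images from a logged warning for other files. The sketch below models that distinction with the standard PNG, JPEG, and PDF file signatures. checkMagicBytes and the verdict names are hypothetical, and treating any image whose bytes do not match its extension's signature as a rejection is an assumption that goes slightly beyond the PNG-versus-JPEG case the note describes.

```typescript
type MagicVerdict = "ok" | "reject" | "warn";

// Well-known file signatures; only the types needed for the example.
const SIGNATURES: Record<string, { magic: number[]; isImage: boolean }> = {
  png: { magic: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a], isImage: true },
  jpg: { magic: [0xff, 0xd8, 0xff], isImage: true },
  pdf: { magic: [0x25, 0x50, 0x44, 0x46], isImage: false }, // "%PDF"
};

function checkMagicBytes(extension: string, bytes: Uint8Array): MagicVerdict {
  const entry = SIGNATURES[extension.toLowerCase()];
  if (entry === undefined) {
    // No signature known for this extension in this sketch.
    return "ok";
  }
  const matches =
    bytes.length >= entry.magic.length &&
    entry.magic.every((byte, i) => bytes[i] === byte);
  if (matches) return "ok";
  // Images are rejected on mismatch; other types only warn.
  return entry.isImage ? "reject" : "warn";
}
```

Test fixtures for image uploads therefore need to start with the real signature for their extension, while a PDF fixture with arbitrary bytes would still upload in this model.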