Vector Search
Semantic search finds messages by meaning, not just keyword overlap:
a query like “planning offsite agenda” can surface a message titled
“Q2 team kickoff” if the bodies discuss the same topic, even when
none of the query words appear in the result. msgvault builds that
capability on top of the default keyword (FTS5) search by sending
message text to an embedding endpoint you configure, then storing the
vectors locally in vectors.db alongside your archive.
When vector search is enabled, the search command, the HTTP
/api/v1/search endpoint, and the MCP search_messages tool all
accept mode=vector (pure semantic) and mode=hybrid (BM25 +
vector fused with Reciprocal Rank Fusion). A separate MCP tool,
find_similar_messages, returns nearest-neighbor messages for a
given seed. The vectors and archive stay local, but embedding work is
performed by the endpoint in your config. If that endpoint is hosted
by a third party, message text and semantic query text are sent there;
use a local or self-hosted endpoint when you need the workflow to stay
on your own machine or network.
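As a rough illustration of how Reciprocal Rank Fusion combines a keyword ranking with a vector ranking, the sketch below scores each message as the sum of 1 / (k + rank) over the lists it appears in. This is the generic RRF formula, not msgvault's actual code; the function name and message IDs are hypothetical, and k = 60 is assumed here to match the rrf_k default in the config example on this page.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first.
    Returns IDs sorted by fused score, best first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for the IDs it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending; break ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["m1", "m2", "m3"]    # hypothetical keyword results
vector_hits = ["m3", "m1", "m4"]  # hypothetical semantic results
fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Messages that appear in both lists (m1, m3) outrank those found by only one signal. A larger k shrinks the gap between adjacent ranks.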
Prerequisites
- A running OpenAI-compatible embedding endpoint. msgvault does not host a model. Point it at a local, self-hosted, or hosted endpoint that you trust. Common local options include Ollama, llama.cpp’s server, and LM Studio. The endpoint must accept POST /embeddings with an OpenAI-style JSON body and return indexed data rows such as {"data": [{"index": 0, "embedding": [...]}]}.
- A build with sqlite_vec support. The standard make build target already passes -tags "fts5 sqlite_vec". If you see errors mentioning “binary was built without -tags sqlite_vec”, rebuild via make build (or go build -tags "fts5 sqlite_vec" if you are invoking go build directly).
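To make the endpoint contract more concrete, here is a minimal sketch of the request body and of parsing an indexed response. It assumes the common OpenAI-style fields (model and input in the request, data[].index and data[].embedding in the response); the helper names and the toy two-dimensional vectors are hypothetical, and nothing here calls a real server.

```python
import json


def build_embedding_request(model, texts):
    """Return the JSON body for an OpenAI-style POST /embeddings call."""
    return json.dumps({"model": model, "input": list(texts)})


def parse_embedding_response(body, expected_count, dimension):
    """Return embeddings ordered by their 'index' field, validating shape."""
    rows = sorted(json.loads(body)["data"], key=lambda r: r["index"])
    if [r["index"] for r in rows] != list(range(expected_count)):
        raise ValueError("response indexes do not cover the request batch")
    vectors = [r["embedding"] for r in rows]
    for v in vectors:
        if len(v) != dimension:
            raise ValueError(f"expected dimension {dimension}, got {len(v)}")
    return vectors


# Hypothetical response with rows out of order and a toy dimension of 2.
sample = (
    '{"data": [{"index": 1, "embedding": [0.3, 0.4]},'
    ' {"index": 0, "embedding": [0.1, 0.2]}]}'
)
vectors = parse_embedding_response(sample, expected_count=2, dimension=2)
print(vectors)
```

Sorting by index matters because a server may return rows in any order. The dimension check mirrors the need for the configured dimension to match what the model actually produces.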
Windows source builds
The sqlite-vec CGo binding needs sqlite3.h at compile time, and the
MinGW 15 toolchain needs two extra flags to link arrow-go/v18’s
helpers. The easiest path is powershell -File scripts/build.ps1,
which wires everything up automatically. To invoke go build
yourself from PowerShell:

    C:\msys64\usr\bin\pacman.exe -S --noconfirm --needed mingw-w64-x86_64-sqlite3
    $env:CGO_ENABLED = "1"
    $env:CGO_CFLAGS = "-IC:/msys64/mingw64/include -fgnu89-inline"
    $env:CGO_LDFLAGS = "-Wl,--allow-multiple-definition"
    go build -tags "fts5 sqlite_vec" -o msgvault.exe ./cmd/msgvault

Enable
Add a [vector] block to ~/.msgvault/config.toml:

    [vector]
    enabled = true
    backend = "sqlite-vec"
    # db_path defaults to <data_dir>/vectors.db when empty.
    # db_path = "/path/to/vectors.db"

    [vector.embeddings]
    endpoint = "http://tailnet-host:11434/v1"
    api_key_env = "OLLAMA_API_KEY"  # optional; omit for anonymous endpoints
    model = "nomic-embed-text"
    dimension = 768
    batch_size = 32                 # embeddings per HTTP call
    timeout = "30s"
    max_retries = 3
    max_input_chars = 2000          # see sizing guidance below

    [vector.preprocess]
    strip_quotes = true             # drop quoted reply blocks before embedding
    strip_signatures = true         # drop common `-- ` signature blocks

    [vector.search]
    rrf_k = 60                      # RRF constant; higher flattens score differences
    k_per_signal = 100              # candidate pool size per signal (BM25 or vector)
    subject_boost = 2.0             # score boost when a query term hits the subject
    max_page_size_hybrid = 50       # hard cap on vector/hybrid page_size

    [vector.embed.schedule]
    cron = "*/5 * * * *"            # embed worker cron (5-field); empty disables cron
    run_after_sync = true           # run a pass after every successful scheduled sync

The [vector] section only takes effect when enabled = true and
the binary was built with sqlite_vec. If either is missing, msgvault
behaves as before. Disabled vector search returns vector_not_enabled
from server surfaces; a binary built without sqlite_vec reports a
rebuild-with-sqlite-vec error when vector features are requested.
Matching max_input_chars to your embedder’s context window
max_input_chars is an upper bound in characters; the embedder
converts this to tokens on its own. Set it below the embedder’s
maximum context or full-length messages can fail with HTTP 400
during msgvault build-embeddings.
Practical guidance:
- 2k-token embedding models: start around max_input_chars = 2000 and raise only after confirming the endpoint accepts longer inputs.
- 8k-token embedding models: start around max_input_chars = 24000.
- Self-hosted models: match the actual context window exposed by your server, not just the upstream model card.
If msgvault build-embeddings fails with repeated HTTP 400 warnings, check
the embedder’s logs. A message such as “the input length exceeds the
context length” confirms that you need to lower max_input_chars.
Initial Embedding
Once vector search is enabled and your archive has synced or imported messages, embed it:

    msgvault build-embeddings --full-rebuild --yes

This creates a new building generation, seeds the pending queue
with every non-deleted message in your archive, drains the queue in
batches through your configured embedder, and atomically activates
the generation once every pending row has been embedded. During the
first build, when no active generation exists yet, HTTP and MCP
vector/hybrid search return index_building; use mode=fts for the
interim.
The initial embed is the largest and longest operation. Runtime is roughly proportional to archive size divided by embedding throughput.
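For a back-of-the-envelope estimate, the proportionality above can be sketched as batches multiplied by time per batch. The function name, the archive size, and the 0.5 seconds per batch are all hypothetical; measure your own endpoint before relying on any number. The sketch ignores retries, rate limits, and variation in message length.

```python
def estimate_embed_seconds(message_count, batch_size, seconds_per_batch):
    """Estimate wall-clock seconds for an initial embed pass."""
    batches = -(-message_count // batch_size)  # ceiling division
    return batches * seconds_per_batch


# Hypothetical archive of 100,000 messages, batch_size = 32 as in the
# config example, and an assumed 0.5 s per HTTP call.
eta = estimate_embed_seconds(100_000, 32, 0.5)
print(f"{eta / 60:.1f} minutes")
```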
Keeping the Index Up to Date
After the initial rebuild, new messages arriving via email sync need to be embedded as well. msgvault handles this in two ways depending on how you run it.
CLI workflow (manual syncs)
If you run msgvault sync-full or msgvault sync (alias:
sync-incremental) by hand, new Gmail and IMAP messages are
auto-enqueued into every
non-retired generation during the sync. In steady state that means the
active generation; during a rebuild it means both the old active
generation and the new building generation. Run
msgvault build-embeddings (no --full-rebuild) to drain the queue:

    # Sync new messages (auto-enqueues them for embedding)
    msgvault sync you@gmail.com

    # Drain the embedding queue into the active generation
    msgvault build-embeddings

msgvault build-embeddings without --full-rebuild is a short, incremental
operation: it picks up the configured active generation, drains any
pending rows, and exits. You can schedule it via cron, run it after
every sync, or chain it (sync && build-embeddings).
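The "every non-retired generation" rule for auto-enqueueing can be pictured with a toy model. The Generation class, its state labels, and the function name below are hypothetical illustrations and are not msgvault's schema.

```python
from dataclasses import dataclass


@dataclass
class Generation:
    id: int
    state: str  # "active", "building", or "retired" (labels assumed for this sketch)


def enqueue_targets(generations):
    """Return the IDs of generations a newly synced message is queued into."""
    return [g.id for g in generations if g.state != "retired"]


steady = [Generation(1, "retired"), Generation(2, "active")]
rebuilding = [Generation(2, "active"), Generation(3, "building")]

print(enqueue_targets(steady))      # steady state: only the active generation
print(enqueue_targets(rebuilding))  # mid-rebuild: active and building
```

Queueing into both generations during a rebuild means the new generation does not miss messages that arrive while it is being built.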
Daemon workflow (msgvault serve)
In daemon mode the scheduler can run both pieces automatically. The
[vector.embed.schedule] section controls the embed worker
independently from the sync scheduler:

    [vector.embed.schedule]
    cron = "*/5 * * * *"     # run every 5 minutes
    run_after_sync = true    # and opportunistically after every scheduled sync

With run_after_sync = true, every successful scheduled sync
triggers an immediate embed pass against the queue it just
populated. The standalone cron ensures the queue drains even when
syncs are quiet (e.g. overnight). An empty cron = "" disables the
standalone schedule (useful if you only want the post-sync
trigger).
What auto-enqueues
| Ingest path | Auto-enqueues? |
|---|---|
| sync-full / sync (Gmail, IMAP) | Yes |
| Scheduled syncs in msgvault serve | Yes |
| import-emlx (Apple Mail backup) | No. Re-run --full-rebuild after large imports |
| import-mbox / import (mbox, eml) | No. Re-run --full-rebuild after large imports |
| Chat imports (iMessage, WhatsApp, Google Voice) | No. Run a full rebuild after importing if you want chats included |
For ingest paths that do not auto-enqueue, running
msgvault build-embeddings --full-rebuild --yes rebuilds the index over the
full archive including the newly-imported messages. A same-model full
rebuild is atomic from the searcher’s perspective: vector and hybrid
queries keep answering from the previous active generation until the
new one is ready. If the rebuild changes the configured model or
dimension, vector and hybrid queries return index_stale until the
new generation activates.
Search
CLI:

    msgvault search "planning offsite agenda" --mode hybrid
    msgvault search "planning offsite agenda" --mode vector --explain
    msgvault search "..." --json --mode hybrid   # JSON output with scores

CLI vector and hybrid modes run against the local archive. If
[remote].url is configured, msgvault search --mode vector|hybrid
is rejected; call the remote server’s HTTP /api/v1/search endpoint
directly for remote vector search.
HTTP:

    curl "http://localhost:8080/api/v1/search?q=planning+offsite&mode=hybrid"
    curl "http://localhost:8080/api/v1/search?q=planning+offsite&mode=vector&explain=1"

Response shape differs from the FTS path; see the
Web Server reference for details.
Pagination is not supported for vector/hybrid responses; bump
page_size (capped at max_page_size_hybrid) instead.
mode=vector and mode=hybrid require at least one free-text term:
the free text is what gets embedded as the query vector. A query
that is purely operators (e.g. from:alice label:IMPORTANT) is
rejected; HTTP and MCP return missing_free_text. Use mode=fts for
those.
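If you call the HTTP endpoint from a script, a small URL builder keeps the parameters consistent. The sketch below only constructs the URL and sends nothing. The function name is hypothetical, and it assumes page_size is passed as a query parameter of that name, which this page implies but does not spell out.

```python
from urllib.parse import urlencode


def search_url(base, query, mode="hybrid", page_size=None, explain=False):
    """Build a /api/v1/search URL for the given query and mode."""
    if mode not in ("fts", "vector", "hybrid"):
        raise ValueError("mode must be fts, vector, or hybrid")
    params = {"q": query, "mode": mode}
    if page_size is not None:
        params["page_size"] = page_size  # assumed parameter name
    if explain:
        params["explain"] = 1
    return f"{base}/api/v1/search?{urlencode(params)}"


url = search_url("http://localhost:8080", "planning offsite",
                 mode="vector", explain=True)
print(url)
```

Because vector and hybrid responses are not paginated, a caller that wants more results would raise page_size rather than request a second page.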
MCP tools:
- search_messages accepts mode (fts / vector / hybrid) and explain arguments.
- find_similar_messages takes a seed message_id and returns nearest neighbors (excluding the seed itself). Optional account, after, before, has_attachment filters.
Model Rotation
To switch models or dimensions, update [vector.embeddings].model
and/or .dimension in your config, then run:

    msgvault build-embeddings --full-rebuild --yes

This builds a new generation with the new fingerprint and activates
it atomically when the build completes. While the rebuild is in
flight, mode=vector and mode=hybrid return index_stale (the
previously-active generation no longer matches the configured
fingerprint, so search refuses to serve potentially-mismatched
results). Use mode=fts until the new generation activates; it
does not depend on the vector index. Once msgvault build-embeddings reports
the new generation activated, vector and hybrid modes resume.
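The conditions under which vector and hybrid queries are served or refused can be summarized as a small decision function. This is a sketch of the behavior described on this page, not the server's code: the function name, the (model, dimension) tuple used as a fingerprint, and the "ok" return value are hypothetical, and the order of the checks is an assumption.

```python
def vector_search_status(enabled, active_fingerprint, configured_fingerprint):
    """Return "ok" or the error code a vector/hybrid query would receive.

    Fingerprints are modeled as (model, dimension) tuples; None means no
    generation has been activated yet.
    """
    if not enabled:
        return "vector_not_enabled"
    if active_fingerprint is None:
        return "index_building"
    if active_fingerprint != configured_fingerprint:
        return "index_stale"
    return "ok"


old = ("nomic-embed-text", 768)
new = ("other-model", 1024)  # hypothetical replacement model

print(vector_search_status(True, old, old))   # steady state
print(vector_search_status(True, old, new))   # config changed, rebuild pending
print(vector_search_status(True, None, new))  # first build still running
```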
Troubleshooting
Common HTTP/MCP error codes and fixes. The CLI reports equivalent conditions as command errors rather than structured codes.
| Error | Meaning | Recovery |
|---|---|---|
| vector_not_enabled | The server or MCP process did not wire a vector backend, usually because [vector] enabled = false. | Set enabled = true, configure [vector.embeddings], and start with a sqlite_vec build. |
| index_stale | Active generation’s model/dimension doesn’t match the configured [vector.embeddings] fingerprint. | Run msgvault build-embeddings --full-rebuild --yes. |
| index_building | No active generation yet; one is being built. | Finish running msgvault build-embeddings or wait for the scheduler. Use mode=fts for the interim. |
| missing_free_text | mode=vector or mode=hybrid used with a filter-only query (no free text to embed). | Add free-text terms to q, or switch to mode=fts. |
| pagination_unsupported | Request asked for page>1 with mode=vector or mode=hybrid. | Raise page_size (capped at max_page_size_hybrid) instead of paging. |
| invalid_mode | mode= value other than fts, vector, hybrid. | Pick one of those. |
| embedding_timeout | The embedding endpoint did not respond before the request deadline (transient: slow/cold model, network blip). | Retry; if persistent, raise [vector.embeddings].timeout or use a faster endpoint. |
msgvault build-embeddings repeatedly logs embed batch failed ... HTTP 400
and aborts after 5 consecutive failures: check the embedder’s logs.
If they say the input length exceeds the context length (Ollama)
or an equivalent token-limit error, lower max_input_chars to match
the model’s context window. See the sizing guidance above.
To confirm the binary was built with vector support:

    msgvault search "probe" --mode vector

A clear “rebuild with sqlite_vec” error indicates the tag is missing.
A different error (vector_not_enabled, index_stale, etc.) means
the command moved past the build-tag check and is now waiting on
config or backfill.
Check index health via the stats endpoint:

    curl -H "X-API-Key: ..." http://localhost:8080/api/v1/stats | jq .vector_search

The active_generation.message_count should roughly match
total_messages. pending_embeddings_total shows how many rows
still need embedding (either because a rebuild is in flight or
because recent syncs have not yet been drained).
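A monitoring script might turn those three numbers into a single health signal. The sketch below takes the values as plain arguments so that it assumes nothing about the exact JSON layout. The function name and the 0.99 threshold for "roughly match" are hypothetical choices, not msgvault defaults.

```python
def index_coverage(total_messages, embedded_messages, pending, threshold=0.99):
    """Summarize how much of the archive the active generation covers."""
    coverage = embedded_messages / total_messages if total_messages else 1.0
    return {
        "coverage": round(coverage, 3),
        # Caught up means nothing is queued and counts roughly match.
        "caught_up": pending == 0 and coverage >= threshold,
    }


healthy = index_coverage(total_messages=1000, embedded_messages=995, pending=0)
draining = index_coverage(total_messages=1000, embedded_messages=400, pending=600)
print(healthy, draining)
```

Counts may never match exactly, for example because messages deleted at the source are skipped, so a tolerance is more practical than strict equality.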
What Gets Embedded
The embedder produces one vector per message. Per-message input is
assembled from subject and body_text after preprocessing
(configurable in [vector.preprocess]):
- Optional stripping of quoted-reply blocks (> ... lines and common reply-preamble markers).
- Optional stripping of trailing signatures (lines after --).
- Truncation at max_input_chars at a UTF-8 rune boundary.
Messages deleted at the source (deleted_from_source_at IS NOT NULL)
are skipped entirely. Messages without a body_text fall back to
HTML-to-text conversion of body_html so HTML-only messages still
contribute full-body embeddings. Messages with neither body field use
the subject only; if the subject is also empty, the embedder receives
an empty string for that row.
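The preprocessing steps above can be approximated in a few lines. This is a deliberately simplified sketch, not msgvault's implementation: it only drops lines starting with ">" and cuts at the first line that is exactly "-- ", it does not detect reply-preamble markers, and joining subject and body with a newline is an assumption. The function name is hypothetical.

```python
def prepare_input(subject, body_text, max_input_chars=2000,
                  strip_quotes=True, strip_signatures=True):
    """Assemble the text sent to the embedder for one message."""
    lines = body_text.splitlines()
    if strip_signatures:
        # Drop everything from the first "-- " signature delimiter onward.
        for i, line in enumerate(lines):
            if line == "-- ":
                lines = lines[:i]
                break
    if strip_quotes:
        # Drop quoted-reply lines such as "> earlier message".
        lines = [ln for ln in lines if not ln.lstrip().startswith(">")]
    text = (subject + "\n" + "\n".join(lines)).strip()
    # Python slices strings by code point, so this never splits a character.
    return text[:max_input_chars]


body = "Let's meet Tuesday.\n> old reply\n-- \nAlice"
print(prepare_input("Offsite", body))
```

Stripping quotes and signatures before truncation means the character budget is spent on the message's own content rather than on repeated thread history.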
See Also
- Web Server: HTTP API reference (search, stats).
- Searching: Full-text search syntax.