Configuration
Config File
Default location:
| Platform | Path |
|---|---|
| macOS / Linux | ~/.msgvault/config.toml |
| Windows | C:\Users\<you>\.msgvault\config.toml |
Override the home directory, and with it the location of config.toml, with the MSGVAULT_HOME environment variable or the --home flag (see Overriding the Home Directory below).

    [data]
    # Base data directory (default: ~/.msgvault)
    data_dir = "/path/to/msgvault/data"

    # Database URL (default: {data_dir}/msgvault.db)
    database_url = "/path/to/msgvault.db"

    [oauth]
    # Path to Google OAuth client secrets JSON (required for Gmail)
    client_secrets = "/path/to/client_secret.json"

    # Named OAuth apps for Google Workspace orgs (optional)
    [oauth.apps.acme]
    client_secrets = "/path/to/acme_workspace_secret.json"

    [microsoft]
    # Azure AD app registration client ID (required for M365)
    client_id = "your-azure-app-client-id"
    # tenant_id = "your-tenant-id"  # optional, default "common"

    [log]
    # Persistent structured file logging (opt-in)
    enabled = true
    # dir = "/path/to/logs"  # default: <data_dir>/logs
    # level = "info"         # debug, info, warn, error
    # sql_trace = false      # log every SQL query (verbose)
    # sql_slow_ms = 100      # slow query threshold in ms

    [sync]
    # Gmail API rate limit (requests per second)
    rate_limit_qps = 5

    [server]
    # API server settings (used by `msgvault serve`)
    api_port = 8080
    bind_addr = "127.0.0.1"
    api_key = "your-secret-key"

    [remote]
    # Remote msgvault endpoint for CLI remote mode
    url = "http://nas-ip:8080"
    api_key = "remote-api-key"
    allow_insecure = true

    # Scheduled sync accounts
    [[accounts]]
    email = "you@gmail.com"
    schedule = "0 * * * *"
    enabled = true

    [vector]
    # Semantic and hybrid search (opt-in; requires a build with sqlite_vec)
    enabled = true
    backend = "sqlite-vec"

    [vector.embeddings]
    endpoint = "http://localhost:11434/v1"
    model = "nomic-embed-text"
    dimension = 768

Windows Paths
TOML treats backslashes inside double-quoted strings as escape characters. On Windows, this means native paths like "C:\Users\you\..." will cause a parse error.
Use one of these formats instead:

    # Forward slashes (recommended)
    client_secrets = "C:/Users/you/Downloads/client_secret.json"

    # Single-quoted string (backslashes are literal)
    client_secrets = 'C:\Users\you\Downloads\client_secret.json'

Sections
[data]
| Key | Default | Description |
|---|---|---|
| data_dir | ~/.msgvault | Base directory for all data |
| database_url | {data_dir}/msgvault.db | SQLite database path |
Attachments and OAuth tokens are stored in subdirectories of data_dir (attachments/ and tokens/ respectively). These paths are not independently configurable.
[oauth]
| Key | Default | Description |
|---|---|---|
| client_secrets | (none) | Path to Google OAuth client_secret.json (required for Gmail) |
[oauth.apps.<name>]
Named OAuth apps for Google Workspace organizations that require their own OAuth credentials. Each entry defines a separate client_secret.json. Use --oauth-app <name> with add-account to bind an account to a named app.
| Key | Default | Description |
|---|---|---|
| client_secrets | (none) | Path to the org’s client_secret.json |
See OAuth Setup: Google Workspace Accounts for when and why you need named apps.
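As a rough sketch, a config that combines the default Gmail credentials with two named Workspace apps could look like this. The `globex` name and its path are hypothetical, added only to show that each named app gets its own table.

```toml
[oauth]
# Default credentials, used when no named app is selected
client_secrets = "/path/to/client_secret.json"

[oauth.apps.acme]
client_secrets = "/path/to/acme_workspace_secret.json"

# Hypothetical second organization, for illustration only
[oauth.apps.globex]
client_secrets = "/path/to/globex_workspace_secret.json"
```

An account would then be bound to one of these with --oauth-app acme or --oauth-app globex on add-account.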
[microsoft]
Configuration for Microsoft 365 / Outlook.com OAuth. Required only if you use add-o365.
| Key | Default | Description |
|---|---|---|
| client_id | (none) | Azure AD Application (client) ID (required) |
| tenant_id | common | Azure AD tenant ID; common allows both personal and org accounts |
See OAuth Setup: Microsoft 365 for app registration steps.
[log]
Structured file logging. Disabled by default. Enable it to get persistent, machine-readable logs for troubleshooting. Each CLI invocation generates a unique run_id and stamps it on every log line it writes, so you can trace a single run across shared daily log files.
| Key | Default | Description |
|---|---|---|
| enabled | false | Turn on persistent file logging. Setting dir also enables it implicitly. |
| dir | <data_dir>/logs | Directory for log files |
| level | info | Log level: debug, info, warn, error |
| sql_trace | false | Log every SQL query at info level (verbose, for debugging) |
| sql_slow_ms | 100 | Threshold in ms above which SQL queries are logged at warn level. 0 uses the built-in default (100 ms). |
Log files are named msgvault-YYYY-MM-DD.log (UTC date), written as newline-delimited JSON. When a daily log exceeds 50 MiB it rotates to .log.1, .log.2, etc. (up to 5 rotated files).
Use msgvault logs to view and tail log files. See CLI Reference: logs.
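For illustration, a temporary troubleshooting setup built only from the keys in the table above might look like the following. The directory and the threshold are placeholder values chosen for the example, not recommendations.

```toml
[log]
enabled = true
dir = "/var/log/msgvault"  # placeholder path; omit to use <data_dir>/logs
level = "debug"
sql_trace = true           # verbose: logs every SQL query
sql_slow_ms = 250          # example threshold; the default is 100
```

Because sql_trace logs every query, it is probably best switched back off once the debugging session is over.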
[sync]
| Key | Default | Description |
|---|---|---|
| rate_limit_qps | 5 | Gmail API requests per second |
[server]
Settings for the web server started by msgvault serve. See Web Server for endpoint documentation.
| Key | Default | Description |
|---|---|---|
| api_port | 8080 | Port the server listens on |
| bind_addr | 127.0.0.1 | Bind address |
| api_key | (none) | API key for authentication |
| allow_insecure | false | Allow non-loopback binding without api_key |
| cors_origins | [] | Allowed CORS origins |
| cors_credentials | false | Allow credentials in CORS requests |
| cors_max_age | 0 | CORS preflight cache duration in seconds |
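A minimal sketch of a server that is reachable from other machines and from a browser front end, using only the keys listed above. The origin is hypothetical, and the sketch assumes that 0.0.0.0 is accepted as a bind address and that cors_origins takes a list of origin strings.

```toml
[server]
api_port = 8080
bind_addr = "0.0.0.0"        # non-loopback binding (assumed to be accepted)
api_key = "your-secret-key"  # set, so allow_insecure can stay at false
cors_origins = ["https://mail-ui.example.com"]  # hypothetical origin
cors_credentials = false
cors_max_age = 600           # cache preflight responses for 10 minutes
```

Keeping api_key set means allow_insecure does not need to be touched when binding to a non-loopback address.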
[remote]
When this section is configured, the CLI commands that support remote mode read from the remote server by default instead of the local database.
| Key | Default | Description |
|---|---|---|
| url | (none) | Remote API base URL (e.g. http://nas-ip:8080) |
| api_key | (none) | API key used by remote commands |
| allow_insecure | false | Allow HTTP remote connections |
Affected CLI commands: search, show-message, stats, list-accounts, tui.
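As a hedged example, a remote reached over HTTPS would presumably not need allow_insecure, since that key only concerns plain HTTP connections. The host name below is hypothetical.

```toml
[remote]
url = "https://msgvault.example.com"  # hypothetical host
api_key = "remote-api-key"
# allow_insecure is left at its default (false) because the URL uses HTTPS
```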
[[accounts]]
Scheduled sync accounts for the web server. Each [[accounts]] entry defines a cron schedule for automatic background syncing.
| Key | Default | Description |
|---|---|---|
| email | (required) | Gmail account email address |
| schedule | (none) | Cron expression for sync schedule (e.g., 0 * * * *) |
| enabled | true | Whether scheduled sync is active for this account |
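Because [[accounts]] is a TOML array of tables, the header is repeated once per account. A small sketch with two entries follows; the second address is hypothetical.

```toml
[[accounts]]
email = "you@gmail.com"
schedule = "0 * * * *"         # hourly, on the hour

[[accounts]]
email = "archive@example.com"  # hypothetical second account
schedule = "30 2 * * *"        # daily at 02:30
enabled = false                # entry kept, but scheduled sync is paused
```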
[vector]
Top-level toggle and backend selection for semantic/hybrid search. Requires a build with sqlite_vec support (default via make build). See Vector Search for prerequisites, initial embedding, and the full workflow.
| Key | Default | Description |
|---|---|---|
| enabled | false | Turn on vector and hybrid search. When false, mode=vector and mode=hybrid return vector_not_enabled. |
| backend | sqlite-vec | Vector backend. Only sqlite-vec is supported. |
| db_path | <data_dir>/vectors.db | Path to the co-located vectors database. |
[vector.embeddings]
External OpenAI-compatible embedding endpoint used to convert message text into vectors. msgvault does not host a model; it calls the endpoint you configure. Use a local or self-hosted endpoint (Ollama, llama.cpp server, LM Studio, etc.) when message text must stay on your machine or network. Hosted endpoints also work but receive the text being embedded.
| Key | Default | Description |
|---|---|---|
| endpoint | (required) | HTTP(S) base URL for an OpenAI-compatible embeddings API. msgvault appends /embeddings (for example, set http://localhost:11434/v1, not .../embeddings). |
| model | (required) | Model name to pass in each request (e.g., nomic-embed-text). |
| dimension | (required) | Vector dimension. Must match the model’s output dimension. |
| api_key_env | (none) | Name of an environment variable containing the API key. Omit for anonymous endpoints. |
| batch_size | 32 | Messages per HTTP call. |
| timeout | 30s | Per-request timeout. |
| max_retries | 3 | Retries per batch on transient failures. |
| max_input_chars | 32768 | Character cap per message before embedding. Set below your model’s context window (e.g., 2000 for Ollama’s default nomic-embed-text). |
The pair (model, dimension) forms the index generation fingerprint. Changing either value triggers a stale-index error on the next query until you run msgvault build-embeddings --full-rebuild.
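The following sketch shows the optional keys alongside the required ones for an endpoint that needs an API key. The endpoint URL and the environment variable name are hypothetical, and writing timeout as a quoted duration string is an assumption based on the 30s default shown in the table.

```toml
[vector.embeddings]
endpoint = "https://embeddings.example.com/v1"  # hypothetical; no trailing /embeddings
model = "nomic-embed-text"
dimension = 768                     # must match the model's output dimension
api_key_env = "EMBEDDINGS_API_KEY"  # variable name (hypothetical), not the key itself
batch_size = 16
timeout = "60s"                     # assumed format: quoted duration string
max_retries = 3
max_input_chars = 2000              # example cap; keep it below the model's context window
```

Only model and dimension feed the index fingerprint described above, so tuning batch_size or timeout should not by itself invalidate an existing index.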
[vector.preprocess]
Controls text normalization before embedding.
| Key | Default | Description |
|---|---|---|
| strip_quotes | true | Drop quoted reply blocks (> ... lines, reply preambles) before embedding. |
| strip_signatures | true | Drop trailing signature blocks (content after -- ). |
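For example, if quoted context matters for your searches, one possible configuration keeps quoted replies while still dropping signatures. This is only an illustration of the two keys above.

```toml
[vector.preprocess]
strip_quotes = false     # keep quoted reply text in the embedded input
strip_signatures = true  # still drop trailing signature blocks
```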
[vector.search]
Hybrid ranking parameters applied at query time.
| Key | Default | Description |
|---|---|---|
| rrf_k | 60 | Reciprocal Rank Fusion constant. Higher values flatten score differences between signals. |
| k_per_signal | 100 | Candidate pool size drawn from each signal (BM25 or vector) before fusion. |
| subject_boost | 2.0 | Multiplier applied when a query term matches a message’s subject line. |
| max_page_size_hybrid | 50 | Hard cap on page_size for vector/hybrid responses. Set to 0 to disable clamping. |
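For orientation, Reciprocal Rank Fusion is commonly defined as shown below, where rank_s(d) is the position of message d in the candidate list of signal s. This is the textbook form, given on the assumption that msgvault follows it; this page does not spell out the exact scoring or how subject_boost is combined with it.

```latex
\mathrm{RRF}(d) \;=\; \sum_{s \,\in\, \{\mathrm{BM25},\ \mathrm{vector}\}} \frac{1}{k + \mathrm{rank}_s(d)},
\qquad k = \texttt{rrf\_k}
```

Under that form, with k = 60 a rank-1 hit contributes 1/61 ≈ 0.0164 and a rank-10 hit contributes 1/70 ≈ 0.0143. With k = 600 the same ranks contribute about 0.00166 and 0.00164, which is the flattening effect the rrf_k description refers to.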
[vector.embed.schedule]
Optional background scheduling for the embed worker inside msgvault serve. Leaving this section empty disables scheduled embedding; you can still run msgvault build-embeddings by hand.
| Key | Default | Description |
|---|---|---|
| cron | (none) | 5-field cron expression. Empty string disables the standalone cron. |
| run_after_sync | false | When true, an embed pass runs after every successful scheduled sync. |
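A short sketch that uses both keys together; the schedule is an arbitrary example value.

```toml
[vector.embed.schedule]
cron = "15 3 * * *"    # standalone embed pass daily at 03:15
run_after_sync = true  # also embed after each successful scheduled sync
```

Either key can be used on its own: set only run_after_sync to embed right after scheduled syncs, or only cron for a fixed schedule.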
Overriding the Home Directory
By default, msgvault stores everything under ~/.msgvault (macOS/Linux) or C:\Users\<you>\.msgvault (Windows). To use a different location, you have two options:
--home flag (per-command):

    msgvault sync --home /mnt/data/msgvault

MSGVAULT_HOME environment variable (persistent):

    export MSGVAULT_HOME=/mnt/data/msgvault

Both options are equivalent: config.toml is loaded from the specified directory, and all data (database, tokens, attachments) is stored there. The --home flag takes priority over MSGVAULT_HOME.
Environment Variables
| Variable | Description |
|---|---|
| MSGVAULT_HOME | Base directory for all data (default: ~/.msgvault) |
| MSGVAULT_REMOTE_URL | Remote URL for export-token (flag > env > config) |
| MSGVAULT_REMOTE_API_KEY | Remote API key for export-token (flag > env > config) |
File Locations
All data lives under the msgvault home directory (~/.msgvault on macOS/Linux, C:\Users\<you>\.msgvault on Windows). The directory is created automatically on first use.
| File | Description |
|---|---|
| config.toml | Configuration file |
| msgvault.db | SQLite database (system of record) |
| attachments/ | Content-addressed attachment files |
| tokens/ | OAuth tokens per account |
| logs/ | Structured log files (when file logging is enabled) |
| analytics/ | Parquet cache files for TUI |