Configuration

Config File

Default location:

| Platform | Path |
| --- | --- |
| macOS / Linux | `~/.msgvault/config.toml` |
| Windows | `C:\Users\<you>\.msgvault\config.toml` |

To use a different home directory, and with it a different config file location, set the MSGVAULT_HOME environment variable or pass the --home flag (see Overriding the Home Directory).

[data]
# Base data directory (default: ~/.msgvault)
data_dir = "/path/to/msgvault/data"
# Database URL (default: {data_dir}/msgvault.db)
database_url = "/path/to/msgvault.db"
[oauth]
# Path to Google OAuth client secrets JSON (required for Gmail)
client_secrets = "/path/to/client_secret.json"
# Named OAuth apps for Google Workspace orgs (optional)
[oauth.apps.acme]
client_secrets = "/path/to/acme_workspace_secret.json"
[microsoft]
# Azure AD app registration client ID (required for M365)
client_id = "your-azure-app-client-id"
# tenant_id = "your-tenant-id" # optional, default "common"
[log]
# Persistent structured file logging (opt-in)
enabled = true
# dir = "/path/to/logs" # default: <data_dir>/logs
# level = "info" # debug, info, warn, error
# sql_trace = false # log every SQL query (verbose)
# sql_slow_ms = 100 # slow query threshold in ms
[sync]
# Gmail API rate limit (requests per second)
rate_limit_qps = 5
[server]
# API server settings (used by `msgvault serve`)
api_port = 8080
bind_addr = "127.0.0.1"
api_key = "your-secret-key"
[remote]
# Remote msgvault endpoint for CLI remote mode
url = "http://nas-ip:8080"
api_key = "remote-api-key"
allow_insecure = true
# Scheduled sync accounts
[[accounts]]
email = "you@gmail.com"
schedule = "0 * * * *"
enabled = true
[vector]
# Semantic and hybrid search (opt-in; requires a build with sqlite_vec)
enabled = true
backend = "sqlite-vec"
[vector.embeddings]
endpoint = "http://localhost:11434/v1"
model = "nomic-embed-text"
dimension = 768

Windows Paths

TOML treats a backslash inside a double-quoted string as the start of an escape sequence. A native Windows path such as "C:\Users\you\..." therefore fails to parse: \U is read as the start of a Unicode escape, and \y is not a valid escape at all.

Use one of these formats instead:

# Forward slashes (recommended)
client_secrets = "C:/Users/you/Downloads/client_secret.json"
# Single-quoted string (backslashes are literal)
client_secrets = 'C:\Users\you\Downloads\client_secret.json'

Sections

[data]

| Key | Default | Description |
| --- | --- | --- |
| `data_dir` | `~/.msgvault` | Base directory for all data |
| `database_url` | `{data_dir}/msgvault.db` | SQLite database path |

Attachments and OAuth tokens are stored in subdirectories of data_dir (attachments/ and tokens/ respectively). These paths are not independently configurable.

[oauth]

| Key | Default | Description |
| --- | --- | --- |
| `client_secrets` | | Path to Google OAuth client_secret.json (required for Gmail) |

[oauth.apps.<name>]

Named OAuth apps for Google Workspace organizations that require their own OAuth credentials. Each entry defines a separate client_secret.json. Use --oauth-app <name> with add-account to bind an account to a named app.

| Key | Default | Description |
| --- | --- | --- |
| `client_secrets` | | Path to the org’s client_secret.json |

See OAuth Setup: Google Workspace Accounts for when and why you need named apps.
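
As a rough sketch of how several named apps might sit side by side, the fragment below defines the default credentials plus two named apps. The app names (acme, initech) and every path are placeholders chosen for illustration, not values msgvault ships with.

```toml
[oauth]
# Default credentials (placeholder path)
client_secrets = "/path/to/client_secret.json"

# Hypothetical Workspace org "acme"
[oauth.apps.acme]
client_secrets = "/path/to/acme_workspace_secret.json"

# Hypothetical second org "initech"
[oauth.apps.initech]
client_secrets = "/path/to/initech_workspace_secret.json"
```

An account would then be bound to one of these apps by passing --oauth-app acme (or initech) to add-account.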

[microsoft]

Configuration for Microsoft 365 / Outlook.com OAuth. Required only if you use add-o365.

| Key | Default | Description |
| --- | --- | --- |
| `client_id` | | Azure AD Application (client) ID (required) |
| `tenant_id` | `common` | Azure AD tenant ID; `common` allows both personal and org accounts |

See OAuth Setup: Microsoft 365 for app registration steps.

[log]

Structured file logging. Disabled by default. Enable it to get persistent, machine-readable logs for troubleshooting. Each CLI invocation stamps every log line it writes with a unique run_id, so you can trace a single run even though all runs on the same day share one log file.

| Key | Default | Description |
| --- | --- | --- |
| `enabled` | `false` | Turn on persistent file logging. Setting `dir` also enables it implicitly. |
| `dir` | `<data_dir>/logs` | Directory for log files |
| `level` | `info` | Log level: debug, info, warn, error |
| `sql_trace` | `false` | Log every SQL query at info level (verbose, for debugging) |
| `sql_slow_ms` | `100` | Threshold in ms above which SQL queries are logged at warn level. 0 uses the built-in default (100 ms). |

Log files are named msgvault-YYYY-MM-DD.log (UTC date), written as newline-delimited JSON. When a daily log exceeds 50 MiB it rotates to .log.1, .log.2, etc. (up to 5 rotated files).

Use msgvault logs to view and tail log files. See CLI Reference: logs.
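
For a troubleshooting session, a fragment along the following lines might be a reasonable starting point. It relies only on the keys in the table above; the directory and the 250 ms threshold are illustrative assumptions.

```toml
[log]
# Setting dir enables file logging implicitly, so enabled = true is optional here.
dir = "/var/log/msgvault"   # placeholder path; default is <data_dir>/logs
level = "debug"             # one of: debug, info, warn, error
sql_slow_ms = 250           # illustrative: log queries slower than 250 ms at warn level
# sql_trace = true          # uncomment to log every SQL query (verbose)
```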

[sync]

| Key | Default | Description |
| --- | --- | --- |
| `rate_limit_qps` | `5` | Gmail API requests per second |

[server]

Settings for the web server started by msgvault serve. See Web Server for endpoint documentation.

| Key | Default | Description |
| --- | --- | --- |
| `api_port` | `8080` | Port the server listens on |
| `bind_addr` | `127.0.0.1` | Bind address |
| `api_key` | | API key for authentication |
| `allow_insecure` | `false` | Allow non-loopback binding without `api_key` |
| `cors_origins` | `[]` | Allowed CORS origins |
| `cors_credentials` | `false` | Allow credentials in CORS requests |
| `cors_max_age` | `0` | CORS preflight cache duration in seconds |
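
As an illustration of how these keys combine, the sketch below exposes the server beyond loopback while keeping API-key authentication on. The origin URL and the 600-second cache duration are hypothetical, and cors_origins is assumed to be an array of strings, as its [] default suggests.

```toml
[server]
api_port = 8080
bind_addr = "0.0.0.0"          # non-loopback: listen on all interfaces
api_key = "your-secret-key"    # placeholder; set your own secret
# allow_insecure = true        # only relevant when binding non-loopback WITHOUT api_key
cors_origins = ["https://mail.example.com"]   # hypothetical front-end origin
cors_credentials = false
cors_max_age = 600             # illustrative: cache preflight responses for 10 minutes
```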

[remote]

When set, CLI commands read from the remote server by default for supported operations.

| Key | Default | Description |
| --- | --- | --- |
| `url` | | Remote API base URL (e.g. http://nas-ip:8080) |
| `api_key` | | API key used by remote commands |
| `allow_insecure` | `false` | Allow HTTP remote connections |

Affected CLI commands: search, show-message, stats, list-accounts, tui.
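
A minimal sketch of a remote configuration over HTTPS, where allow_insecure can stay at its default. The hostname is hypothetical, and the comment about matching the server's key is an assumption about how the two sections relate.

```toml
[remote]
url = "https://msgvault.example.com"   # hypothetical HTTPS endpoint
api_key = "remote-api-key"             # presumably the api_key set in the server's [server] section
# allow_insecure defaults to false; per the table it is only needed for plain-HTTP URLs
```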

[[accounts]]

Scheduled sync accounts for the web server. Each [[accounts]] entry defines a cron schedule for automatic background syncing.

| Key | Default | Description |
| --- | --- | --- |
| `email` | (required) | Gmail account email address |
| `schedule` | | Cron expression for sync schedule (e.g., `0 * * * *`) |
| `enabled` | `true` | Whether scheduled sync is active for this account |
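
Because [[accounts]] is a TOML array of tables, the header is repeated once per account. The sketch below shows two entries with different schedules; the second address is a made-up example.

```toml
# Sync the first account at the top of every hour
[[accounts]]
email = "you@gmail.com"
schedule = "0 * * * *"
enabled = true

# Hypothetical second account: daily at 03:30, currently paused
[[accounts]]
email = "archive@example.com"
schedule = "30 3 * * *"
enabled = false   # keep the entry but turn off its scheduled sync
```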

[vector]

Top-level toggle and backend selection for semantic/hybrid search. Requires a build with sqlite_vec support (default via make build). See Vector Search for prerequisites, initial embedding, and the full workflow.

| Key | Default | Description |
| --- | --- | --- |
| `enabled` | `false` | Turn on vector and hybrid search. When false, `mode=vector` and `mode=hybrid` return `vector_not_enabled`. |
| `backend` | `sqlite-vec` | Vector backend. Only `sqlite-vec` is supported. |
| `db_path` | `<data_dir>/vectors.db` | Path to the co-located vectors database. |

[vector.embeddings]

External OpenAI-compatible embedding endpoint used to convert message text into vectors. msgvault does not host a model; it calls the endpoint you configure. Use a local or self-hosted endpoint (Ollama, llama.cpp server, LM Studio, etc.) when message text must stay on your machine or network. Hosted endpoints also work but receive the text being embedded.

| Key | Default | Description |
| --- | --- | --- |
| `endpoint` | (required) | HTTP(S) base URL for an OpenAI-compatible embeddings API. msgvault appends /embeddings (for example, set http://localhost:11434/v1, not .../embeddings). |
| `model` | (required) | Model name to pass in each request (e.g., nomic-embed-text). |
| `dimension` | (required) | Vector dimension. Must match the model’s output dimension. |
| `api_key_env` | | Name of an environment variable containing the API key. Omit for anonymous endpoints. |
| `batch_size` | `32` | Messages per HTTP call. |
| `timeout` | `30s` | Per-request timeout. |
| `max_retries` | `3` | Retries per batch on transient failures. |
| `max_input_chars` | `32768` | Character cap per message before embedding. Set below your model’s context window (e.g., 2000 for Ollama’s default nomic-embed-text). |

The pair (model, dimension) forms the index generation fingerprint. Changing either value triggers a stale-index error on the next query until you run msgvault build-embeddings --full-rebuild.
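
Putting the required and optional keys together, a local Ollama setup might look roughly like the sketch below. The batch size is an illustrative choice, and EMBEDDINGS_API_KEY is a hypothetical variable name shown only to indicate where api_key_env would go.

```toml
[vector]
enabled = true
backend = "sqlite-vec"

[vector.embeddings]
endpoint = "http://localhost:11434/v1"   # base URL only; msgvault appends /embeddings
model = "nomic-embed-text"
dimension = 768                          # must match the model's output dimension
batch_size = 16                          # illustrative; default is 32
max_retries = 3
max_input_chars = 2000                   # kept below the model's context window
# api_key_env = "EMBEDDINGS_API_KEY"     # hypothetical name; omit for anonymous endpoints
```

Changing model or dimension later would, per the fingerprint rule above, require msgvault build-embeddings --full-rebuild before queries work again.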

[vector.preprocess]

Controls text normalization before embedding.

| Key | Default | Description |
| --- | --- | --- |
| `strip_quotes` | `true` | Drop quoted reply blocks (> ... lines, reply preambles) before embedding. |
| `strip_signatures` | `true` | Drop trailing signature blocks (content after the "-- " delimiter). |

[vector.search]

Hybrid ranking parameters applied at query time.

| Key | Default | Description |
| --- | --- | --- |
| `rrf_k` | `60` | Reciprocal Rank Fusion constant. Higher values flatten score differences between signals. |
| `k_per_signal` | `100` | Candidate pool size drawn from each signal (BM25 or vector) before fusion. |
| `subject_boost` | `2.0` | Multiplier applied when a query term matches a message’s subject line. |
| `max_page_size_hybrid` | `50` | Hard cap on page_size for vector/hybrid responses. Set to 0 to disable clamping. |
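
To make the effect of rrf_k concrete, the fragment below annotates a tuned configuration. The comment describes the textbook form of Reciprocal Rank Fusion; msgvault's exact scoring may differ, and the non-default values are illustrative only.

```toml
[vector.search]
# Textbook RRF scores a document as the sum, over signals, of 1 / (rrf_k + rank).
# With rrf_k = 60, ranks 1 and 2 contribute 1/61 and 1/62, so a larger rrf_k
# narrows the gap between neighbouring ranks.
rrf_k = 60
k_per_signal = 200          # illustrative: widen the candidate pool from the default 100
subject_boost = 1.5         # illustrative: soften the default 2.0 subject multiplier
max_page_size_hybrid = 50   # 0 would disable clamping
```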

[vector.embed.schedule]

Optional background scheduling for the embed worker inside msgvault serve. Leaving this section empty disables scheduled embedding; you can still run msgvault build-embeddings by hand.

| Key | Default | Description |
| --- | --- | --- |
| `cron` | | 5-field cron expression. Empty string disables the standalone cron. |
| `run_after_sync` | `false` | When true, an embed pass runs after every successful scheduled sync. |
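
The two keys can be combined. A sketch, with an arbitrary nightly time chosen for illustration:

```toml
[vector.embed.schedule]
cron = "15 2 * * *"      # illustrative: standalone embed pass daily at 02:15
run_after_sync = true    # also run an embed pass after each successful scheduled sync
```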

Overriding the Home Directory

By default, msgvault stores everything under ~/.msgvault (macOS/Linux) or C:\Users\<you>\.msgvault (Windows). To use a different location, you have two options:

--home flag (per-command):

msgvault sync --home /mnt/data/msgvault

MSGVAULT_HOME environment variable (persistent):

export MSGVAULT_HOME=/mnt/data/msgvault

Both options are equivalent: config.toml is loaded from the specified directory, and all data (database, tokens, attachments) is stored there. The --home flag takes priority over MSGVAULT_HOME.

Environment Variables

| Variable | Description |
| --- | --- |
| `MSGVAULT_HOME` | Base directory for all data (default: `~/.msgvault`) |
| `MSGVAULT_REMOTE_URL` | Remote URL for export-token (flag > env > config) |
| `MSGVAULT_REMOTE_API_KEY` | Remote API key for export-token (flag > env > config) |

File Locations

All data lives under the msgvault home directory (~/.msgvault on macOS/Linux, C:\Users\<you>\.msgvault on Windows). The directory is created automatically on first use.

| File | Description |
| --- | --- |
| `config.toml` | Configuration file |
| `msgvault.db` | SQLite database (system of record) |
| `attachments/` | Content-addressed attachment files |
| `tokens/` | OAuth tokens per account |
| `logs/` | Structured log files (when file logging is enabled) |
| `analytics/` | Parquet cache files for TUI |