Configuration reference¶

ZettelForge reads its configuration from config.yaml in the working directory or project root. On the default SQLite backend, storage, embeddings, and recall work without this file. Before you run LLM-dependent features such as extraction and synthesis, set the LLM provider and model explicitly.

cp config.default.yaml config.yaml
$EDITOR config.yaml

config.yaml is listed in .gitignore, so you can put credentials in it without committing them.

Resolution order¶

ZettelForge resolves each setting from these sources, highest priority first:

Environment variables (ZETTELFORGE_*, TYPEDB_*, AMEM_*)
config.yaml in the working directory or project root
config.default.yaml in the project root
Hardcoded defaults in src/zettelforge/config.py
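
As a minimal sketch of the precedence above, the fragment below annotates two settings with the environment variables that would override them. It uses only keys and variables documented on this page; the comments describe the expected outcome of the resolution order rather than verified runtime behavior.

```yaml
# config.yaml (priority 2): overrides config.default.yaml and the hardcoded defaults
backend: sqlite        # superseded if ZETTELFORGE_BACKEND is set in the environment
storage:
  data_dir: ~/.amem    # superseded if AMEM_DATA_DIR is set in the environment
```

Any key left out of config.yaml falls through to config.default.yaml and then to the defaults in src/zettelforge/config.py.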

Quick reference¶

Section	What it controls
`storage`	Where ZettelForge writes data
`backend`	SQLite (default) or TypeDB graph store
`embedding`	Embedding provider and model
`llm`	LLM provider, model, token limits
`llm_ner`	Background named-entity recognition
`extraction`	Two-phase fact extraction pipeline
`retrieval`	Recall scoring and cross-encoder reranking
`synthesis`	RAG answer generation
`governance`	Validation, PII detection, operation limits, memory defense
`lance`	LanceDB version-history cleanup
`cache`	TypeDB query result cache
`logging`	Log level and verbosity
`web`	Web management interface
`opencti`	OpenCTI connection settings (in progress)

`storage`¶

Where ZettelForge stores notes, vector indexes, and entity indexes.

Key	Default	Description
`data_dir`	`~/.amem`	Root data directory. `~` expands to the current user's home directory.

Environment variable: AMEM_DATA_DIR

storage:
  data_dir: ~/.amem

`backend`¶

Which database backend ZettelForge uses for notes, knowledge graphs, and entity indexes.

Value	Description
`sqlite`	SQLite plus LanceDB. Default. Zero-dependency, no external services, ACID-safe. Runs fully offline.
`typedb`	TypeDB 3.x STIX knowledge graph. Requires a running TypeDB container and an additional extension beyond the default pip install.

Default: sqlite

Environment variable: ZETTELFORGE_BACKEND

backend: sqlite

For most users, the SQLite backend is the right choice. If you need the TypeDB graph backend, see your extension's documentation. ThreatRecall.ai provides a hosted multi-tenant knowledge graph, so you do not have to run TypeDB yourself.

`typedb`¶

TypeDB connection settings. Only used when backend: typedb.

Key	Default	Description
`host`	`localhost`	TypeDB server hostname or IP.
`port`	`1729`	TypeDB port.
`database`	`zettelforge`	TypeDB database name.
`username`	`""`	TypeDB username. Use `${TYPEDB_USERNAME}` to pull from the environment.
`password`	`""`	TypeDB password. Use `${TYPEDB_PASSWORD}` to pull from the environment.

Environment variables: TYPEDB_HOST, TYPEDB_PORT, TYPEDB_DATABASE, TYPEDB_USERNAME, TYPEDB_PASSWORD

Never commit credentials to version control. Use environment references:

typedb:
  host: localhost
  port: 1729
  database: zettelforge
  username: ${TYPEDB_USERNAME}
  password: ${TYPEDB_PASSWORD}

`embedding`¶

Embedding model used for semantic vector search.

Key	Default	Description
`provider`	`fastembed`	`fastembed` (in-process ONNX) or `ollama` (HTTP server).
`url`	`http://127.0.0.1:11434`	Ollama server URL. Used only when `provider: ollama`.
`model`	`nomic-ai/nomic-embed-text-v1.5-Q`	Embedding model name.
`dimensions`	`768`	Output dimension. Must match the model's actual output. Changing the model requires updating this field and running `rebuild_index.py` to re-embed all notes.

Environment variables: ZETTELFORGE_EMBEDDING_PROVIDER, AMEM_EMBEDDING_URL, AMEM_EMBEDDING_MODEL

Common model dimensions:

Model	Dimensions
`nomic-ai/nomic-embed-text-v1.5-Q`	768
`nomic-ai/nomic-embed-text-v1.5`	768
`nomic-embed-text-v2-moe` (Ollama)	768
`qwen3-embedding` (Ollama)	4096

embedding:
  provider: fastembed
  model: nomic-ai/nomic-embed-text-v1.5-Q
  dimensions: 768

Dimension mismatch

If you change the embedding model, update dimensions to match and run rebuild_index.py. Mismatched dimensions cause empty search results without a clear error.
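
For illustration, switching to an Ollama-served model from the table above might look like the following. The values come from the tables on this page; treat it as a sketch and confirm the dimension against the model you actually run.

```yaml
embedding:
  provider: ollama
  url: http://127.0.0.1:11434   # used only when provider is ollama
  model: qwen3-embedding
  dimensions: 4096              # must match the model's actual output size
```

After a change like this, run rebuild_index.py so that existing notes are re-embedded at the new dimension.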

`llm`¶

Language model for fact extraction, intent classification, causal triple extraction, and RAG synthesis.

Key	Default	Description
`provider`	`ollama`	Backend provider: `ollama`, `local`, `litellm`, or `mock`.
`model`	`qwen3.5:9b`	Source default in v2.7.0. This value is unresolved upstream; set it explicitly to a model your provider can load, such as an installed Ollama tag or a local HuggingFace model ID.
`url`	`http://localhost:11434`	Provider URL. Used by `ollama` and `local` providers.
`api_key`	`""`	API key for cloud providers. Use `${ENV_VAR}` syntax — never commit raw keys.
`temperature`	`0.1`	Sampling temperature (0.0–1.0). Lower values are more deterministic.
`timeout`	`180.0`	Request timeout in seconds. Raised from 60s in v2.5.2 to accommodate reasoning models.
`max_retries`	`2`	Number of retry attempts on transient failure.
`fallback`	`""`	Name of a backup provider invoked on primary failure. Empty preserves the implicit `local → ollama` fallback chain.
`local_backend`	`llama-cpp-python`	In-process inference engine when `provider: local`. Options: `llama-cpp-python` (GGUF via llama-cpp-python, requires `pip install zettelforge[local]`) or `onnxruntime-genai` (ONNX, requires `pip install zettelforge[local-onnx]`).
`max_tokens`	`400`	Default output token limit.
`max_tokens_causal`	`8000`	Token limit for causal triple extraction.
`max_tokens_synthesis`	`2500`	Token limit for RAG synthesis.
`max_tokens_extraction`	`2500`	Token limit for fact extraction.
`max_tokens_ner`	`2500`	Token limit for LLM NER.
`max_tokens_evolve`	`2500`	Token limit for memory evolution.
`reasoning_model`	`false`	Set `true` when using a reasoning-optimized model (e.g. QwQ, DeepSeek-R1). Adjusts prompt framing to account for extended chain-of-thought output.
`extra`	`{}`	Provider-specific kwargs forwarded to the constructor. Example: `filename: qwen2.5-3b-instruct-q4_k_m.gguf` for the local provider.

Environment variables: ZETTELFORGE_LLM_PROVIDER, ZETTELFORGE_LLM_MODEL, ZETTELFORGE_LLM_URL, ZETTELFORGE_LLM_API_KEY, ZETTELFORGE_LLM_TIMEOUT, ZETTELFORGE_LLM_MAX_RETRIES, ZETTELFORGE_LLM_FALLBACK, ZETTELFORGE_LLM_LOCAL_BACKEND, ZETTELFORGE_LLM_MAX_TOKENS, ZETTELFORGE_LLM_MAX_TOKENS_CAUSAL, ZETTELFORGE_LLM_MAX_TOKENS_SYNTHESIS, ZETTELFORGE_LLM_MAX_TOKENS_EXTRACTION, ZETTELFORGE_LLM_MAX_TOKENS_NER, ZETTELFORGE_LLM_MAX_TOKENS_EVOLVE, ZETTELFORGE_LLM_REASONING_MODEL

Set the LLM model explicitly

The v2.7.0 source still contains qwen3.5:9b in LLMConfig and config.default.yaml. This documentation has not verified that the string is a working Ollama tag. Use ZETTELFORGE_LLM_MODEL or llm.model to name a model available in your environment. The Ollama provider code falls back to qwen2.5:3b only when no model is passed to that provider, but the shared configuration currently passes the source default, so that fallback does not take effect under the default configuration.

Provider examples¶

# Ollama on localhost with an explicit installed model
llm:
  provider: ollama
  model: qwen2.5:3b
  url: http://localhost:11434

# In-process GGUF — fully offline (requires zettelforge[local])
llm:
  provider: local
  local_backend: llama-cpp-python
  model: Qwen/Qwen2.5-3B-Instruct-GGUF
  extra:
    filename: qwen2.5-3b-instruct-q4_k_m.gguf

# LiteLLM routing — OpenAI, Anthropic, Google, Groq, and more
# (requires zettelforge[litellm])
llm:
  provider: litellm
  model: gpt-4o
  api_key: ${OPENAI_API_KEY}

# LiteLLM — Anthropic
llm:
  provider: litellm
  model: claude-sonnet-4-20250514
  api_key: ${ANTHROPIC_API_KEY}

# LiteLLM — Groq
llm:
  provider: litellm
  model: groq/llama-3.3-70b-versatile
  api_key: ${GROQ_API_KEY}
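
The reasoning-related keys from the table can be combined as in the sketch below. The model tag is an illustrative assumption, not a tested recommendation; substitute a reasoning model that is installed in your environment. The other values are the documented defaults, written out for visibility.

```yaml
# Ollama with a reasoning-optimized model (model tag is an assumption)
llm:
  provider: ollama
  model: deepseek-r1:8b       # hypothetical choice; use any installed reasoning model
  url: http://localhost:11434
  reasoning_model: true       # adjusts prompt framing for chain-of-thought output
  timeout: 180.0              # documented default, sized for reasoning models
  max_tokens_causal: 8000     # documented default for causal triple extraction
```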

`llm_ner`¶

Background named-entity recognition. Runs after every remember() call to extract persons, locations, organizations, events, activities, and temporal entities using the configured LLM. Regex extraction always runs regardless of this setting.

Key	Default	Description
`enabled`	`true`	Set `false` to disable LLM NER, for example during performance testing or in offline environments.

Environment variable: ZETTELFORGE_LLM_NER_ENABLED
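
A minimal fragment that turns LLM NER off, assuming the section is a top-level key like the others on this page:

```yaml
llm_ner:
  enabled: false   # regex extraction still runs after every remember() call
```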

`extraction`¶

Controls the two-phase remember_with_extraction() pipeline. Phase 1 extracts salient facts with importance scores. Phase 2 compares each fact to existing notes and decides ADD, UPDATE, DELETE, or NOOP.

Key	Default	Description
`max_facts`	`5`	Maximum facts extracted per document.
`min_importance`	`3`	Minimum importance score (1–10) to keep a fact. Higher values produce fewer, more selective extractions.

extraction:
  max_facts: 5
  min_importance: 3

`retrieval`¶

Controls how recall() finds and ranks memories. Blended retrieval merges vector similarity with knowledge graph traversal results.

Key	Default	Description
`default_k`	`10`	Number of results returned per query.
`similarity_threshold`	`0.25`	Minimum cosine similarity score (0.0–1.0) to include a result.
`entity_boost`	`2.5`	Multiplicative score boost per overlapping named entity between the query and a note.
`max_graph_depth`	`2`	Maximum hops to traverse in the knowledge graph during entity augmentation.
`rerank_enabled`	`true`	Whether to apply cross-encoder reranking after initial retrieval. Improves relevance at the cost of additional ONNX inference.
`rerank_max_candidates`	`8`	Number of candidates passed to the cross-encoder. Tuned for the CTI benchmark suite (2026-06-09).
`rerank_doc_chars`	`256`	Number of characters from each candidate document given to the reranker.

Environment variable: ZETTELFORGE_RERANK_ENABLED
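
Written out as YAML, the documented defaults would look like this. The layout is assumed to follow the same top-level-section pattern as the other examples on this page.

```yaml
retrieval:
  default_k: 10
  similarity_threshold: 0.25   # cosine similarity, 0.0 to 1.0
  entity_boost: 2.5            # multiplicative boost per overlapping entity
  max_graph_depth: 2           # knowledge-graph hops during entity augmentation
  rerank_enabled: true
  rerank_max_candidates: 8
  rerank_doc_chars: 256
```

Setting rerank_enabled to false skips the cross-encoder pass and its ONNX inference cost, at the expense of relevance.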

`synthesis`¶

Controls how synthesize() generates answers from retrieved memories.

Key	Default	Description
`max_context_tokens`	`3000`	Maximum tokens of retrieved context passed to the LLM for answer generation.
`default_format`	`direct_answer`	Output format when none is specified. Options: `direct_answer`, `synthesized_brief`, `timeline_analysis`, `relationship_map`.
`tier_filter`	`["A", "B"]`	Which epistemic tiers to include. A = authoritative, B = operational, C = support/speculative.

Format availability

direct_answer is available in all configurations. synthesized_brief, timeline_analysis, and relationship_map are available on ThreatRecall.ai (hosted SaaS). The OSS build falls back to direct_answer if an unavailable format is requested.
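
A sketch of the section with its documented defaults, assuming the same top-level layout as the other sections:

```yaml
synthesis:
  max_context_tokens: 3000
  default_format: direct_answer   # the only format available in the OSS build
  tier_filter: ["A", "B"]         # add "C" to include support/speculative notes
```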

`governance`¶

Validation rules applied before remember() and recall() operations.

Top-level keys¶

Key	Default	Description
`enabled`	`true`	Master governance switch. Set `false` to skip all validation (testing only).
`min_content_length`	`1`	Minimum content length in characters for a `remember()` call to proceed.
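
As a small sketch, the top-level keys with their documented defaults:

```yaml
governance:
  enabled: true            # set false only for testing; skips all validation
  min_content_length: 1    # characters required for remember() to proceed
```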

`governance.pii`¶

PII detection via Microsoft Presidio (optional). Requires pip install zettelforge[pii]. Disabled by default; no new dependencies unless activated.

Note: ZettelForge's CTI allowlist excludes IP addresses, URLs, and domain names from PII detection since these are legitimate threat intelligence indicators.

Key	Default	Description
`enabled`	`false`	Enable PII scanning.
`action`	`log`	What to do when PII is detected: `log` (warn only, pass through), `redact` (replace with placeholder), `block` (raise exception).
`redact_placeholder`	`[REDACTED]`	Replacement text when `action: redact`.
`entities`	`[]`	PII entity types to detect. Empty list detects all supported types.
`language`	`en`	Language for NLP model.
`nlp_model`	`en_core_web_sm`	spaCy model for Presidio.

Environment variables: ZETTELFORGE_PII_ENABLED, ZETTELFORGE_PII_ACTION
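
The fragment below sketches a redacting setup. It assumes that `pii` nests under `governance`, as the section name suggests, and it requires the zettelforge[pii] extra to be installed.

```yaml
governance:
  pii:
    enabled: true
    action: redact
    redact_placeholder: "[REDACTED]"   # quoted so YAML does not parse it as a list
    entities: []                       # empty list detects all supported types
    language: en
    nlp_model: en_core_web_sm
```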

`governance.limits`¶

Operation limits for DoS mitigation. A value of 0 disables the limit.

Key	Default	Description
`max_content_length`	`52428800` (50 MB)	Maximum content size in bytes for a single `remember()` call.
`recall_timeout_seconds`	`30.0`	Maximum time in seconds allowed for a `recall()` query.

Environment variables: ZETTELFORGE_LIMITS_MAX_CONTENT_LENGTH, ZETTELFORGE_LIMITS_RECALL_TIMEOUT
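
The documented defaults, written as YAML under the assumption that `limits` nests under `governance`:

```yaml
governance:
  limits:
    max_content_length: 52428800    # 50 MB in bytes; 0 disables the limit
    recall_timeout_seconds: 30.0    # 0 disables the limit
```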

`governance.memory_defense`¶

Write-time anomaly detection against memory poisoning attempts (SEC-011 / MemSAD). Requires at least min_calibration_notes stored notes before it activates — new instances operate unchecked until calibration is reached.

Key	Default	Description
`enabled`	`true`	Enable memory defense.
`mode`	`audit`	Response mode: `audit` (log only), `block` (reject anomalous writes), `quarantine` (write to quarantine log, not main store).
`min_calibration_notes`	`50`	Minimum notes required before anomaly detection activates.
`max_reference_notes`	`50`	Number of recent notes used as the anomaly baseline.
`kappa`	`2.0`	Anomaly detection sensitivity multiplier. Higher values flag fewer writes.
`lexical_weight`	`0.25`	Weight given to lexical similarity in the anomaly score (versus semantic similarity).
`ngram_size`	`3`	N-gram size for lexical comparison.
`monitored_domains`	`[]`	Domains to monitor. Empty list monitors all domains.
`quarantine_path`	`""`	Path to the quarantine JSONL file. Defaults to `<data_dir>/quarantine/memory_anomalies.jsonl`.
`quarantine_raw_content`	`true`	Include the raw content in quarantine entries for forensic inspection.

Environment variables: ZETTELFORGE_MEMORY_DEFENSE_ENABLED, ZETTELFORGE_MEMORY_DEFENSE_MODE, ZETTELFORGE_MEMORY_DEFENSE_MIN_CALIBRATION, ZETTELFORGE_MEMORY_DEFENSE_KAPPA
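
One possible quarantine setup is sketched below, assuming `memory_defense` nests under `governance`. Only `mode` and `quarantine_raw_content` differ from the documented defaults.

```yaml
governance:
  memory_defense:
    enabled: true
    mode: quarantine               # anomalous writes go to the quarantine log
    min_calibration_notes: 50      # detection stays inactive below this count
    kappa: 2.0                     # higher values flag fewer writes
    quarantine_path: ""            # empty uses <data_dir>/quarantine/memory_anomalies.jsonl
    quarantine_raw_content: false  # omit raw content from quarantine entries
```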

`lance`¶

LanceDB version-history cleanup daemon. LanceDB accumulates version files on every write; this daemon prunes old versions to prevent unbounded disk growth.

Key	Default	Description
`cleanup_interval_minutes`	`60`	How often the cleanup daemon runs. Set to `0` to disable cleanup.
`cleanup_older_than_seconds`	`3600`	LanceDB version files older than this are eligible for deletion.
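
For reference, the documented defaults as a YAML fragment (layout assumed to match the other top-level sections):

```yaml
lance:
  cleanup_interval_minutes: 60       # 0 disables cleanup
  cleanup_older_than_seconds: 3600   # versions older than one hour are eligible
```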

`cache`¶

In-memory cache for TypeDB query results. Reduces round-trips for frequently accessed entities and relationships. Has no effect when backend: sqlite.

Key	Default	Description
`ttl_seconds`	`300`	Cache entry lifetime in seconds. `0` disables the cache.
`max_entries`	`1024`	Maximum number of cached entries.
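
A sketch of the cache section alongside the backend setting it depends on, using the documented defaults:

```yaml
backend: typedb     # the cache has no effect on the sqlite backend
cache:
  ttl_seconds: 300  # 0 disables the cache
  max_entries: 1024
```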

`logging`¶

Key	Default	Description
`level`	`INFO`	Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`.
`log_intents`	`true`	Log classified query intents (useful for tuning intent routing).
`log_causal`	`true`	Log causal triple extraction events.
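
A debugging-oriented fragment might look like this; only `level` differs from the documented defaults.

```yaml
logging:
  level: DEBUG
  log_intents: true   # useful when tuning intent routing
  log_causal: true
```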

`web`¶

Web management interface: a single-page application served at GET /. It provides note search, knowledge-graph exploration, live logs, and bulk ingestion.

Key	Default	Description
`enabled`	`true`	Enable the web UI. Set `false` for library-only use.
`host`	`0.0.0.0`	Address to bind. Use `127.0.0.1` to restrict to localhost.
`port`	`8088`	Port to serve on.

Environment variables: ZETTELFORGE_WEB_ENABLED, ZETTELFORGE_WEB_PORT, ZETTELFORGE_WEB_UI_DIR
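
Because the default host binds to all interfaces, a localhost-only setup is sketched below using the keys from the table:

```yaml
web:
  enabled: true
  host: 127.0.0.1   # restrict the UI to the local machine
  port: 8088
```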

`opencti`¶

Connection settings for OpenCTI integration. Automatic sync is still under development; the keys below are parsed now and reserved for future use. See Configure OpenCTI for current integration guidance.

Key	Default	Description
`url`	`http://localhost:8080`	OpenCTI API base URL.
`token`	`""`	OpenCTI API token. Use `${OPENCTI_TOKEN}` to pull from the environment.
`sync_interval`	`0`	Polling interval in seconds. `0` disables automatic sync.

Environment variables: OPENCTI_URL, OPENCTI_TOKEN, OPENCTI_SYNC_INTERVAL
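
A sketch of the section that keeps the token out of the file by using an environment reference, with the other keys at their documented defaults:

```yaml
opencti:
  url: http://localhost:8080
  token: ${OPENCTI_TOKEN}   # resolved from the environment
  sync_interval: 0          # 0 disables automatic sync
```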
