Architecture Deep Dive
This document provides a comprehensive technical reference for ZettelForge v2.4.0, covering all major subsystems, their interactions, and implementation details.
System Overview
ZettelForge is a hybrid-storage agentic memory system with 57 core Python modules organized into 10 functional layers. It processes threat intelligence through extraction, storage, retrieval, and synthesis pipelines.
Core Statistics
| Metric |
Value |
| Python modules |
57 |
| Total source files |
1,196 |
| Documentation files |
709 |
| Package size |
584 MB |
| Public API items |
24 |
| Test files |
43 (206 tests) |
Module Organization
Layer Hierarchy
zettelforge/
├── Core API (memory_manager.py, __init__.py)
├── Storage (sqlite_backend.py, storage_backend.py, memory_store.py, vector_memory.py)
├── Retrieval (vector_retriever.py, graph_retriever.py, blended_retriever.py)
├── Knowledge Graph (knowledge_graph.py, ontology.py, entity_indexer.py)
├── Entity Processing (alias_resolver.py, fact_extractor.py, note_constructor.py)
├── Synthesis (synthesis_generator.py, synthesis_validator.py)
├── LLM Integration (llm_client.py, llm_providers/, intent_classifier.py)
├── Detection Rules (sigma/, yara/, detection/)
├── MCP Server (mcp/)
└── Integrations (integrations/)
Import Dependencies
Most imported internal modules (analyzed via grep):
zettelforge.log (19 imports) — Structured logging
pathlib.Path (11 imports) — Path handling
threading (9 imports) — Background workers
datetime.datetime (9 imports) — Temporal tracking
zettelforge.note_schema.MemoryNote (8 imports) — Core data type
Data Pipeline
Ingestion Flow (remember())
graph LR
A[Content] --> B[Entity Extraction]
B --> C[Governance Validation]
C --> D[Alias Resolution]
D --> E[Dual-Stream Write]
E --> F[SQLite Persistence]
E --> G[LanceDB Vector]
F --> H[Knowledge Graph Update]
G --> I[Background Enrichment]
Dual-Stream Write Path:
- Fast path: Returns in ~45ms after SQLite + LanceDB write
- Slow path: Background worker handles causal extraction, LLM NER
Retrieval Flow (recall())
graph LR
A[Query] --> B[Intent Classification]
B --> C{Intent Type}
C -->|Factual| D[Entity Index 0.7]
C -->|Relational| E[Graph Traversal 0.5]
C -->|Exploratory| F[Vector Search 0.5]
D --> G[Blended Fusion]
E --> G
F --> G
G --> H[Reranking]
H --> I[Results]
Storage Architecture
Hybrid Storage Model
| Layer |
Technology |
Purpose |
Data |
| Structured |
SQLite |
Notes, KG, entities |
35 columns per note |
| Vector |
LanceDB |
Semantic search |
768-dim embeddings |
| Cache |
In-memory |
Hot data |
Node/edge caches |
SQLite Schema
notes table (35 columns):
- id, created_at, updated_at — Lifecycle
- content_raw, source_type, source_ref — Content
- embedding_vector, embedding_model — Vector metadata
- entities, domain, tier, confidence — Semantic
- superseded_by, supersedes — Versioning
kg_nodes table: node_id, entity_type, entity_value, properties
kg_edges table: edge_id, from_node_id, to_node_id, relationship, note_id
LanceDB Configuration
- Default model: nomic-ai/nomic-embed-text-v1.5-Q (768-dim)
- Index type: IVF_FLAT (avoids double-quantization issues)
- Provider: fastembed (ONNX, in-process) or Ollama (HTTP)
- Fallback: Deterministic mock embeddings
19 Entity Types
| Category |
Types |
Method |
| CTI |
CVE, intrusion_set, actor, tool, campaign, attack_pattern |
Regex |
| IOCs |
IPv4, domain, URL, MD5, SHA1, SHA256, email |
Regex |
| Conversational |
person, location, organization, event, activity, temporal |
LLM |
Regex Patterns (Examples)
CVE: r"(CVE-\d{4}-\d{4,})"
Intrusion Set: r"\b((?:apt|unc|ta|fin|temp)\s*-?\s*\d+)\b"
Attack Pattern: r"\b(T\d{4}(?:\.\d{3})?)\b"
Knowledge Graph
JSONL-Based Storage
- Nodes:
kg_nodes.jsonl — entity_type, entity_value, properties
- Edges:
kg_edges.jsonl — from_node, to_node, relationship, provenance
- Temporal index: Separate index for time-based queries
Traversal Algorithm
score = 1.0 / (1.0 + hop_distance) # BFS scoring
max_depth = 2 # Configurable
STIX 2.1 Ontology
| Entity Type |
Required Fields |
Optional Fields |
| ThreatActor |
name |
aliases, country, motivation |
| Vulnerability |
cve_id |
cvss_v3_score, epss_score, cisa_kev |
| IntrusionSet |
name |
aliases, first_seen, goals |
Retrieval System
Three-Stage Pipeline
- Vector Retrieval: LanceDB cosine similarity + entity boost (2.5x)
- Graph Retrieval: BFS from query entities with hop-distance scoring
- Blended Fusion: Min-max normalized score combination
Intent-Based Policies
| Intent |
Vector |
Entity |
Graph |
Temporal |
Use Case |
| FACTUAL |
0.3 |
0.7 |
0.2 |
0.0 |
CVE lookups |
| RELATIONAL |
0.2 |
0.2 |
0.5 |
0.1 |
Tool attribution |
| CAUSAL |
0.1 |
0.1 |
0.6 |
0.2 |
Root cause analysis |
| EXPLORATORY |
0.5 |
0.2 |
0.2 |
0.1 |
General research |
Fusion Algorithms
Normalized Score Fusion (default):
vector_norm = (score - min) / (max - min)
combined = (vector_norm * w_v) + (graph_norm * w_g)
Reciprocal Rank Fusion (alternative):
rrf_score = sum(1.0 / (k + rank) for rank in ranks)
Synthesis System
| Format |
Schema |
Use Case |
| direct_answer |
{answer, confidence, sources} |
Quick facts |
| synthesized_brief |
{summary, themes[], confidence} |
Executive summary |
| timeline_analysis |
{timeline[{date, event}], confidence} |
Incident reconstruction |
| relationship_map |
{entities[], relationships[]} |
Threat landscape |
Two-Stage Retrieval (Robust Path)
- Entity-based recall: Extract entities → recall by entity (WORKING)
- Vector fallback: Semantic similarity if insufficient results
Context Assembly
- Max 10 notes
- 500 chars per note (truncated)
- Max 3000 tokens total
- Token estimation:
len(text) // 4
LLM Integration
Provider Chain
- fastembed (ONNX): 7ms/embed, in-process, default
- Ollama (HTTP): ~30ms/embed, optional
- Mock: Deterministic fallback
LLM Configuration
| Setting |
Default |
Options |
| provider |
ollama |
local, ollama, mock |
| model |
qwen3.5:9b |
Any Ollama model |
| temperature |
0.1 |
0.0-1.0 |
| timeout |
60s |
Configurable |
Latency Benchmarks (v2.4.0)
| Operation |
Latency |
Notes |
| remember() fast path |
~45ms |
Excludes background work |
| Embedding (fastembed) |
~7ms |
nomic-embed-text-v1.5-Q |
| CTI recall p50 |
111ms |
8x improvement from v2.0.0 |
| LOCOMO recall p50 |
157ms |
8x improvement from v2.0.0 |
| Cross-encoder rerank |
~20ms |
ms-marco-MiniLM |
Memory Usage
| Component |
Size |
| fastembed model |
~130 MB |
| Cross-encoder |
~80 MB |
| LLM (Qwen2.5-3b) |
~2 GB |
| Total |
~2.2 GB + data |
8x Latency Reduction (v2.4.0)
| Benchmark |
v2.0.0 |
v2.4.0 |
Improvement |
| CTI p50 |
844ms |
111ms |
7.6x |
| LOCOMO p50 |
1,240ms |
157ms |
7.9x |
Key Optimizations:
- Cross-encoder reranking (ms-marco-MiniLM)
- IVF_FLAT index in LanceDB
- Entity-augmented recall
- fastembed (ONNX) vs Ollama HTTP
Security & Governance
OCSF Audit Logging
All operations emit structured events:
- log_api_activity() — remember/recall calls
- log_authorization() — Access decisions
- log_file_activity() — Storage operations
Epistemic Tiers
| Tier |
Confidence |
Use |
| A (Authoritative) |
High |
Verified sources |
| B (Operational) |
Medium |
Working knowledge |
| C (Support) |
Low |
Inferred, speculative |
Secrets Handling
- Config:
${ENV_VAR} syntax
- Redaction: Automatic in
repr()
- Sensitive keys: "key", "token", "secret", "password"
Configuration System
Resolution Order (Highest First)
- Environment variables (
ZETTELFORGE_*)
config.yaml (working directory)
config.yaml (project root)
config.default.yaml (reference)
- Hardcoded defaults
Key Sections
| Section |
Purpose |
| storage |
Data directory (~/.amem) |
| backend |
sqlite or typedb |
| embedding |
Provider, model, dimensions |
| llm |
Provider, model, temperature |
| retrieval |
default_k, similarity_threshold |
| synthesis |
max_context_tokens, tier_filter |
| governance |
enabled, min_content_length |
API Interfaces
Python Public API (24 items)
Core classes:
- MemoryManager, MemoryNote
- VectorRetriever, GraphRetriever, BlendedRetriever
- KnowledgeGraph, SynthesisGenerator
- IntentClassifier, QueryIntent
zettelforge_remember
zettelforge_recall
zettelforge_synthesize
zettelforge_entity
zettelforge_graph
zettelforge_stats
CLI
zettelforge demo # Interactive CTI demo
zettelforge version # Show version
Testing Infrastructure
Test Suite
| Type |
Count |
| Unit tests |
180 |
| Integration tests |
26 |
| Test files |
43 |
Key Test Areas
- Retrieval (blended, vector, graph)
- Storage backends (SQLite, JSONL)
- Entity extraction
- Governance validation
- MCP server
- Detection rules (Sigma, YARA)
Known Limitations
- No built-in authentication
- No per-user isolation
- No encryption at rest
- No REST API
Architectural
- JSONL KG lacks TypeDB inference
- Embedding cache has no TTL
- Token estimation is naive (chars/4)
- Supersession logic aggressive on conversational data
See Also