Integrate with LangChain

ZettelForgeRetriever wraps MemoryManager.recall() and converts ZettelForge MemoryNote objects into LangChain Document objects. Drop it into any LangChain RAG pipeline as a standard BaseRetriever.

Prerequisites

ZettelForge installed: pip install zettelforge
LangChain packages installed: pip install langchain-core langchain-community
A populated ZettelForge store (use mm.remember() or mm.remember_with_extraction())

Steps

1. Create a MemoryManager and seed it

from zettelforge import MemoryManager

mm = MemoryManager()

mm.remember(
    "APT28 (Fancy Bear) uses spear-phishing emails with credential-harvesting links. "
    "They have been observed using domains mimicking NATO and defense contractors.",
    domain="security_ops",
)
mm.remember(
    "CVE-2024-3094: XZ Utils backdoor in versions 5.6.0 and 5.6.1. "
    "CVSS score 10.0. Supply chain attack affecting SSH authentication.",
    domain="security_ops",
)

2. Create the retriever

from zettelforge.integrations.langchain_retriever import ZettelForgeRetriever

retriever = ZettelForgeRetriever(
    memory_manager=mm,
    k=5,                      # number of documents to return
    domain="security_ops",    # optional: filter by memory domain
    include_links=True,       # include graph-linked notes (default True)
    exclude_superseded=True,  # exclude superseded notes (default True)
)

3. Retrieve documents directly

The retriever implements the standard BaseRetriever interface. Call invoke() with a query string:

docs = retriever.invoke("What techniques does APT28 use?")

for doc in docs:
    print(doc.page_content)
    print(f"  tier={doc.metadata['tier']} confidence={doc.metadata['confidence']}")
    print(f"  entities={doc.metadata['entities']}")

Each returned Document carries:

Metadata field	Type	Description
`note_id`	`str`	ZettelForge internal note ID
`source_type`	`str`	Source type (`report`, `conversation`, `sigma_rule`, …)
`source_ref`	`str`	Source reference string
`context`	`str`	Semantic context summary
`keywords`	`list[str]`	Extracted keywords
`tags`	`list[str]`	Semantic tags
`entities`	`list[str]`	Extracted entity values
`domain`	`str`	Memory domain
`tier`	`str`	Epistemic tier (A, B, or C)
`confidence`	`float`	Note confidence (0.0–1.0)
`importance`	`int`	Importance rating (1–10)
`created_at`	`str`	ISO 8601 creation timestamp
`updated_at`	`str`	ISO 8601 last-updated timestamp
`cvss_v3_score`	`float | None`	CVSS v3 score (CVE notes only)
`cisa_kev`	`bool | None`	CISA KEV flag (CVE notes only)
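
These metadata fields make post-retrieval filtering straightforward. The sketch below keeps only documents from selected tiers that meet a confidence floor. `Doc` is a standard-library stand-in with the same `page_content` and `metadata` attributes as a LangChain Document, so the snippet runs without any packages installed. `filter_docs` and its default thresholds are illustrative and not part of ZettelForge.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for langchain_core.documents.Document (same two attributes)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def filter_docs(docs, min_confidence=0.7, tiers=("A", "B")):
    """Keep documents whose tier is allowed and whose confidence meets the floor."""
    return [
        d for d in docs
        if d.metadata.get("tier") in tiers
        and d.metadata.get("confidence", 0.0) >= min_confidence
    ]


# Made-up sample data for illustration only.
sample = [
    Doc("APT28 uses spear-phishing.", {"tier": "A", "confidence": 0.9}),
    Doc("Unverified forum rumor.", {"tier": "C", "confidence": 0.4}),
    Doc("XZ Utils backdoor.", {"tier": "B", "confidence": 0.65}),
]
kept = filter_docs(sample)
print([d.page_content for d in kept])
```

With a real retriever, pass the result of `retriever.invoke(query)` in place of `sample`.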

4. Build a RAG chain with LCEL

Requires a configured LLM provider

This step requires a running Ollama instance or another LLM provider. Run ollama pull qwen2.5:3b (the ZettelForge default local model) or substitute any model your Ollama instance has available.

Use LangChain Expression Language (LCEL) to wire the retriever into a question-answering chain:

from langchain_community.chat_models import ChatOllama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

llm = ChatOllama(model="qwen2.5:3b", temperature=0.1)

prompt = ChatPromptTemplate.from_template("""
You are a CTI analyst reviewing intelligence from past investigations.
Use the following context to answer the question.
If you don't know the answer, say so — do not fabricate information.

Context:
{context}

Question: {question}

Answer with specific entities, techniques, and confidence levels where possible:
""")

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

answer = chain.invoke("What is known about APT28's spear-phishing techniques?")
print(answer)
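
The prompt asks for confidence levels, but `format_docs` forwards only `page_content`, so the model never sees the tier or confidence stored in each document's metadata. One possible variant, sketched below, prefixes every chunk with a short header built from that metadata. `format_docs_with_metadata` is a hypothetical helper, the `SimpleNamespace` objects stand in for LangChain Documents, and the sample values (including `report-1`) are made up.

```python
from types import SimpleNamespace


def format_docs_with_metadata(docs):
    """Join documents, prefixing each with its tier, confidence, and source."""
    blocks = []
    for doc in docs:
        meta = doc.metadata
        header = (
            f"[tier={meta.get('tier', '?')} "
            f"confidence={meta.get('confidence', '?')} "
            f"source={meta.get('source_ref', '?')}]"
        )
        blocks.append(f"{header}\n{doc.page_content}")
    return "\n\n".join(blocks)


# Stand-in objects with the same attributes as LangChain Documents.
docs = [
    SimpleNamespace(
        page_content="APT28 uses spear-phishing.",
        metadata={"tier": "A", "confidence": 0.9, "source_ref": "report-1"},
    ),
    SimpleNamespace(page_content="XZ Utils backdoor.", metadata={}),
]
context = format_docs_with_metadata(docs)
print(context)
```

To try it in the chain, replace `retriever | format_docs` with `retriever | format_docs_with_metadata`; missing metadata keys fall back to `?` rather than raising.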

5. Add conversation history

To carry prior conversation turns, keep a list of messages, retrieve context for the new question yourself, and send the history together with the new context-bearing message to the model:

from langchain_core.messages import HumanMessage, AIMessage

# Reuses `retriever` from step 2 and `llm` from step 4.
chat_history = []

question = "What CVEs were in the XZ Utils incident?"
docs = retriever.invoke(question)
context = "\n\n".join(doc.page_content for doc in docs)

# Send prior turns plus the new question with its retrieved context.
messages = chat_history + [HumanMessage(content=f"Context:\n{context}\n\nQuestion: {question}")]
response = llm.invoke(messages)

# Store only the bare question and the answer, so retrieved context is not replayed on later turns.
chat_history += [HumanMessage(content=question), AIMessage(content=response.content)]
print(response.content)
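
The `chat_history` list grows by two messages per turn with no upper bound, which can overflow the context window of a small local model. A minimal sketch of one way to cap it follows. `trim_history` is a hypothetical helper, and it assumes the list strictly alternates human and AI messages as in this step. Plain tuples stand in for message objects so the snippet runs on its own.

```python
def trim_history(history, max_turns=4):
    """Return the most recent `max_turns` question/answer pairs.

    Assumes the list alternates human and AI messages, two per turn.
    """
    if max_turns <= 0:
        # Guard: history[-0:] would return the whole list.
        return []
    return history[-2 * max_turns:]


# Tuples stand in for HumanMessage / AIMessage objects.
history = []
for i in range(6):
    history += [("human", f"question {i}"), ("ai", f"answer {i}")]

recent = trim_history(history, max_turns=2)
print(recent)
```

In the step above, this would mean building `messages` from `trim_history(chat_history)` instead of the full list.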

Under the hood

ZettelForgeRetriever._get_relevant_documents() calls MemoryManager.recall() with your configured parameters. recall() runs the full blended retrieval pipeline — vector similarity search over LanceDB, knowledge graph expansion over SQLite, intent classification, and result reranking — then converts each MemoryNote into a Document.
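
The final conversion step can be pictured roughly as follows. This is only a sketch of the idea, not ZettelForge's actual code: `Note` is a hypothetical, simplified stand-in for `MemoryNote` (its attribute names are assumptions), `Doc` stands in for the LangChain Document class, and only a handful of the metadata fields listed in step 3 are shown.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    """Hypothetical, simplified stand-in for a ZettelForge MemoryNote."""
    note_id: str
    content: str
    domain: str
    tier: str
    confidence: float
    entities: list = field(default_factory=list)


@dataclass
class Doc:
    """Stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict


def note_to_doc(note):
    """Copy the note text into page_content and the other fields into metadata."""
    return Doc(
        page_content=note.content,
        metadata={
            "note_id": note.note_id,
            "domain": note.domain,
            "tier": note.tier,
            "confidence": note.confidence,
            "entities": list(note.entities),  # copy so the note is not aliased
        },
    )


# Made-up sample note for illustration.
note = Note("n-001", "APT28 uses spear-phishing.", "security_ops", "A", 0.9, ["APT28"])
doc = note_to_doc(note)
print(doc.metadata["note_id"], doc.metadata["entities"])
```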

The retriever uses Pydantic ConfigDict(arbitrary_types_allowed=True) so MemoryManager is accepted as a field without schema errors.

LangChain version compatibility

These examples use LangChain 1.x (langchain-core 1.x, langchain-community 0.4+). The RetrievalQA and ConversationalRetrievalChain classes from older LangChain 0.x are not available in LangChain 1.x. Use LCEL chains as shown above.