Configure YARA rule ingestion

ZettelForge parses YARA files through a plyara-backed pipeline that validates metadata against the CCCS schema, extracts entity relations (ATT&CK techniques, CVE references, threat actor tags), and stores each rule as a MemoryNote. Ingestion is idempotent, and you can run it with a single CLI command or a single Python function call.

Prerequisites

ZettelForge 2.7.0 or newer installed (pip install zettelforge)
A configured ZettelForge instance (see Quickstart)
Embedding model available (downloads automatically on first use — see Configuration reference)
No LLM provider required: the rule explainer is not invoked during YARA ingestion in v2.7.0

Step 1: Verify your rule file

Preview what ZettelForge will extract from a .yar or .yara file before writing anything to storage:

python3 -m zettelforge.yara.ingest rules/SILENT_BANKER_LOADER.yar --dry-run

Output:

[warn] SILENT_BANKER_LOADER  (SILENT_BANKER_LOADER.yar)  relations=2  mitre=T1218
    warn: missing required CCCS field: id
    warn: missing required CCCS field: fingerprint
    warn: missing required CCCS field: version
    warn: missing required CCCS field: modified
    warn: missing required CCCS field: status
    warn: missing required CCCS field: sharing
    warn: missing required CCCS field: source
    warn: missing required CCCS field: author

The [warn] prefix means the rule is accepted under the default warn tier: it is ingested despite the missing CCCS fields. Each warning names one field to add for full CCCS compliance. relations=2 means two entity links will be created for this rule.

For machine-readable output, add --json:

python3 -m zettelforge.yara.ingest rules/SILENT_BANKER_LOADER.yar --dry-run --json

Output:

[{"file": "rules/SILENT_BANKER_LOADER.yar", "rule_name": "SILENT_BANKER_LOADER",
  "rule_id": "yara_c58e88dbff0e9299", "cccs_tier": "warn", "category": "TECHNIQUE",
  "n_relations": 2, "mitre_att": ["T1218"],
  "warnings": ["missing required CCCS field: id", "missing required CCCS field: fingerprint",
               "missing required CCCS field: version", "missing required CCCS field: modified",
               "missing required CCCS field: status", "missing required CCCS field: sharing",
               "missing required CCCS field: source", "missing required CCCS field: author"],
  "errors": []}]

Step 2: Choose a validation tier

The --tier flag controls how strictly the CCCS schema is enforced:

Tier	Behavior	Use when
`warn` (default)	Accepts all parseable rules; emits warnings for missing CCCS fields	Ingesting third-party or community rules
`strict`	Rejects rules with any missing required CCCS field; note is not created	Your internal rules must meet the CCCS standard
`non_cccs`	Skips metadata validation entirely; accepts any parseable rule	Generic YARA files with no CCCS metadata

The 10 required CCCS fields are: id, fingerprint, version, modified, status, sharing, source, author, description, and category.
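
The three tiers can be summarised as a small decision function. The sketch below is an illustration of the behavior described in the table, not ZettelForge's implementation; the name `tier_decision` and the message format (copied from the dry-run output in Step 1) are assumptions.

```python
# Required CCCS fields as listed in this guide.
REQUIRED = ("id", "fingerprint", "version", "modified", "status",
            "sharing", "source", "author", "description", "category")


def tier_decision(meta: dict, tier: str = "warn") -> tuple[bool, list[str]]:
    """Hypothetical helper: return (accepted, messages) for one rule's metadata."""
    if tier == "non_cccs":
        return True, []  # metadata validation is skipped entirely
    missing = [f"missing required CCCS field: {name}"
               for name in REQUIRED if name not in meta]
    if tier == "strict":
        return not missing, missing  # any missing field rejects the rule
    if tier == "warn":
        return True, missing  # accepted, missing fields reported as warnings
    raise ValueError(f"unknown tier: {tier}")


meta = {"category": "TECHNIQUE", "description": "Detects PE injection"}
for tier in ("warn", "strict", "non_cccs"):
    accepted, messages = tier_decision(meta, tier)
    print(tier, accepted, len(messages))
# prints:
# warn True 8
# strict False 8
# non_cccs True 0
```

For real validation, use validate_metadata from zettelforge.yara.cccs_metadata as shown in Step 6.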

Step 3: Ingest a single file

python3 -m zettelforge.yara.ingest rules/SILENT_BANKER_LOADER.yar

# Enforce CCCS compliance — rejects rules missing required fields
python3 -m zettelforge.yara.ingest rules/SILENT_BANKER_LOADER.yar --tier strict

Exit codes:

0: all rules ingested successfully
1: one or more rules were rejected (strict tier) or failed to parse
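
When the CLI is called from a script or CI job, the exit code is the simplest signal to act on. The following is a minimal sketch that uses only the flags shown in this guide; the helper names `build_command`, `describe_exit`, and `run_ingest` are illustrative and not part of ZettelForge.

```python
import subprocess
import sys


def build_command(path: str, tier: str = "warn", dry_run: bool = False) -> list[str]:
    """Assemble the ingest CLI invocation for the current Python interpreter."""
    cmd = [sys.executable, "-m", "zettelforge.yara.ingest", path, "--tier", tier]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def describe_exit(code: int) -> str:
    """Translate the two documented exit codes into a short status string."""
    if code == 0:
        return "all rules ingested"
    if code == 1:
        return "one or more rules rejected or unparseable"
    return f"unexpected exit code {code}"  # not documented in this guide


def run_ingest(path: str, tier: str = "warn") -> int:
    """Run the CLI, print a status line, and return the exit code."""
    completed = subprocess.run(build_command(path, tier))
    print(describe_exit(completed.returncode))
    return completed.returncode
```

A CI step could call `sys.exit(run_ingest("./rules/", tier="strict"))` so that the job fails whenever a rule is rejected.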

Step 4: Ingest a directory

python3 -m zettelforge.yara.ingest ./rules/

The command walks ./rules/ recursively for *.yar and *.yara files. Symlinks and files outside the root path are skipped for security. Files larger than 1 MB are rejected with a parse error.
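
To predict which files a directory ingest will pick up, the walk can be approximated with the standard library. This is a sketch of the rules stated above, not the actual walker: the function name `discover` is hypothetical, and reading "1 MB" as 1 MiB is an assumption.

```python
import os
import tempfile
from pathlib import Path

MAX_BYTES = 1024 * 1024  # assumption: "1 MB" interpreted as 1 MiB
SUFFIXES = {".yar", ".yara"}


def discover(root: str) -> tuple[list[Path], list[Path]]:
    """Return (candidates, oversized) rule files found under root."""
    root_path = Path(root).resolve()
    candidates, oversized = [], []
    # followlinks=False keeps the walk from descending into symlinked directories.
    for dirpath, _dirnames, filenames in os.walk(root_path, followlinks=False):
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix not in SUFFIXES:
                continue  # not a YARA rule file
            if path.is_symlink():
                continue  # symlinked files are skipped
            if not path.resolve().is_relative_to(root_path):
                continue  # resolved location falls outside the root
            if path.stat().st_size > MAX_BYTES:
                oversized.append(path)  # the real pipeline reports a parse error
            else:
                candidates.append(path)
    return sorted(candidates), sorted(oversized)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    (base / "a.yar").write_text("rule a { condition: true }")
    (base / "sub" / "b.yara").write_text("rule b { condition: true }")
    (base / "notes.txt").write_text("not a rule")
    (base / "big.yar").write_bytes(b"/" * (MAX_BYTES + 1))
    found, too_big = discover(tmp)
    found_names = [p.name for p in found]
    oversized_names = [p.name for p in too_big]

print(found_names, oversized_names)
```

`Path.is_relative_to` requires Python 3.9 or newer.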

Step 5: Ingest from Python

Use ingest_rule for a single rule or ingest_rules_dir for a directory:

from zettelforge.memory import MemoryManager
from zettelforge.yara import ingest_rule, ingest_rules_dir

mm = MemoryManager()

# Ingest a single rule file
note, relations = ingest_rule("rules/SILENT_BANKER_LOADER.yar", mm, tier="warn")
if note:
    print(f"Created note: {note.id}")
    print(f"Relations: {len(relations)}")

# Ingest a directory
result = ingest_rules_dir("./rules/", mm, tier="strict")
print(f"Ingested: {result['ingested']}, Skipped: {result['skipped']}")
if result["errors"]:
    for err in result["errors"]:
        print(f"Error: {err}")

ingest_rule returns (None, []) when tier="strict" and the rule fails CCCS validation. ingest_rules_dir always returns a dict with ingested, skipped, and errors keys.

Ingest is idempotent. Running the same file twice produces the same note ID — the second call finds the existing note by source_ref and returns it without writing a duplicate.

Step 6: Validate metadata programmatically

To check a rule's metadata before ingesting, import from the submodule:

from zettelforge.yara.cccs_metadata import REQUIRED_FIELDS, validate_metadata

# Check which fields your rule metadata has
rule_meta = {"category": "TECHNIQUE", "description": "Detects PE injection"}
missing = [f for f in REQUIRED_FIELDS if f not in rule_meta]
print(f"Missing CCCS fields: {missing}")

# Validate against a specific tier
result = validate_metadata(rule_meta, tier="strict")
print(f"Accepted: {result.accepted}")
print(f"Errors: {result.errors}")

REQUIRED_FIELDS is not exported from the top-level zettelforge.yara package. Import it from zettelforge.yara.cccs_metadata.

How ZettelForge assigns rule IDs

A rule gets its ID from the CCCS id field when present. Without it, ZettelForge generates yara_{content_hash[:16]}, where content_hash is the SHA-256 of the rule's content, so the ID ends in the first 16 hex characters of that hash. The generated ID is stable across re-ingests as long as the rule body does not change.

The source_ref stored on each note follows the pattern yara:{rule_id}:{content_sha256[:12]}. This is what the idempotency check queries before writing a new note.
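
The two patterns above, and the idempotency check built on them, can be sketched with hashlib. This is an illustration under stated assumptions: the exact bytes ZettelForge hashes are not specified here (the sketch hashes the UTF-8 rule text), and `derive_ids`, `ingest_once`, and the in-memory `notes` dict are hypothetical stand-ins for the real note store.

```python
import hashlib
from typing import Optional


def derive_ids(rule_text: str, cccs_id: Optional[str] = None) -> tuple[str, str]:
    """Return (rule_id, source_ref) following the patterns described above."""
    # Assumption: the hash input is the rule text encoded as UTF-8.
    digest = hashlib.sha256(rule_text.encode("utf-8")).hexdigest()
    rule_id = cccs_id or f"yara_{digest[:16]}"
    return rule_id, f"yara:{rule_id}:{digest[:12]}"


notes: dict[str, str] = {}  # stand-in for the note store, keyed by source_ref


def ingest_once(rule_text: str) -> str:
    """Store the rule on first sight; return the existing ID on later calls."""
    rule_id, source_ref = derive_ids(rule_text)
    return notes.setdefault(source_ref, rule_id)


rule = "rule demo { condition: true }"
first, second = ingest_once(rule), ingest_once(rule)
print(first == second, len(notes))  # prints: True 1
```

Because both the generated ID and the source_ref depend on the content hash, editing the rule body yields a new source_ref and therefore a new note.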

How entity relations are created

ZettelForge extracts relations from YARA tags and metadata:

Relation type	Triggered by	Example
`detects → AttackPattern`	`mitre_att` metadata key or `attack.T####` tag	`T1218`
`tagged_with → YaraTag`	Any YARA tag matching category tokens	`loader:memorymodule`
`attributed_to → ThreatActor`	`actor` metadata key	`APT28`
`references_cve → Vulnerability`	`cve_YYYY_NNNNN` tag	`cve_2024_1234`

All entity relations use edge_type='detection' and source='yara_ingest'.
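
As a rough illustration of the table above, the sketch below derives relations from a rule's tags and metadata. It is not ZettelForge's extractor: the function name, the regular expressions, the output dict shape, and the treatment of mitre_att as a single string are all assumptions. The tagged_with relation is left out because the category tokens it matches are not listed in this guide.

```python
import re

# Assumed tag formats, based on the "Triggered by" column above.
ATTACK_TAG = re.compile(r"^attack\.(T\d{4}(?:\.\d{3})?)$", re.IGNORECASE)
CVE_TAG = re.compile(r"^cve_\d{4}_\d{4,}$", re.IGNORECASE)


def extract_relations(tags: list[str], meta: dict) -> list[dict]:
    """Hypothetical helper: build relation records from tags and metadata."""
    found = []  # (relation, target_type, target) triples, in discovery order

    if "mitre_att" in meta:
        found.append(("detects", "AttackPattern", str(meta["mitre_att"])))
    for tag in tags:
        attack = ATTACK_TAG.match(tag)
        if attack:
            found.append(("detects", "AttackPattern", attack.group(1).upper()))
        elif CVE_TAG.match(tag):
            # Target kept as the raw tag; any normalization is unspecified here.
            found.append(("references_cve", "Vulnerability", tag))
    if "actor" in meta:
        found.append(("attributed_to", "ThreatActor", str(meta["actor"])))

    unique = dict.fromkeys(found)  # drop duplicates while preserving order
    return [
        {"relation": rel, "target_type": kind, "target": target,
         "edge_type": "detection", "source": "yara_ingest"}
        for rel, kind, target in unique
    ]


relations = extract_relations(
    tags=["attack.T1218", "cve_2024_1234"],
    meta={"mitre_att": "T1218", "actor": "APT28"},
)
for rel in relations:
    print(rel["relation"], rel["target_type"], rel["target"])
```

In this example the technique appears in both the metadata and a tag, so the duplicate is collapsed and three relations remain.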

Troubleshooting

Parse error on a large file. Rule files over 1 MB are rejected. Split multi-rule collections into smaller files.

ImportError: cannot import name 'REQUIRED_FIELDS' from 'zettelforge.yara'. Import from the submodule: from zettelforge.yara.cccs_metadata import REQUIRED_FIELDS.

ValueError on mm=None. ingest_rule and ingest_rules_dir require a MemoryManager instance. Pass one explicitly.

Strict tier rejects all rules. Run --tier warn --dry-run first to see which CCCS fields are missing. Add them to your rule metadata, then switch back to strict.
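
When many rules are involved, the --dry-run --json report is easier to work through with a script. The sketch below parses a trimmed copy of the report from Step 1 and lists the missing fields per rule; the helper name `missing_fields` is illustrative, and the warning prefix is taken from the sample output rather than from a documented contract.

```python
import json

# Trimmed copy of the --dry-run --json report shown in Step 1.
REPORT = """
[{"file": "rules/SILENT_BANKER_LOADER.yar", "rule_name": "SILENT_BANKER_LOADER",
  "rule_id": "yara_c58e88dbff0e9299", "cccs_tier": "warn", "category": "TECHNIQUE",
  "n_relations": 2, "mitre_att": ["T1218"],
  "warnings": ["missing required CCCS field: id",
               "missing required CCCS field: fingerprint"],
  "errors": []}]
"""

# Assumption: this prefix is stable; it is copied from the sample output.
PREFIX = "missing required CCCS field: "


def missing_fields(report_json: str) -> dict[str, list[str]]:
    """Map each rule name to the CCCS fields its warnings report as missing."""
    result = {}
    for entry in json.loads(report_json):
        result[entry["rule_name"]] = [
            warning[len(PREFIX):]
            for warning in entry["warnings"]
            if warning.startswith(PREFIX)
        ]
    return result


print(missing_fields(REPORT))
# prints: {'SILENT_BANKER_LOADER': ['id', 'fingerprint']}
```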

Use detection rules — query and recall ingested rules
Configure Sigma ingestion — Sigma rule pipeline
YARA schema reference — full YaraRule entity schema and CCCS field definitions