# RAG / Retrieval-Augmented Ledger — Purpose

## Threat model

Demonstrate a frontier RAG (Retrieval-Augmented Generation) capability
claim on a system marketed as offering retrieval-augmented frontier
reasoning (Anthropic Claude with retrieval, OpenAI Assistants v2, Google
NotebookLM, Perplexity Pro / Sonar, LangChain agents, LlamaIndex, Haystack,
You.com, Phind, Glean). The claim must survive six closure audits on the
2024–2026 corpus: **(1) retrieval-vs-generation decoupling, (2)
citation-faithfulness audit, (3) retrieval-corpus-contamination, (4)
query-decomposition audit, (5) multi-document synthesis generalization,
(6) held-out-corpus construction (BEIR refresh / MTEB rolling).**
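
The "survives six closure audits" condition can be read as a conjunction: a claim stands only if every audit passes. A minimal sketch of that bookkeeping follows, assuming each audit reduces to a boolean outcome. The names `Audit`, `failed_audits`, and `survives` are hypothetical and not part of any existing ledger tooling.

```python
from enum import Enum


class Audit(Enum):
    """The six closure audits, numbered as in the text above."""
    RETRIEVAL_GENERATION_DECOUPLING = 1
    CITATION_FAITHFULNESS = 2
    RETRIEVAL_CORPUS_CONTAMINATION = 3
    QUERY_DECOMPOSITION = 4
    MULTI_DOC_SYNTHESIS_GENERALIZATION = 5
    HELD_OUT_CORPUS_CONSTRUCTION = 6


def failed_audits(results: dict) -> list:
    """Return audits that failed or were never run, in ledger order.

    Assumption: a missing audit counts as a failure, so a claim
    cannot survive by omission.
    """
    return [audit for audit in Audit if not results.get(audit, False)]


def survives(results: dict) -> bool:
    """A claim survives only if all six audits passed."""
    return not failed_audits(results)
```

Treating an unrun audit as a failure is a design choice made for this sketch; a ledger that distinguishes "not yet audited" from "failed" would need a three-valued result instead.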

## Bridge-test specifics (cross_ledger_bridges connection)

The `cross_ledger_bridges` meta-aiwiki predicts that **B7 will INVERT in RAG**:
Western open-source RAG frameworks (LangChain, LlamaIndex, Haystack) dominate,
and Chinese vendors do not ship comparable frameworks. **This ledger tests that
prediction.** If the RAG ledger's analog of disclosure-inversion is sign-flipped
relative to the LLM tier, B7 is rescoped or refalsified.

## Empty-space hypothesis (predeclared)

We predict that no 2024–2026 paper cleanly triggers Bills 5, 8, or 11:

- **Bill 5 ★** — Causally-faithful citation mechanism. Retrieved citation
  causally generates the answer (not post-hoc-attached). Direct cousin to
  causal-mechanism bridges. Predicted empty.
- **Bill 8 ★** — Cross-corpus generalization. RAG system trained on one
  corpus transfers to ≥2 distinct held-out corpora with ≤15pp drop.
  Predicted empty.
- **Bill 11 ★** — Universal multi-document synthesis. Frontier RAG passes
  all 5 sub-tasks {single-doc QA, multi-doc synthesis, contradiction
  resolution, citation precision, factual recall at scale} above clean
  threshold. Predicted empty.
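
The numeric criteria in Bills 8 and 11 can be sketched as simple predicates. This is an illustrative reading under stated assumptions, not the ledger's scoring code: scores for Bill 8 are taken to be percentages so that a difference is in percentage points, the "clean threshold" for Bill 11 is left as a caller-supplied parameter because the text does not give its value, and all function and key names are hypothetical.

```python
from typing import Mapping

# The five Bill 11 sub-tasks, as listed above (key names are hypothetical).
SUBTASKS = (
    "single_doc_qa",
    "multi_doc_synthesis",
    "contradiction_resolution",
    "citation_precision",
    "factual_recall_at_scale",
)


def passes_bill_8(
    source_score: float,
    held_out_scores: Mapping[str, float],
    max_drop_pp: float = 15.0,
    min_corpora: int = 2,
) -> bool:
    """Bill 8: transfer to >=2 distinct held-out corpora with <=15pp drop.

    Scores are percentages (0-100); each key of `held_out_scores` names
    one distinct held-out corpus.
    """
    transferred = [
        name
        for name, score in held_out_scores.items()
        if source_score - score <= max_drop_pp
    ]
    return len(transferred) >= min_corpora


def passes_bill_11(scores: Mapping[str, float], threshold: float) -> bool:
    """Bill 11: every one of the five sub-tasks scores above the threshold.

    A missing sub-task counts as a failure.
    """
    return all(
        scores.get(task, float("-inf")) > threshold for task in SUBTASKS
    )
```

For example, a system scoring 80.0 on its training corpus and 70.0 and 66.0 on two held-out corpora satisfies the Bill 8 predicate (drops of 10 and 14 pp), while a held-out score of 60.0 on the second corpus (a 20 pp drop) does not.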

## Status

Stage 1 (SCOPE).
