# Multilingual / Low-Resource Ledger — Purpose

## Threat model (one paragraph)

Demonstrate a frontier multilingual capability claim, made for a system
marketed as multilingual or low-resource competent (NLLB-200, Aya-Expanse /
Cohere Aya Vision, Meta Llama 3 / 3.1 / 4 multilingual variants, Anthropic
Claude multilingual evals, Gemini multilingual, Qwen 2.5/3 multilingual,
Mistral Saba / Mistral Apertus, Sea Mistral, NeMo Megatron multilingual,
MaLA-500, Cendol, Crosslingual-Generalist), that survives six closure audits
on the 2024–2026 corpus: **(1) low-resource-language sample-density audit,
(2) cross-script generalization (Latin / CJK / Arabic / Cyrillic /
Devanagari / other Brahmic / Tifinagh), (3) translation-vs-generation
decoupling, (4) dialect-and-register preservation, (5)
post-training-language-drift audit, (6) held-out language benchmark
construction (Flores-101 → Flores-200 → Flores-Plus refresh).** A clean
trigger also requires independent third-party verification (Stanford
HELM-Multilingual / Common Voice / MasakhaneNLP / SEACrowd / METR / AISI)
within 6 months.
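
The clean-trigger rule above can be sketched as a small predicate. This is only an illustration under stated assumptions: the `Claim` record, the audit identifiers, and the reading of "within 6 months" as 183 days counted from the claim date are all hypothetical and are not defined by this ledger.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import FrozenSet, Optional

# Hypothetical identifiers for the six closure audits listed above.
AUDITS = (
    "sample_density",
    "cross_script",
    "translation_vs_generation",
    "dialect_register",
    "post_training_drift",
    "held_out_benchmark",
)

# Assumption: "within 6 months" is approximated as 183 days from the claim date.
VERIFICATION_WINDOW = timedelta(days=183)


@dataclass(frozen=True)
class Claim:
    claimed_on: date
    audits_passed: FrozenSet[str]
    # Date of independent third-party verification, if any.
    verified_on: Optional[date] = None


def is_clean_trigger(claim: Claim) -> bool:
    """True only if all six audits pass and verification lands in the window."""
    if not set(AUDITS) <= claim.audits_passed:
        return False
    if claim.verified_on is None:
        return False
    delay = claim.verified_on - claim.claimed_on
    return timedelta(0) <= delay <= VERIFICATION_WINDOW
```

A claim that passes five of the six audits, or whose verification arrives after the window closes, is not a clean trigger under this sketch.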

## Bridge-test specifics (cross_ledger_bridges connection)

The `cross_ledger_bridges` meta-aiwiki predicts that bridge B2 (closure
cycle) is "two-speed" in Multilingual (marked unclear): vendor claims about
low-resource languages have a longer half-life than claims about English.
This ledger tests that prediction.

## Empty-space hypothesis (predeclared)

We predict that no 2024–2026 paper cleanly triggers Bills 4, 7, or 10:

- **Bill 4 ★** — Low-resource language deep-learning parity. A language
  with ≤500K sentences reaches ≥80% of high-resource performance. Predicted
  empty.
- **Bill 7 ★** — Cross-script generalization without script-specific
  fine-tuning. The same policy passes Latin + CJK + Arabic + Devanagari +
  other Brahmic scripts with a ≤10pp absolute gap. Predicted empty.
- **Bill 10 ★** — Universal multilingual coverage at frontier scale.
  A vendor frontier model scores above 60 BLEU (0–100 scale) on ≥150 of the
  200 Flores languages. Predicted empty.
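
The three thresholds above can be written as simple checks. This is a minimal sketch, not the ledger's scoring code: the function names, the input shapes, the reading of "gap" as best-minus-worst score across script groups, and the 0–100 BLEU scale are assumptions made for illustration. Bill 7's per-script pass criterion is not specified here, so only the gap is checked.

```python
from typing import Mapping

# Script groups named in Bill 7 (keys are hypothetical labels).
BILL7_SCRIPTS = frozenset({"Latin", "CJK", "Arabic", "Devanagari", "Brahmic"})


def bill4_triggered(low_score: float, high_score: float,
                    corpus_sentences: int) -> bool:
    """Language with <=500K sentences reaches >=80% of high-resource score."""
    if corpus_sentences > 500_000 or high_score <= 0:
        return False
    return low_score / high_score >= 0.80


def bill7_triggered(score_by_script: Mapping[str, float]) -> bool:
    """All required script groups are scored and best-worst gap is <=10pp."""
    if not BILL7_SCRIPTS <= set(score_by_script):
        return False
    scores = [score_by_script[s] for s in BILL7_SCRIPTS]
    return max(scores) - min(scores) <= 10.0


def bill10_triggered(bleu_by_language: Mapping[str, float],
                     min_languages: int = 150,
                     min_bleu: float = 60.0) -> bool:
    """At least min_languages languages score strictly above min_bleu."""
    passing = sum(1 for bleu in bleu_by_language.values() if bleu > min_bleu)
    return passing >= min_languages
```

For instance, a low-resource score of 42.0 against a high-resource score of 50.0 is a ratio of 0.84, which clears the Bill 4 bar only if the language's corpus is at or below 500K sentences.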

## Status

Stage 1 (SCOPE).
