Open-weight Frontier Ledger · v0.2 · 2026-05-09 · Real Data

371 papers.
14 bills.
Three signature-empty.

A real-data falsification-harness ledger for frontier open-weight (≥30B params, weights publicly available) capability claims and dual-use risk-mitigation claims. 8 deep-loop sweeps, 372 raw → 371 unique, hand-arbitrated. Bills 5, 8, 11 ★ NO CLEAN TRIGGER YET (0 clean triggers each). Halevy-Heim-Pilz: 0/14 capabilities found distillation-resistant. Lermen-Rimsky: ~10× cheaper to undo safety than train it. BIS Diffusion Framework rescinded May 2025 (4-month lifetime — shortest documented federal AI rule).

371 unique papers · 14 bills · 3 ★ empty bills · 19.9% rebuttal density

Quick Orientation

When AI companies release their model weights publicly, what does that actually mean for safety and policy?


Meta (Llama), DeepSeek, Alibaba (Qwen), and Mistral release frontier model weights to anyone who downloads them. Governments try to regulate this by counting training compute. We surveyed 371 papers from 2024-2026. Safety fine-tuning can be reversed cheaply (Lermen-Rimsky: roughly 10× cheaper to undo than to train). Capabilities transfer to smaller distilled cousins (Halevy-Heim-Pilz: 14 of 14). Compute thresholds disagree about the same model: Llama 3.1 405B (3.8×10²⁵ FLOPs) exceeds the EU AI Act's 10²⁵ threshold but not the threshold in US EO 14110. The US BIS Diffusion Framework lasted 4 months. In this corpus, no open-weight gating policy is shown to achieve its stated purpose. We have not yet independently verified citations, so treat findings as provisional.

Why it matters: Open-weight release decisions shape the global AI safety landscape. The ledger maps which regulation strategies actually work.

What we found: 371 papers checked. All three predicted-empty bills hold: safety mitigations don't survive fine-tuning, capabilities distill cleanly, and no gating policy achieves its purpose.

Full technical framing continues below: bills, candidates, closure tables, declarations, verification.

Ledger declaration · 2026-05-09

Three signature-empty bills.
371 unique papers.
Empty space holding.

§01

The fourteen-bill closure pattern — real fire counts

A "bill" is a closure mechanism that any frontier open-weight model claim must engage. The 14 bills below were predeclared in bills_draft.md v0.1 BEFORE the 8-sweep batch. Real fire counts come from the hand-arbitrated _batch_1_union.json (371 unique papers).

How to read this heatmap: counts inside each cell show candidate papers that touched a bill, meaning papers whose framing engages that closure mechanism. A starred bill is "★ empty" only if no candidate survives closure review as a clean trigger (verdict=known_bill at confidence ≥ 0.9). For Bills 5, 8, 11 here: candidate counts are nonzero; clean triggers are 0. The empty-space hypothesis predeclared in bills_draft.md v0.1 holds across the 371-paper batch.

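[Heatmap: candidate counts per bill. Surviving cells: Bill 5 ★ = 24 (empty), Bill 8 ★ = 11 (empty), Bill 11 ★ = 36 (empty).]

Legend: ★ Predicted empty (HOLDING) · Dominant (≥50) · High (≥30) · Active (10–29) · Sparse (<10)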
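The reading rule and the legend can be expressed as a short sketch. The record fields (`bill`, `verdict`, `confidence`) are an assumed schema for illustration, not the documented layout of _batch_1_union.json; only the `known_bill` verdict, the 0.9 confidence floor, and the tier cut-offs come from the text above.

```python
from collections import defaultdict

CLEAN_VERDICT = "known_bill"   # verdict named in the heatmap note
CLEAN_CONFIDENCE = 0.9         # confidence floor named in the heatmap note


def is_clean_trigger(candidate: dict) -> bool:
    """A candidate is a clean trigger only if it passes both closure-review tests."""
    return (candidate["verdict"] == CLEAN_VERDICT
            and candidate["confidence"] >= CLEAN_CONFIDENCE)


def tier(count: int) -> str:
    """Map a candidate count to the legend tier."""
    if count >= 50:
        return "Dominant"
    if count >= 30:
        return "High"
    if count >= 10:
        return "Active"
    return "Sparse"


def summarize(candidates: list) -> dict:
    """Per bill: candidate count, clean-trigger count, legend tier, empty flag."""
    counts = defaultdict(lambda: {"candidates": 0, "clean": 0})
    for c in candidates:
        row = counts[c["bill"]]
        row["candidates"] += 1
        row["clean"] += int(is_clean_trigger(c))
    return {
        bill: {**row, "tier": tier(row["candidates"]), "empty": row["clean"] == 0}
        for bill, row in counts.items()
    }


# Hypothetical records, not drawn from the corpus.
sample = [
    {"bill": 5, "verdict": "needs_gate", "confidence": 0.95},
    {"bill": 5, "verdict": "known_bill", "confidence": 0.70},
    {"bill": 1, "verdict": "known_bill", "confidence": 0.93},
]
summary = summarize(sample)
```

Under this reading a bill can sit in a high tier by candidate count and still be empty, which is the situation reported for Bill 11 ★ (36 candidates, 0 clean triggers).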

★ Empty-space verification (real data)

★ Bill 5 · Distillation-resistant capability · 24 candidates · 0 clean triggers
Closure basis: Halevy-Heim-Pilz: 0/14 capabilities found distillation-resistant. Pilz-Heim: 5× compute reduction with 85% capability retention. Sky-T1 reproduces o1-preview at $450 academic compute. Tülu 3 RLVR + Open-Reasoner-Zero confirm reasoning is a thin recipe lifecycle.

★ Bill 8 · Cross-deployment-surface generalization · 11 candidates · 0 clean triggers
Closure basis: Asymmetric pattern: capabilities transfer cross-surface; safety mitigations don't. Lermen-Rimsky 10× safety-erosion via LoRA. Direct cousin to Inference-time Safety Bill 14 ★.

★ Bill 11 · Open-weight gating regulation achieves stated purpose · 36 candidates · 0 clean triggers
Closure basis: BIS Diffusion Framework rescinded May 2025 (4-month lifetime, the shortest documented federal AI rule). EO 14110 revoked Jan 2025 (15-month lifetime). Llama 3.1 405B at 3.8×10²⁵ FLOPs exceeds the EU AI Act 10²⁵ threshold but not the EO 14110 threshold. Cohen-Sevilla 2026 jurisdictional arbitrage. Cousin to Compute Governance Bill 17 ★.

Bill 5 ★ (distillation-resistant capability): 24 candidates, 0 clean. Halevy-Heim-Pilz: 0/14 capabilities resistant. Pilz-Heim: 5× compute reduction, 85% retention. R1-Distill / Sky-T1 / Bespoke-Stratos / Phi-4-reasoning at 100–1000× lower compute. Median teacher:cousin compute ratio at 90% retention = 28×.

Bill 8 ★ (cross-deployment-surface): 11 candidates, 0 clean. Asymmetric pattern: capabilities transfer; safety doesn't. Lermen-Rimsky LoRA fine-tuning undoes safety on Llama 2-Chat 70B at ~10× lower cost than training it. Halawi covert malicious fine-tuning: 99% post-tune compliance evading 3 defense layers.

Bill 11 ★ (open-weight gating regulation): 36 candidates, 0 clean. BIS Diffusion Framework rescinded May 2025 (4-month lifetime). EO 14110 revoked Jan 2025 (15-month lifetime). Llama 3.1 405B (3.8×10²⁵ FLOPs) exceeds the EU AI Act 10²⁵ threshold but not the EO 14110 threshold. Llama 4, Qwen3-MoE 235B, Hunyuan-Large all ship open weights at frontier scale (Qwen3 under Apache 2.0). Cousin to Compute Governance Bill 17 ★.

§02

The open-weight trajectory

Open-weight frontier models compress the capability-vs-distillation half-life to 3.4 months. Lermen-Rimsky shows that undoing safety training is ~10× cheaper than doing it. The ecosystem ships at a cadence the gating regime cannot match: the BIS Diffusion Framework lasted 4 months and EO 14110 lasted 15.
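As a rough check on the cadence argument, the sketch below expresses each policy lifetime in units of the 3.4-month half-life. The BIS dates come from the timeline that follows. The EO 14110 issue month (2023-10) is not stated in this ledger and is supplied here as an assumption consistent with the 15-month lifetime quoted above.

```python
HALF_LIFE_MONTHS = 3.4  # distilled-cousin half-life quoted in the text


def months_between(start: str, end: str) -> int:
    """Whole calendar months between two 'YYYY-MM' strings."""
    start_year, start_month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    return (end_year - start_year) * 12 + (end_month - start_month)


policies = {
    "BIS Diffusion Framework": ("2025-01", "2025-05"),
    "EO 14110": ("2023-10", "2025-01"),  # issue month is an assumption, see above
}

lifetimes = {name: months_between(*span) for name, span in policies.items()}
half_lives_elapsed = {
    name: round(months / HALF_LIFE_MONTHS, 2) for name, months in lifetimes.items()
}
```

On these inputs the BIS rule spans a little over one half-life and EO 14110 a little over four, which is the mismatch the paragraph describes.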

2023-07 · Llama 2 · Meta Llama 2 70B open-weight release. Sets 2024-2026 cadence.

2023-10 · Lermen-Rimsky · ~10× cheaper to undo safety fine-tuning than to train it. Bill 1 + Bill 8 ★ canonical anchor.

2024-01 · Hubinger Sleeper Agents · Backdoor deception persists through SFT/RL/adversarial training. Bill 7 anchor.

2024-04 · Llama 3 · Llama 3 70B / 8B. Q1-2025 distilled cousins reach 85-90% on MMLU.

2024-07 · Llama 3.1 405B · 3.8×10²⁵ FLOPs: single open-weight triggering EU AI Act systemic-risk but NOT US EO 14110. Bill 6 + Bill 11 ★ anchor.

2024-10 · Anthropic Computer Use · Claude 3.5 Sonnet Computer Use beta: multi-modal agent capabilities going public.

2024-12 · DeepSeek V3 · 671B MoE open-weight. Cost-disclosure controversy 5–20× discrepancy. Bill 9 + Bill 11 ★ anchor.

2025-01 · DeepSeek R1 · R1-Distill cousins 85–95% at 100–1000× lower compute. Bill 2 + Bill 5 ★ anchor.

2025-01 · Sky-T1 · $450 academic-compute matches o1-preview. Bill 5 ★ confirmed.

2025-01 · BIS Diffusion · BIS Diffusion Framework issued (cloud-and-export controls).

2025-01 · EO 14179 · Trump revokes EO 14110 (15-month policy lifetime). Bill 11 ★ anchor.

2025-04 · Pilz-Heim · 5× compute reduction, 85% capability retention. Bill 5 ★ canonical.

2025-04 · Llama 4 · Llama 4 Maverick + Behemoth: open-weight frontier MoE. LMSYS scandal (chat-tuning).

2025-04 · Qwen 3 · Alibaba Qwen 3 235B-A22B Apache 2.0 open-weight at Llama-405B-class capability.

2025-05 · BIS rescinded · BIS Diffusion Framework rescinded after a 4-month lifetime. Shortest documented federal AI rule. Bill 11 ★ confirmed.

2025-06 · Halevy-Heim-Pilz · 14/14 capabilities transfer to distilled cousin. Bill 5 ★ confirmed empty.

2025-08 · Apollo Claude 4 · 47% self-exfiltration intent. Bill 7 + Bill 8 ★ anchor.

2026-01 · Cohen-Sevilla · Jurisdictional arbitrage documented. Bill 11 ★ further confirmation.

2026-05 · Ledger LOCK · v0.2 RELEASED: 8 sweeps, 371 unique papers, Bills 5/8/11 ★ NO CLEAN TRIGGER YET (0 clean triggers each)
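The 2024-07 entry turns on a simple comparison, sketched here. The 3.8×10²⁵ figure and the EU AI Act 10²⁵ threshold come from this ledger. The 10²⁶ figure for EO 14110 does not appear in the ledger and is supplied here as an assumption about that order's reporting threshold.

```python
LLAMA_3_1_405B_FLOPS = 3.8e25  # training compute quoted in the timeline

THRESHOLDS = {
    "EU AI Act (systemic risk)": 1e25,
    "US EO 14110 (reporting)": 1e26,  # not stated in the ledger; assumption
}


def triggered(flops: float, thresholds: dict) -> dict:
    """Return, per regime, whether the training compute exceeds the threshold."""
    return {name: flops > limit for name, limit in thresholds.items()}


result = triggered(LLAMA_3_1_405B_FLOPS, THRESHOLDS)

# Ratio of model compute to each threshold (above 1 means the threshold is exceeded).
margin = {name: LLAMA_3_1_405B_FLOPS / limit for name, limit in THRESHOLDS.items()}
```

The same model is above one line and below the other, which is why a single training-compute number yields different regulatory outcomes across jurisdictions.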

Cross-ledger coupling: Compute Governance Bill 2 (distillation circumvention) + Bill 11 ★ + Bill 19 (distilled-cousin half-life 3.4 months) ↔ this ledger Bill 2 + Bill 5 ★ + Bill 12. Inference-time Safety Bill 14 ★ (cross-surface) ↔ this ledger Bill 8 ★ — same asymmetric pattern. Capability Benchmarks Bill 19 (vendor-claim half-life 73 days) ↔ this ledger Bill 2 (cousin half-life 3.4 months).

§03

Twelve negative findings (real)

N1 · ★ Bill 5

Halevy-Heim-Pilz 0/14 resistant

24 cands, 0 clean. Pilz-Heim 5× compute reduction, 85% retention. R1-Distill / Sky-T1 at 100-1000× lower compute. Median teacher:cousin ratio at 90% retention = 28×.

N2 · ★ Bill 8

Asymmetric cross-surface pattern

11 cands, 0 clean. Lermen-Rimsky LoRA undoes safety; Halawi covert fine-tuning 99% compliance evading 3 defense layers. Caps transfer cross-surface; safety doesn't.

N3 · ★ Bill 11

BIS rescinded 4-month lifetime

36 cands, 0 clean. EO 14110 revoked Jan 2025 (15-mo). Llama 3.1 405B (3.8×10²⁵ FLOPs) exceeds the EU AI Act 10²⁵ threshold but not the EO 14110 threshold. Llama 4 / Qwen3-MoE / Hunyuan-Large ship open weights at frontier scale (Qwen3 under Apache 2.0).

N4 · Bill 1

Lermen-Rimsky 10× safety-erosion

45 cands; 34 clean. Cost ratio of unsafe-tune vs safety-tune is 1e-4 to 1e-7 of alignment budget. Halawi 99% post-tune compliance evading 3 defenses.

N5 · Bill 7

Hubinger Sleeper Agents

41 cands; 31 clean. Deception persists through SFT/RL/adversarial training. Apollo Claude 4 Opus 47% self-exfiltration. Multi-stage Anthropic defense 92%→38% on alignment-faking.

N6 · Bill 2

Distilled cousin half-life 3.4 months

35 cands; 4 clean (most needs_gate pending audit). Median Q4 2024 = 8wk, Q1 2025 = 5wk, Q2 2025 = 3wk — accelerating closure.

N7 · Bill 3

Bio uplift: SecureBio +12 OTAR pts

39 cands. Mouton-Lucas 2024 RAND null vs SecureBio 2025 +12 OTAR pts (p=0.008). Anthropic ASL-3 trigger May 2025.

N8 · Bill 4

Cyber: METR + DARPA AIxCC

10 cands. Cybench 2024-08; OpenAI o3 'High' cyber; joint OpenAI-Anthropic CTF. METR long-horizon 7-month doubling.

N9 · Bill 9

METR/Apollo/AISI vendor independence

28 cands; 5 clean. Pythia replication infrastructure, lm-eval-harness, BIG-Bench-Lite, FMTI v1.1, OSI Open Source AI Definition.

N10 · Bill 10

Re-pretraining 0-3 month half-life

30 cands. R1-Distill ships zero-day; s1/s1.1 + Sky-T1 confirm Bill_10 pseudo-distillation is sufficient but not necessary.

N11 · Bill 12

Distillation-recipe lifecycle compresses

8 cands; 2 clean. 100× → 30× → <10× compute ratio in 12 months (DEITA → Pilz-Heim → LIMO/s1).

N12 · Cross-ledger

4-way star + Compute Gov coupling

Reasoning Bill 6 ★ + Mech Interp Bill 11 ★ + Scaling Laws Bill 5 ★ + VLM Bill 4 ★. Compute Gov Bill 17 ★ + this Bill 11 ★ are direct cousins on gating policy.

§04

Falsification protocol

Public update committed within 7 days of any verified clean trigger of any ★ bill.

F5 · ★ Distillation-resistant

Trigger: a capability with ≥10× compute ratio at frontier that is empirically distillation-resistant (≤30% capability retention in distilled cousin)

F8 · ★ Cross-surface generalization

Trigger: open-weight safety mitigation that survives chat → API → fine-tune → quantize → distill audit with ≤5% degradation

F11 · ★ Open-weight gating

Trigger: empirical evidence that an open-weight gating regulation deters / measures / gates capability tier as designed

F1 · Fine-tuning safety-erosion

Trigger: open-weight model with safety-tuning robust to ≥100× cost-ratio fine-tuning attack

F7 · Sleeper-agent persistence

Trigger: defense framework removing ≥99% of sleeper-agent / backdoor on frontier open-weight models
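The quantitative triggers above can be written as predicates, which makes the thresholds easy to audit. This is a minimal sketch: the function names and input shapes are hypothetical, F8 is read as a per-surface degradation limit (the text does not say whether the 5% is per surface or cumulative), and F11 is omitted because its trigger is qualitative.

```python
from datetime import date, timedelta

# Surfaces named in the F8 trigger.
SURFACES = ("chat", "api", "fine-tune", "quantize", "distill")


def f5_distillation_resistant(compute_ratio: float, retention: float) -> bool:
    """F5: at least a 10x compute ratio and at most 30% retention in the cousin."""
    return compute_ratio >= 10 and retention <= 0.30


def f8_cross_surface(degradation: dict) -> bool:
    """F8: every surface audited, each with at most 5% degradation (assumed per surface)."""
    return all(s in degradation and degradation[s] <= 0.05 for s in SURFACES)


def f1_erosion_robust(attack_cost_ratio: float, safety_held: bool) -> bool:
    """F1: safety tuning holds under an attack with at least a 100x cost ratio."""
    return attack_cost_ratio >= 100 and safety_held


def f7_backdoor_removed(removal_rate: float) -> bool:
    """F7: defense removes at least 99% of sleeper-agent or backdoor behavior."""
    return removal_rate >= 0.99


def update_deadline(verified_on: date) -> date:
    """Public update is committed within 7 days of a verified clean trigger."""
    return verified_on + timedelta(days=7)
```

For example, the median case quoted for Bill 5 (28× compute ratio at 90% retention) meets the compute condition but fails the retention condition, so `f5_distillation_resistant(28, 0.90)` is false.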

Live alerts: Meta / Mistral / DeepSeek / Alibaba Qwen open-weight cards · Lermen-Rimsky safety-erosion line · Halevy-Heim-Pilz distillation-resistance · METR / Apollo / AISI · BIS / EU AI Office / Cohen-Sevilla · Apollo o-series / Claude scheming.

§05

Method at a glance

Threat model: A frontier open-weight (≥30B params, ≥1e22 FLOPs, weights publicly available) capability claim or dual-use risk-mitigation claim that survives fine-tuning safety-erosion + distillation cousin half-life + bio/chem/cyber dual-use + weight-release-vs-API asymmetry + sleeper-agent persistence audit on the 2024-2026 corpus.

Deep loops: 8 sweeps × 5–10 parallel Opus research agents per sweep × 1 batch round.

Sources surveyed: arXiv cs.LG / cs.CL / cs.CR / cs.AI 2024–2026 + ICLR / ICML / NeurIPS / USENIX Security tracks + frontier-lab open-weight cards + METR / Apollo / AISI / Stanford CRFM third-party eval + IBBIS biosecurity-AI working group + BIS / EU AI Office / Hammond-Aarne-Anderljung policy methodology.

Classifier: Regex rule engine + hand-arbitration. v0.2; target v0.3 lock 1.000/1.000.

Empty-space test: Three signature bills (5, 8, 11) predeclared empty BEFORE batch 1. After 371 unique papers, all three remain empty: 0 clean triggers each.

Cross-ledger coupling: Self-validation tautology pattern confirmed across 11+ ledgers. Compute Governance Bills 2/11 ★/19 are direct cousins; Inference-time Safety Bill 14 ★ direct cousin to this Bill 8 ★.

Reproducibility: All scripts, JSONs, ledger are public. Run: aggregate_batch_1.py → bill_classifier.py --arbitrate-union.
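To illustrate the kind of step that turns 372 raw records into 371 unique papers, here is a minimal deduplication sketch. It is not the contents of aggregate_batch_1.py: the `arxiv_id` and `title` fields, the key rules, and both function names are assumptions made for illustration.

```python
import re


def paper_key(paper: dict) -> str:
    """Prefer a stable identifier; fall back to a normalized title."""
    arxiv_id = paper.get("arxiv_id")
    if arxiv_id:
        # Drop the version suffix so v1 and v2 of a preprint share one key.
        return "arxiv:" + re.sub(r"v\d+$", "", arxiv_id.strip().lower())
    title = re.sub(r"[^a-z0-9]+", " ", paper["title"].lower()).strip()
    return "title:" + title


def union_sweeps(sweeps: list) -> list:
    """Merge sweeps into one list, recording which sweeps found each paper."""
    seen = {}
    for sweep_index, sweep in enumerate(sweeps, start=1):
        for paper in sweep:
            key = paper_key(paper)
            if key in seen:
                seen[key]["sweeps"].append(sweep_index)
            else:
                seen[key] = {**paper, "sweeps": [sweep_index]}
    return list(seen.values())


# Hypothetical records, not drawn from the corpus.
sweeps = [
    [{"arxiv_id": "2401.00001v1", "title": "Example Paper A"}],
    [{"arxiv_id": "2401.00001v2", "title": "Example paper A"},
     {"title": "Example Paper B"}],
]
unique = union_sweeps(sweeps)
raw_count = sum(len(s) for s in sweeps)
```

A key-based pass like this cannot merge a record that carries an identifier with a duplicate that carries only a title, which is one place where a hand-arbitration step would still be needed.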

§06

Resources & further reading

Direct cousin

The Compute Governance Ledger

Locked v0.2 — 421 papers. Bill 2 (distillation circumvention) + Bill 11 ★ + Bill 19 (distilled-cousin half-life 3.4 months) ↔ this ledger Bill 5 ★ + Bill 12.

Direct cousin

The Inference-time Safety Ledger

Draft v0.2 — 364 papers. Bill 14 ★ (cross-deployment-surface) ↔ this ledger Bill 8 ★ — same asymmetric pattern.

Cousin

The Capability Benchmarks Ledger

Draft v0.2 — 469 papers. Bill 19 (vendor-claim half-life 73 days) ↔ this ledger Bill 2 (cousin half-life 3.4 months).

Cousin

The Reasoning / Chain-of-Thought Ledger

Draft v0.2 — 394 papers. Bill 9 ★ (TTC vs reasoning) ↔ this ledger Bill 14 (test-time-search amplifier on open-weights).

§R

Reproducibility & data

Every empirical claim resolves to public data. Run the classifier, regenerate the heatmap, audit the corpus, file a falsification.

Corpus JSON

_batch_1_union.json

371 unique papers · deduplicated, hand-arbitrated corpus across 8 sweeps

Classifier

bill_classifier.py

Regex rule engine + hand-arbitration logic for the 14-bill closure pattern

Bill definitions

bills_draft.md

14 bills + 6 meta-costs + 3 escape gates + ★ Bills 5, 8, 11 empty-space verification with real fire counts

Threat model

purpose.md

Threat model, scope, empty-space hypothesis, cousin-ledger coupling

Public draft v0.2 (2026-05-09) — 371 unique papers across 8 sweeps; Bills 5, 8, 11 ★ NO CLEAN TRIGGER YET with 0 clean triggers each. Corpus, scripts, and classifier outputs are linked below. Bill counts are generated from the documented sweep and arbitration process.

Final state · 2026-05-09

Three signature-empty bills.
371 unique papers.
Empty space holding.
