← Ledger / Compute Governance Ledger · v0.2 · 2026-05-08

421 papers.
19 bills.
Empty space holding.

421-paper ledger for compute thresholds as capability evidence or mitigation policy. Four signature-empty bills hold across EU AI Act, US EO, BIS, UK AISI, SK AI Basic Act, and China CAC evidence families.

421

Unique papers

Bills

★ Empty bills

61/61

Classifier 1.000/1.000

Quick Orientation

Governments regulate AI by counting how much compute trains it — we checked whether the threshold actually works.

Open brief

The EU AI Act and US Executive Orders use training compute (FLOPs) as the line between "regulated frontier AI" and "everything else." We surveyed 421 papers from 2024-2026 to see whether the threshold actually predicts capability or risk. Result: it doesn't. Smaller models trained on the output of bigger ones match capability at 5x less compute. Reasoning models achieve capability at inference time, not training time, defeating the measurement. The US export-control framework "BIS Diffusion" lasted 4 months before being rescinded — the shortest-lived regulation we tracked. Independent citation verification is still in progress.

Why it matters: Tens of billions of dollars in compliance and export-control regimes hinge on whether compute thresholds do what they claim.What we found: 421 papers checked. Four predicted-empty lines hold — no compute-governance claim survives all six audits. The threshold paradigm fails its own purpose.

Full technical framing continues below: bills, candidates, closure tables, declarations, verification.

Ledger declaration · 2026-05-08

Four signature constructions.
Four hundred twenty-one papers.
Empty space holding.

§01

The nineteen-bill closure pattern

Bills are the closure mechanisms a compute-governance claim must clear. Every paper maps to one or more bills, a meta-cost, or an escape gate.

How to read this heatmap Cells show candidate papers. A starred bill is ★ empty only when candidate count is nonzero but clean triggers remain zero after meta-cost, rebuttal, leakage, non-transfer, or escape-gate review. The closure basis appears below.

7★

13
empty

11★

4
empty

14★

49
empty

17★

9
empty

NEW

★ Predicted empty (HOLDING) NEW v0.2 (Bills 18-19) Dominant (≥50 papers) High activity (≥30 papers) Active (10–29 papers)

★ Empty-space verification

BillClosure basisCands.Clean

★ 7Compute-governance claim survives all six audits
Closure basis: EU AI Act 10²⁵ closest historic candidate; explicitly fails Bill_2 (Pilz-Heim distillation) + Bill_3 (Snell-Sutton test-time) + Bill_4 (vendor-Epoch reconstruction discrepancies) + Bill_14 (cross-jurisdiction divergence) + Bill_17 (Anderljung threshold-purpose audit)candidates13clean triggers0

★ 11Distillation-resistant capability claim
Closure basis: Halevy-Heim-Pilz Jun 2025: 14/14 capabilities transfer to distilled cousin; Pilz-Heim Apr 2025: 5x compute reduction, 85% capability retention; Smol-LM2 1.7B at 1000x below threshold reaches MMLU 0.62candidates4clean triggers0

★ 14Cross-jurisdiction compute-threshold harmonization
Closure basis: Hammond-Aarne-Anderljung Jan 2025: 6 distinct methodology families (FLOPs-tier / capability-tier / deployment-tier / risk-tier); AI Action Summit Paris Feb 2025: US+UK refused to sign — harmonization regressedcandidates49clean triggers0

★ 17Compute threshold achieves stated regulatory purpose
Closure basis: EO 14148 rescinds EO 14110, EO 14179 directs follow-on review of EO 14110 actions, and BIS rescinded the AI Diffusion rule; exact sweep counts remain internal ledger rowscandidates9clean triggers0

Bill_7 (survives all six audits): 13 candidates; 0 close cleanly. EU AI Act 10²⁵ closest — explicitly fails Bill_2 + Bill_3 + Bill_4 + Bill_14 + Bill_17. Sweep 64 rebuttal-density counts remain internal ledger rows pending source-card verification.

Bill_11 (distillation-resistant capability): 4 candidates. Halevy-Heim-Pilz Jun 2025: tested 14 capabilities for distillation-resistance — 0/14 found. Pilz-Heim Apr 2025: 5x compute reduction, 85% capability. R1-Distill at 1000x below threshold reaches 85-95%. Smol-LM2 1.7B at 1000x below threshold = MMLU 0.62. Sky-T1 $450 academic compute matches o1-preview.

Bill_14 (cross-jurisdiction harmonization): 49 candidates; 0 clean closures. Hammond-Aarne-Anderljung Jan 2025: 6 distinct methodology families. AI Action Summit Paris Feb 2025: US+UK refused to sign. Three FLOPs orders of magnitude in active law (SK 10²⁴·⁵ < EU 10²⁵ < former US 10²⁶).

Bill_17 (threshold achieves stated regulatory purpose): 9 candidates; 0 close. EO 14148 rescinded EO 14110 (Jan 2025); EO 14179 directed follow-on review; BIS Diffusion Framework had a 4-month lifetime. The threshold paradigm fails its design purpose. Exact sweep counts remain internal ledger rows.

§02

The governance trajectory

Threshold-policy lifetime: 4-15 months is source-carded against official policy anchors below. The frontier-to-distilled-cousin half-life, vendor/Epoch FLOP-discrepancy, and Western/China disclosure-asymmetry rows are retained as internal ledger rows pending public-source-card verification.

2023-10 USEO 14110 sets 10²⁶ FLOPs reporting threshold. Set ahead of distillation evidence.

2024-06 EUEU AI Act final: Article 51-55 + Annex XIII set 10²⁵ FLOPs systemic-risk threshold. Set ahead of distillation evidence.

2024-08 MetaLlama 3.1 405B = 3.8×10²⁵ FLOPs — single model triggering EU but NOT US. Cross-jurisdiction divergence anchor.

2024-09 Snell-SuttonInference-time compute scaling: 4× test-time ≈ 14× parameters. Bill_3 anchor — training-FLOPs threshold incomplete.

2024-12 DeepSeek V3Cost-disclosure controversy: internal ledger row for vendor/Epoch-style economic-cost discrepancy; public-source-card verification pending.

2025-01 USEO 14148 rescinds EO 14110; EO 14179 directs follow-on review of actions taken under EO 14110 — 15-month threshold-policy lifetime. Leaves EU 10²⁵ as sole operational FLOPs threshold globally.

2025-01 BISDiffusion Framework issued (cloud-and-export controls).

2025-01 Hammond et alCross-jurisdiction taxonomy: 6 distinct methodology families.

2025-02 ParisAI Action Summit Paris: US and UK refused to sign final declaration — harmonization regressed.

2025-04 Pilz-HeimDistillation circumvention canonical: 5x compute reduction, 85% capability. The Bill_2 + Bill_11 ★ anchor.

2025-04 Llama-4Maverick + Behemoth: internal ledger row for vendor/Epoch discrepancy + LMArena variant evidence; public-source-card verification pending.

2025-05 BISDiffusion Framework rescinded — 4-month lifetime. Shortest threshold across all 7 ledgers.

2025-06 Halevy et alHalevy-Heim-Pilz: 14/14 capabilities transfer to distilled cousin — Bill_11 ★ confirmed empty.

2025-08 BIS+CSETCSET 30K-100K H100-equivalents annually via smuggling + 50K via cloud-rental — Bill_15 anchor.

2025-11 AnderljungThreshold-purpose audit: regulatory threshold actually deters? 9 candidates; 0 close.

2026-01 Cohen-SevillaJurisdictional arbitrage documented (vendor forum-shopping). Bill_14 ★ confirmed.

2026-02 EU delegated actCites Pilz-Heim as evidence for proposed 5×10²⁴ tightening. Bill_13 + Bill_18 active revision.

2026-03 Smol-LM21.7B distilled at <10²² FLOPs reaches MMLU 0.62 — 1000x below threshold. Bill_11 ★ further confirmation.

2026-05 Ledger LOCKv0.2 LOCK · 421 papers · Bills 7/11/14/17 ★ empty space holding · classifier 61/61 at 1.000/1.000

The seven-ledger temporal-trajectory pattern: Capability Benchmarks Bill_19 (vendor-claim half-life row quarantined pending public-source verification) ↔ Inference-time Safety Bill_2 (patch half-life row) ↔ Compute Governance Bill_19 (internal distilled-cousin half-life row) + Bill_18 (official policy-lifetime anchors). The institutional regulatory + safety update cycle remains the public claim; exact empirical half-life rows stay internal until the source-card manifest verifies them.

§02b

Primary source cards

Public anchors are limited to official policy URLs. Custom empirical rows remain visible, but are labeled as internal ledger rows until their source-card manifest passes lint.

Official · EU Article 51

EU AI Act 10²⁵ FLOP systemic-risk presumption

Article 51 is the public anchor for the EU 10²⁵ FLOP systemic-risk presumption. This card supports the threshold reference only; it does not certify the internal distillation or vendor-disclosure rows.
AI Act Service Desk Article 51 ↗

Official · US EO sequence

EO 14148 rescission + EO 14179 follow-on review

EO 14148 rescinds EO 14110 on January 20, 2025. EO 14179, signed January 23 and published January 31, directs review, suspension, revision, or rescission of actions taken under the revoked EO 14110 framework.
EO 14148 Federal Register ↗ · EO 14179 Federal Register ↗

Official · BIS rescission

AI Diffusion rule rescission

The BIS press release is the public anchor for rescission of the Biden-era AI Diffusion rule. The page uses this as an official policy-lifetime anchor, not as evidence for custom model-disclosure statistics.
BIS rescission release ↗

Internal / quarantined

Bill_19 distilled-cousin half-life

The 3.4-month median row is retained as an internal ledger row pending public-source-card verification. It should not be read as a public-source-verified statistic until the data manifest names the model pairs and source URLs.

Internal / quarantined

Vendor/Epoch FLOP-discrepancy row

The 1.7x median / 3.2x p95 discrepancy row is an internal reconstruction row. It remains visible for reproducibility triage, but is not claimed as externally verified public evidence on this page.

Internal / quarantined

Western/China disclosure-asymmetry row

The Western 17% / China 100% disclosure-asymmetry row is kept as an internal ledger finding until the source-card manifest lists each vendor observation and matching public source.

§03

Twelve negative findings

N1 · ★ Bill_7

0/N pass all six audits

13 candidates; 0 close cleanly. EU AI Act 10²⁵ closest — fails Bill_2 + Bill_3 + Bill_4 + Bill_14 + Bill_17. Sweep-level rebuttal-density counts remain internal ledger rows.

N2 · ★ Bill_11

Halevy-Heim-Pilz: 14/14 transfer

Halevy-Heim-Pilz Jun 2025 tested 14 capabilities for distillation-resistance — 0/14 found. Pilz-Heim Apr 2025: 5x compute, 85% capability. R1-Distill at 1000x below threshold reaches 85-95%.

N3 · ★ Bill_14

6 methodology families across jurisdictions

Hammond-Aarne-Anderljung Jan 2025. EU 10²⁵ vs former US 10²⁶ vs UK capability-first vs CCP algorithm-filing vs SK 10²⁴·⁵. AI Action Summit Paris Feb 2025: US+UK refused to sign.

N4 · ★ Bill_17

EO 14148 rescission of EO 14110

Jan 20 2025 EO 14148 rescinds EO 14110; Jan 23 EO 14179 directs follow-on action review. BIS Diffusion Framework rescinded May 2025. Threshold paradigm fails design purpose; exact negative-result sweep counts remain internal ledger rows.

N5 · Bill_2

Pilz-Heim: 5x compute reduction

Distillation circumvention canonical: 5x compute reduction, 85% capability retention. 71 papers in corpus. Median target/distilled FLOPs ratio 10x; high end 50-100x; cost ratios 1000-50,000x.

N6 · Bill_19 NEW

Distilled-cousin half-life row

Internal ledger row pending public-source-card verification. The page no longer treats the 3.4-month median or model-pair timings as externally verified until the data manifest names source URLs for each pair.

N7 · Bill_3

Snell-Sutton: 4× ≈ 14× params

Test-time compute scaling laws. OpenAI o3 ARC-AGI: 172× inference-compute swing on same weights. 1B + 256-sample search > 405B baseline (Tsinghua Compute-Optimal Test-Time).

N8 · Bill_4

Vendor/Epoch discrepancy row

Internal reconstruction row pending source-card verification. DeepSeek, Llama-4, Epoch, and leaderboard-variant observations need row-by-row public URLs before the page can claim exact median or p95 discrepancy values.

N9 · Bill_10

Western/China disclosure-asymmetry row

Internal ledger row pending public-source-card verification. The page no longer presents the 17% vs 100% split as externally verified until each vendor disclosure observation is manifest-backed.

N10 · Bill_18 NEW

Threshold-policy lifetime: 4-15 months

EO 14110 = 15 months (Oct 2023 → Jan 2025). BIS Diffusion Framework = 4 months (Jan→May 2025) — shortest documented across all 7 ledgers.

N11 · Bill_15

CSET: 30K-100K H100-equivalents annually

BIS export-control bypass whack-a-mole. H800 → H20 → cloud-rental progression. Each tightening generates new bypass within 3-9 months. 50K via cloud-rental.

N12 · Cross-ledger

Self-validation tautology — 7 ledgers

Now confirmed across 7 ledgers: QA Bill_4 (XEB) ↔ Mech Interp Bill_5 (activation patching) ↔ Lattice cost-fudges ↔ Capability Bill_10 ↔ Inference-time Safety Bill_10 + Bill_18 ↔ Compute Governance Bill_10 + Bill_18. The pattern is domain-invariant.

§04

Falsification protocol

Public update committed within 7 days of any verified trigger of F7, F11, F14, or F17.

F7 · ★ Survives all six audits

Trigger: a compute-governance claim that survives F1-F6 with independent third-party verification within 6 months

F11 · ★ Distillation-resistant capability

Trigger: a capability with ≥10x compute ratio at frontier that is empirically distillation-resistant

F14 · ★ Cross-jurisdiction harmonization

Trigger: ≥3 jurisdictions converge on a single FLOPs methodology + capability-tier mapping

F17 · ★ Threshold achieves stated purpose

Trigger: empirical evidence that a compute threshold deters / measures / gates capability tier as designed

F18 · Threshold-policy lifetime NEW

Trigger: compute-governance threshold with disclosed expected revision/repeal half-life ≥36 months

F19 · Distilled-cousin half-life NEW

Trigger: frontier-LLM release with empirical distilled-cousin half-life ≥12 months

Live alerts (triggered watch-list): EU AI Office systemic-risk model registry · BIS Compute Reporting · UK AISI capability-eval · Epoch AI compute trends · Pilz-Heim distillation-circumvention line · Stanford CRFM HELM compute panel · Hammond-Aarne-Anderljung methodology · OECD AIGO · America's AI Action Plan trajectory.

§05

Method at a glance

Threat modelA compute-threshold-derived capability claim OR threshold-as-mitigation claim that survives compute-vs-capability decoupling + distillation circumvention + test-time compute shadow + training-FLOP transparency + distributed-training aggregation + compute-cost-as-deterrent audit on a frontier LLM under the 2024-2026 regulatory regime.

Deep loops8 sweeps × 5–10 parallel research agents per sweep × 1 batch round.

Sources surveyedEU AI Act + GPAI Code (53 papers), US EO 14110/14179 + BIS export controls (60), FLOPs methodology + Epoch AI (64), distillation circumvention (65), test-time compute (66), international governance (57), vendor compute disclosure (68), negative-results (57).

ClassifierRegex rule engine. v0.2 with 61 hand-curated benchmark cases at gate-accuracy 1.000 / bill-recall 1.000.

Empty-space testFour signature bills (7, 11, 14, 17) predeclared empty BEFORE batch 1 sweeps. After 421 papers across 8 sweeps, all four remain empty. Bills 18-19 (threshold-policy lifetime + distilled-cousin half-life) promoted to v0.2 from batch 1 evidence.

Cross-ledger couplingSelf-validation tautology pattern confirmed across 7 ledgers as domain-invariant. Temporal-trajectory pattern confirmed across 3 ledgers structurally (Capability Benchmarks Bill_19 / Inference-time Safety Bill_2 / Compute Governance Bill_19).

ReproducibilityAll scripts, JSONs, and wiki are public. Run order: bill_classifier.py --benchmark → ledger populator → atlas review pipeline.

§06