Method · Stage 3.5

Falsification ledgers are built to make the hard part public: what would count as a real breach, which evidence was checked, and which claims survived verification.

Declare: Write the bills before the sweep.
Sweep: Search the frontier corpus and collect candidates.
Verify: Stage 3.5 checks citations before any breach or finding is marked public-ready.
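The three steps above can be read as an ordering constraint: bills are fixed before any sweep, and nothing is public-ready until its citation has been checked. The following is a minimal sketch of that constraint, not the harness itself. Every name in it (`Ledger`, `Candidate`, `Status`, `citation_exists`) is hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Status(Enum):
    CANDIDATE = "candidate"  # collected by a sweep, citation not yet checked
    VERIFIED = "verified"    # citation confirmed during verification
    REJECTED = "rejected"    # citation could not be confirmed


@dataclass
class Candidate:
    bill_id: int
    citation: str
    status: Status = Status.CANDIDATE


@dataclass
class Ledger:
    # Hypothetical model: bill number -> bill text.
    bills: dict[int, str] = field(default_factory=dict)
    candidates: list[Candidate] = field(default_factory=list)
    sweep_started: bool = False

    def declare(self, bill_id: int, text: str) -> None:
        """Declare: bills must be written before the first sweep."""
        if self.sweep_started:
            raise RuntimeError("bills cannot be declared after a sweep has started")
        self.bills[bill_id] = text

    def sweep(self, found: list[tuple[int, str]]) -> None:
        """Sweep: record candidate hits against already-declared bills."""
        self.sweep_started = True
        for bill_id, citation in found:
            if bill_id not in self.bills:
                raise KeyError(f"bill {bill_id} was never declared")
            self.candidates.append(Candidate(bill_id, citation))

    def verify(self, citation_exists: Callable[[str], bool]) -> None:
        """Verify: check the citation of every unchecked candidate."""
        for c in self.candidates:
            if c.status is Status.CANDIDATE:
                ok = citation_exists(c.citation)
                c.status = Status.VERIFIED if ok else Status.REJECTED

    def public_ready(self) -> list[Candidate]:
        """Return verified hits; refuse while any candidate is unchecked."""
        if any(c.status is Status.CANDIDATE for c in self.candidates):
            raise RuntimeError("unverified candidates remain")
        return [c for c in self.candidates if c.status is Status.VERIFIED]

    def empty_bills(self) -> list[int]:
        """Bills with no verified hit after verification."""
        hit = {c.bill_id for c in self.public_ready()}
        return sorted(b for b in self.bills if b not in hit)
```

In this sketch a bill counts as empty only after verification has run, which mirrors the rule that citations are checked before any breach or finding is marked public-ready.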
Locked v1.16 cryptography

Factorization Atlas

504-paper survey of integer-factorization closure across classical, quantum, and post-quantum threat models. Current reference ledger for the method.

504 Papers · 13+6 Bills + meta · 3 ★ Empty
Bills 6, 7, 8 have no verified hits in the current 504-paper corpus · external review pending
Locked v0.1 robotics

Robotics / Embodied AI Ledger

312-paper survey of frontier embodied-AI claims (RT-2/X, Helix/Figure 03, OpenVLA, π0/0.5, GR00T, Optimus, Waymo, Wayve, Apollo, 1X). 12 sweeps + 4 verification.

312 Papers · 13 Bills · 3 ★ Empty
★ Bills 5, 8, 11 HOLD after verification killed 9/9 hallucinated breach IDs · Bill 4 KILLED · Bridge 1 untested by this corpus
Locked v0.2 capability

Multilingual / Low-Resource Ledger

299-paper survey of low-resource and multilingual capability claims. ALL 3 ★ predicted-empty bills hold (0/33, 0/46, 0/145). 75% rebuttal density.

299 Papers · 13 Bills · 3 ★ Empty
All 3 ★ HOLD: 0/33, 0/46, 0/145 · 75% rebuttal density
Locked v0.2 capability

RAG / Retrieval Ledger

247-paper survey of retrieval-augmented generation closure. ALL 3 ★ empty. Bill 7 PROFOUNDLY RESCOPED to commercialization-vs-research axis.

247 Papers · 12 Bills · 3 ★ Empty
All 3 ★ EMPTY · Bill 7 rescoped commercialization-vs-research
Locked v0.2 capability

Multimodal Generation Ledger

377-paper survey of frontier image / video / audio generation. Strong B7/B8 bipolar signal: 78 closed Bill 9 vs 74 open Bill 12.

377 Papers · 13 Bills · 3 ★ Empty
All 3 ★ EMPTY (0/8, 0/9, 0/39) · strong closed-vs-open bipolar split
Locked v0.2 capability

Scientific Discovery Ledger

301-paper survey of AI-driven scientific-discovery claims. 2/3 ★ empty. Bill 4 PARTIAL: 10 autonomous-lab triggers as predicted. Bill 8 ★ EMPTY across 3 substrates.

301 Papers · 13 Bills · 2+1 ★ Empty + PARTIAL
Bill 4 PARTIAL · 10 autonomous-lab triggers · Bill 8 empty across 3 substrates
Locked v0.2 capability

Hardware Inference Ledger

291-paper survey across vLLM / SGLang / Groq / Cerebras / Triton inference stacks. Purest 0/N signal in the corpus: 0/34, 0/38, 0/20.

291 Papers · 13 Bills · 3 ★ Empty
STRONG 0/N · vLLM/SGLang vs Groq/Cerebras = strong B7/B8 separation
Populated cryptography

Lattice Cryptography

635-paper post-quantum lattice ledger. Kyber / Dilithium / Falcon under closure. Bills tracking ring-LWE and SIS hardness assumptions.

~635 Papers · 13 Bills · 2 ★ Empty
2 ★ holding · sweep ongoing pending Stage 3.5
Populated capability

Quantum Advantage

275-paper survey of quantum-supremacy / advantage claims. Random circuit sampling, boson sampling, Shor scaling.

~275 Papers · 13 Bills · 2 ★ Empty
Quantum-classical boundary tracked across 11 platforms
Populated capability

Capability Benchmarks

280-paper survey of frontier capability claims. MMLU, MMMU, ARC-AGI, FrontierMath, LiveCodeBench saturation curves.

~280 Papers · 13 Bills · 2 ★ Empty
Anti-saturation = only working closure across the corpus
Populated governance

Compute Governance

280-paper survey of compute-governance disclosure. Western 17% / Chinese 100% inversion documented. BIS lifetime, NIST AI RMF, EU AI Act timelines.

~280 Papers · 13 Bills · 2 ★ Empty
Sign-flip on "China = closed/risky, US = open/safe" framing
Populated safety

Inference-Time Safety

280-paper survey of inference-time safety / jailbreak / refusal closure. ITS patch lifecycle: 30d / 36h. Bill 14 ★: defense is property of deployment surface.

~280 Papers · 13 Bills · 2 ★ Empty
★ Bill 11 + Bill 14 holding · cross-surface mitigation gap
Populated mechanism

Mech Interp

280-paper survey of mechanistic interpretability claims. Sparse autoencoders, feature circuits, causal abstraction, faithfulness. Bill 11 ★ evidence-bearing for Bridge 1.

~280 Papers · 13 Bills · 2 ★ Empty
★ Bill 11 (causally-faithful mechanism) anchors Bridge 1
Draft v0.2 safety

RL-from-Rewards Ledger

417-paper survey of RLHF / DPO / Constitutional AI / Self-Rewarding alignment claims. 8 sweeps + Stage 3.5 verification.

417 Papers · 13+7 Bills + meta · 4 ★ Empty
★ Bills 6, 10, 12, 13 EMPTY · Sleeper Agents + Apollo Scheming + Magpie + Tülu 3 verified · 60% sweep-agent hallucination caught at Stage 3.5
Populated capability

Arena Attack

222-record forensic survey of published math 2020-2026 against 15 EinsteinArena problems. AlphaEvolve = cross-domain lingua franca. 6 artifact-bounded, 4 published-tight.

222 Records · 12+6 Bills + meta · 2 ★ Empty
Bill 4 (asymmetric Heilbronn n=11) + Bill 7 (Li-Yip CRT) confirmed empty
Populated mechanism

Cross-Ledger Bridges

111-record cross-ledger meta-audit — the harness pointed at itself. Bills 7★, 9★, and 12★ were predeclared empty before the audit. Seven bridges surfaced; batch-3 checks confirmed 21/21 priority claims.

111 Records · 7+2 Bridges (+B8/B9) · 3 ★ Empty
21/21 inheritance confirmed across MM Gen + Sci Disc + HW Inference
Draft v0.1 · Stage 3.5 physics

Spacetime Discreteness

388-paper quantum-gravity discreteness survey (LQG / spinfoam / CDT / causal sets / asymptotic safety / GFT / holographic / emergent gravity). The first physics falsification ledger — 4 ★ bills because the discreteness-prediction problem must independently pay both internal-consistency AND external-distinguishability closures.

388 Papers · 13+6 Bills + meta · 4 ★ Empty
★ Bills 8, 10, 11, 13 HOLD EMPTY confirmed by Stage 3.5 (2026-05-15) · 20/20 priority pool hallucinated · empty-space hypothesis strengthened
Bills draft capability

Agentic Tool Use Ledger

SWE-bench, Cybench, browser-use, code-interpreter agents. Bills predeclared, sweep pending.

Sweep pending · 13 Bills drafted · 2 ★ Predicted
Sweep + Stage 3.5 batch in queue
Bills draft bio

Bio / Protein Ledger

AlphaFold, RosettaFold, ESMFold, structural biology + drug-discovery overclaims. Bills tracking generative-model novelty closure.

Sweep pending · 13 Bills drafted · 2 ★ Predicted
Bridge to scientific_discovery anticipated
Bills draft governance

Open-Weight Ledger

Llama 4, Qwen3-MoE 235B, Hunyuan-Large, Mistral. Apache 2.0 ≥30B closures and distillation portability. Bill 8 ★ evidence-bearing for Bridge 3.

Sweep pending · 13 Bills drafted · 2 ★ Predicted
★ Bill 8 (cross-surface mitigation) anchors Bridge 3
Bills draft reasoning

Reasoning / CoT Ledger

o1, o3, DeepSeek-R1, Sky-T1, reflection, self-consistency. Bill 6 ★ — causally-faithful reasoning trace closure.

Sweep pending · 13 Bills drafted · 2 ★ Predicted
Bridge 1 anchor — causally-faithful trace
Bills draft capability

Scaling Laws Ledger

Chinchilla, Kaplan, emergent abilities, Mamba/SSM vs dense, R1-Distill 100–1000×. Bill 11 ★ — scaling-portability closure.

Sweep pending · 13 Bills drafted · 2 ★ Predicted
★ Bill 11 anchors Bridge 4 (scaling-portability)
Bills draft capability

Vision-Language Ledger

CLIP, LLaVA, Qwen-VL, Sora, Veo, Imagen, PixArt. Bill 4 ★ (causally-faithful mechanism) + Bill 18 (cross-surface).

Sweep pending · 18 Bills drafted · 2 ★ Predicted
Bridge 1 + Bridge 3 cross-surface anchor

Seven bridges from the cross-ledger self-audit, in which the harness was pointed at itself. Bills 7★, 9★, and 12★ were predeclared empty before the audit. Three evidence-bearing, three weakened, one untested.

1. Causally-faithful mechanism: empty across most LLM domains; untested by robotics

Mech Interp Bill 11★, ITS Bill 11★, Reasoning Bill 6★, VLM Bill 4★, Scaling Laws Bill 5★, Agentic Bill 4★, Bio Bill 4★ all hold empty across 2,000+ LLM-domain papers. Robotics_embodied corpus support evaporated when verification killed all 4 Bill_4 grounded-reward IDs (2026-05-15).

untested
2. Closure cycle compressed to 30–100 days

Vendor-claim half-life 73d · ITS patch 30d / 36h · distilled-cousin 3.4mo · Sky-T1 reproduces o1-preview in 2wk · BIS 4mo lifetime · ARC-AGI v1→v2 3mo. Reported as a 30–100 day range.

operational
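To compare the figures listed for this bridge on a single scale, the sketch below converts each one to days and checks it against the reported 30–100 day range. The month length of 30.44 days is an assumption, as are the labels "first figure" and "second figure" for the two ITS patch values, since the source does not say what distinguishes them.

```python
DAYS_PER_MONTH = 30.44  # assumption: mean Gregorian month length

# Each unit maps to (numerator, denominator) so that days = value * num / den.
UNIT_TO_DAYS = {
    "h": (1.0, 24.0),
    "d": (1.0, 1.0),
    "wk": (7.0, 1.0),
    "mo": (DAYS_PER_MONTH, 1.0),
}

# Figures as listed in the bridge description; labels are shortened.
FIGURES = [
    ("Vendor-claim half-life", 73, "d"),
    ("ITS patch (first figure)", 30, "d"),
    ("ITS patch (second figure)", 36, "h"),
    ("Distilled-cousin", 3.4, "mo"),
    ("Sky-T1 reproduces o1-preview", 2, "wk"),
    ("BIS lifetime", 4, "mo"),
    ("ARC-AGI v1→v2", 3, "mo"),
]


def to_days(value: float, unit: str) -> float:
    """Convert a duration in the given unit to days."""
    num, den = UNIT_TO_DAYS[unit]
    return value * num / den


def classify(days: float, low: float = 30.0, high: float = 100.0) -> str:
    """Place a duration relative to the reported range (bounds inclusive)."""
    if days < low:
        return "below"
    if days > high:
        return "above"
    return "inside"


if __name__ == "__main__":
    for label, value, unit in FIGURES:
        days = to_days(value, unit)
        print(f"{label:32s} {days:7.1f} d  {classify(days)}")
```

The classification of figures near a boundary, such as 3.4mo, depends on the month-length assumption.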
3. Capabilities transfer cross-surface; mitigations don't

ITS Bill 14★ + Open-weight Bill 8★ + VLM Bill 18 + Agentic Bill 11. Lermen-Rimsky: undoing safety is 10× cheaper than installing it. Defense mitigations are a property of the deployment surface, not the model.

evidence-bearing
4. Distillation = architecture-portability = scaling-portability

Open-weight Bill 5★ + Scaling Laws Bill 11★ + Compute Governance Bill 11★. Halevy-Heim-Pilz 0/14 resistant; Mamba2 dense fails 0.06–0.11 on SSM; R1-Distill 100–1000× lower compute. No architectural moat — capability is fluid, only training-data novelty is sticky.

evidence-bearing
5. "0/N" pattern recurs across forensic researchers

Anand-Goyal unified-VLM 0/9 · Anand-Bommasani cross-organism 0/8 · Anand-Rein unified-agent 0/9 · Halevy-Heim-Pilz distillation 0/14 · IBBIS synthesis-screened 0/4 · Yang-Bommasani cross-mixture 0/9.

evidence-bearing
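One way to read a 0/N count is as an upper bound on the underlying hit rate. The sketch below computes the exact one-sided bound for zero hits in N trials alongside the familiar "rule of three" approximation, 3/N. It assumes that N counts independent checks with a common hit probability, which the source does not state, so the numbers are illustrative only.

```python
def zero_hit_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound on the hit rate p after 0 hits in n trials.

    Solves (1 - p) ** n = 1 - confidence for p, assuming independent
    trials with a common hit probability (an assumption, not a source claim).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


def rule_of_three(n: int) -> float:
    """Approximate 95% upper bound for 0 hits in n trials."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 3.0 / n


if __name__ == "__main__":
    # Denominators taken from the 0/N counts listed for this bridge.
    for n in (4, 8, 9, 14):
        exact = zero_hit_upper_bound(n)
        approx = rule_of_three(n)
        print(f"0/{n}: exact <= {exact:.3f}, rule of three <= {approx:.3f}")
```

With small denominators such as 0/4 the bound stays wide, so under these assumptions a single small 0/N count says little on its own; the recurrence across several counts is what the bridge points to.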
6. Anti-saturation is the only working closure

Across 7 ledgers, anti-saturation is the only Bill that fires positive. Iterative reframing (ARC v1→v2→v3, MMMU→MMMU-Pro, FrontierMath Tier-1→4, LiveCodeBench monthly, Cybench Pro held-out) is empirically the only audit primitive keeping pace with the 30–100 day closure cycle.

evidence-bearing
7. Western-vs-Chinese open-weight inversion

China-domiciled vendors disclose 100% (DeepSeek, Alibaba, 01.AI all 8/8 fields); Western vendors disclose 17%. Frontier Apache 2.0 ≥30B includes Llama 4, Qwen3-MoE 235B, Hunyuan-Large. The "China = closed/risky, US = open/safe" framing would be sign-flipped if updated to current data.

weakened

Synthesis. 30+ ★ predicted-empty bills holding across 8,600+ papers is the evidence-bearing claim; the cross-ledger bridges are the interesting result. Discipline: predeclare the empty-space bills before the audit, ship verification before any breach.