# Robotics / Embodied AI Ledger — Bills Draft (v0.1)

> Stage 2 (BILLS). **13 candidate bills + 6 meta-costs + 3 escape gates**, with
> **3 ★ predicted-empty** at positions 5, 8, 11.
>
> Predeclared bridge test: Bill 4 (causally-faithful grounded-reward) is NOT
> ★. It is anticipated to surface positive triggers, which would falsify
> cross_ledger_bridges Bridge 1's unrestricted form.

## The thirteen bills

| # | Bill | What gets paid | ★ |
|---:|---|---|:---:|
| 1 | Demonstration-distribution shift audit | Cherry-picked vs everyday distribution; success rate on novel objects/scenes outside demo set. | |
| 2 | Sim-to-real-perception-gap audit | Camera-format / lighting / friction / contact-dynamics gap explicitly measured. | |
| 3 | Hardware-cost / fleet-cost transparency | Per-robot capex, per-trial wall-clock, total hardware-hours disclosed. | |
| 4 | **Causally-faithful grounded-reward mechanism** | RL-from-physics-grounded-reward shows intervention-validated causal mechanism. *Bridge test: anticipated to fire positively.* | |
| 5 | **★ Sim-to-real generalization** | ≥80% of simulation-demonstrated capability retained on real hardware. Predicted empty. | ★ |
| 6 | Long-horizon plan stability audit | ≥10-step task chain holds; failure cascade rate < 30%. | |
| 7 | Strong-baseline classical-robotics comparison | VLA capability claim beats MPC / Whole-Body-Control / classical-IK baseline at equivalent compute. | |
| 8 | **★ Embodiment-cross-platform transfer** | Same policy transfers Google → Boston Dynamics → Apptronik → 1X with ≤30% degradation. Predicted empty. | ★ |
| 9 | Held-out scene / novel-object generalization | ≥30% novel objects in test set, no overlap with training scenes. | |
| 10 | Vendor-self-eval independence | Reproduced by METR / Apollo / DARPA / NIST robotics-eval / university lab. | |
| 11 | **★ Universal task-set coverage** | All 5 sub-tasks {manipulation, locomotion, navigation, multi-step planning, human-interaction} above clean threshold. Predicted empty. | ★ |
| 12 | Safety / collision / human-robot audit | Real-world incident rate, near-miss disclosure, safety-bound enforcement. | |
| 13 | Tele-operation / human-in-the-loop decomposition | Autonomous vs tele-op contribution explicitly separated. (Figure / Optimus skepticism.) | |
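
The numeric thresholds in Bills 5, 6, 8, 9, and 11 can be read as simple predicates. The following is a minimal sketch, not a fixed scoring procedure: every function and argument name is hypothetical, retention is assumed to mean real success rate divided by simulated success rate, degradation is assumed to mean `1 - target / source`, and the "clean threshold" of Bill 11 is left as a parameter because the table does not fix its value.

```python
from typing import AbstractSet, Mapping, Sequence

# Sub-task names quoted from Bill 11. All function and argument names
# below are illustrative assumptions, not part of the ledger itself.
SUBTASKS = (
    "manipulation",
    "locomotion",
    "navigation",
    "multi-step planning",
    "human-interaction",
)


def bill5_sim_to_real(sim_success: float, real_success: float) -> bool:
    """Bill 5: at least 80% of capability retained on real hardware.

    Assumption: retention = real success rate / simulated success rate.
    """
    if sim_success <= 0:
        return False
    return real_success / sim_success >= 0.80


def bill6_long_horizon(chain_steps: int, cascade_rate: float) -> bool:
    """Bill 6: task chain of at least 10 steps, failure cascade rate below 30%."""
    return chain_steps >= 10 and cascade_rate < 0.30


def bill8_cross_platform(source_score: float, target_scores: Sequence[float]) -> bool:
    """Bill 8: at most 30% degradation on every target platform.

    Assumption: degradation = 1 - target score / source score.
    """
    if source_score <= 0 or not target_scores:
        return False
    return all(1 - t / source_score <= 0.30 for t in target_scores)


def bill9_held_out(
    novel_fraction: float,
    train_scenes: AbstractSet[str],
    test_scenes: AbstractSet[str],
) -> bool:
    """Bill 9: at least 30% novel objects and no train/test scene overlap."""
    return novel_fraction >= 0.30 and train_scenes.isdisjoint(test_scenes)


def bill11_universal(scores: Mapping[str, float], threshold: float) -> bool:
    """Bill 11: every one of the five sub-tasks meets the (unspecified) threshold."""
    return all(scores.get(task, 0.0) >= threshold for task in SUBTASKS)
```

For example, `bill8_cross_platform(0.8, [0.7, 0.4])` fails under these assumptions because the second target platform loses half of the source score. A missing sub-task score in `bill11_universal` is treated as 0.0, so partial coverage never pays the bill.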

## Six meta-costs

| # | Meta-cost | Description |
|---|---|---|
| M1 | Pre-2024 (RT-1 / RoboCat era) | Toy regime for the 2024-2026 frontier. |
| M2 | Single-task-only | Single manipulation / locomotion task. |
| M3 | Single-embodiment-only | One robot hardware only, no cross-platform. |
| M4 | Sim-only-no-real-hardware | No real-world deployment data. |
| M5 | Demonstration-cherry-pick | Cherry-picked successful demos only. |
| M6 | Implementation-specific | Specific gripper / sensor / actuator required. |

## Three escape gates

- G1: methodology
- G2: negative-result
- G3: theoretical

## Iteration plan (8 sweeps)

- 801: Frontier VLA model cards (RT-2, RT-X, Helix/Figure 03, OpenVLA, π0/π0.5, GR00T, Gato-2, Optimus, Apptronik Apollo)
- 802: Sim-to-real audits + perception-gap papers
- 803: Cross-embodiment transfer literature (RT-X arm + leg, Octo, Open X-Embodiment)
- 804: Manipulation benchmarks (LIBERO, RoboCasa, ManiSkill, RoboArena, Open X-Embodiment)
- 805: Locomotion + navigation (ANYmal, Spot, Mobile ALOHA, humanoid locomotion)
- 806: Autonomous driving (Waymo, Wayve, Tesla FSD V14, CARLA leaderboards, nuScenes)
- 807: Tele-op decomposition + Figure / Optimus skepticism / Mobile ALOHA human-in-loop
- 808: METR / Apollo / DARPA / NIST robotics-eval independent audits + safety / collision data

## Bridge-test commitments

- **Bill 4 (causally-faithful grounded-reward)** is predicted to surface
  positive triggers. If it does, this FALSIFIES cross_ledger_bridges
  Bridge 1's unrestricted form, and we will publish the falsification
  within 7 days, per our pre-commitment.
- **Bills 5, 8, 11 ★** are predicted empty. If any of them surfaces a clean
  trigger, the public update timeline applies.
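
The two commitments above amount to a small decision rule over per-bill trigger counts. The sketch below is one way to write it down, under stated assumptions: the names `resolve_commitments`, `trigger_counts`, and `observed_on` are hypothetical, and the 7-day window is assumed to start on the date the Bill 4 trigger is observed, which the commitment itself does not specify.

```python
from datetime import date, timedelta
from typing import List, Mapping

BRIDGE_TEST_BILL = 4                     # predicted to fire positively
PREDICTED_EMPTY = frozenset({5, 8, 11})  # the three star bills
PUBLICATION_WINDOW = timedelta(days=7)   # from the pre-commitment


def resolve_commitments(trigger_counts: Mapping[int, int], observed_on: date) -> List[str]:
    """Return the follow-up actions implied by the observed trigger counts.

    Assumption: the 7-day window is counted from `observed_on`.
    """
    actions = []
    if trigger_counts.get(BRIDGE_TEST_BILL, 0) > 0:
        deadline = observed_on + PUBLICATION_WINDOW
        actions.append(
            f"Bill 4 fired: publish Bridge 1 falsification by {deadline.isoformat()}"
        )
    for bill in sorted(PREDICTED_EMPTY):
        if trigger_counts.get(bill, 0) > 0:
            actions.append(
                f"Bill {bill} predicted empty but fired: public update timeline applies"
            )
    return actions


if __name__ == "__main__":
    for action in resolve_commitments({4: 2, 8: 1}, date(2026, 1, 1)):
        print(action)
```

An empty result means no commitment was triggered: Bill 4 stayed silent and all three star bills stayed empty.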
