Fill in tables 1-3 + generate figures 2/4/5 from existing data

2026-04-08T09:46:59+00:00

Tables filled with real values:
  Table 1: 5-method audit (3-seed mean ± std for acc, headline Γ, verdict)
  Table 2: 4-condition mode 2 validation (cos and ρ values from existing
           checkpoint measurements)
  Table 3: protocol thresholds (50×, 1e-7, 0.30, 2pp)
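
A minimal sketch of how a Table 1 "mean ± std" cell could be formatted from
per-seed values. The helper name, the sample (n-1) standard deviation, and the
example numbers are all assumptions for illustration; none of them are taken
from the repo or from the actual table.

```python
from statistics import mean, stdev


def format_mean_std(values, digits=2):
    """Return 'mean ± std' for per-seed values, using the sample (n-1) std."""
    if len(values) < 2:
        raise ValueError("need at least two seeds for a standard deviation")
    return f"{mean(values):.{digits}f} ± {stdev(values):.{digits}f}"


# Hypothetical per-seed accuracies (placeholders, not values from the paper).
example_acc = [90.0, 91.0, 92.0]
cell = format_mean_std(example_acc)
```

If the table reports population std instead, statistics.pstdev would be the
drop-in replacement.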

Figures generated from existing data:
  fig2_decision_utility.pdf: 5×7 verdict heatmap from
    results/protocol_audit/ablation_decision_utility.json
  fig4_penalty_rescue.pdf: 3-panel — trajectory + cos/ρ bars + 2×2 acc
    from snapshot_evolution_v2 + dfa_residual_penalty + bp_with_penalty
  fig5_cross_arch_summary.pdf: 5×4 BP/DFA verdict matrix across
    architectures

Compiles to 8 pages with all tables/figures rendered. §1-§7 main body
still has only paragraph topic sentences (TODO: per-section prose
filling via codex). Figure numbering is wrong: codex placed figures in
section order, not numerical order. Needs fixing.
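
One way the figure-order problem could be detected is to scan the LaTeX source
for \includegraphics calls and compare the figN_ numbers in order of
appearance. This is only a sketch: it assumes every figure file follows the
figN_name pattern seen in this commit, and the function names and the sample
source string are hypothetical.

```python
import re

# Matches \includegraphics[...]{...figN_...} and captures N.
_FIG_RE = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{[^}]*?fig(\d+)_[^}]*\}")


def figure_order(tex_source):
    """Return figure numbers in order of appearance in the source."""
    return [int(n) for n in _FIG_RE.findall(tex_source)]


def out_of_order(tex_source):
    """Return adjacent (earlier, later) pairs where numbering decreases."""
    order = figure_order(tex_source)
    return [(a, b) for a, b in zip(order, order[1:]) if a > b]


# Hypothetical source fragment, not the actual paper file.
sample = (
    r"\includegraphics{figures/fig1_overview.png}" "\n"
    r"\includegraphics[width=\linewidth]{figures/fig4_penalty_rescue.pdf}" "\n"
    r"\includegraphics{figures/fig2_decision_utility.pdf}" "\n"
)
problems = out_of_order(sample)
```

An empty result means the figures already appear in numerical order.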

v2 skeleton from round 25: section structure now matches round 23

2026-04-08T09:43:39+00:00

Round 24's skeleton had 3 deviations from the round 23 redo:
  - Made §3 'Diagnostic Protocol' instead of 'Failure Mode 1'
  - Collapsed Mode 1 + Mode 2 into one §4
  - Added §6 'Reference Implementation' (was supposed to be dropped)

Round 25 fixed all three. New §3-§7 match round 23 redo exactly:
  §3 Failure Mode 1: Measurement Degeneracy
  §4 Failure Mode 2: Low Intrinsic Credit-Direction Quality
  §5 Intervention and Cross-Architecture Evidence
  §6 Recommended FA Evaluation Protocol
  §7 Discussion, Limits, Conclusion

Also added:
  - In-line bibliography with 12 \bibitem entries (Paleka, O'Bray, Jordan
    + FA literature) — citations resolve correctly now
  - Appendices A-G with actual prose content (not just headers)
  - 7-pitfall catalog with descriptions
  - Walk-back chain methodology paragraph
  - 7-validation summary table

Compiles to 9 pages with figures 1+3 inline (existing PNGs) and figures
2/4/5 as placeholder text PDFs (TODO: regenerate). Tables 1/2/3 still
have TODO placeholders for numerical values.

Next: fill in tables 1-3 with existing JSON data, generate figures 2/4/5
from existing data, then consult codex per-section for prose filling.

faeval.git/paper/figures/fig2_decision_utility.pdf, branch master
