| Age | Commit message | Author |
|
Round 24's skeleton had three deviations from the round 23 redo:
- Made §3 'Diagnostic Protocol' instead of 'Failure Mode 1'
- Collapsed Mode 1 + Mode 2 into one §4
- Added §6 'Reference Implementation' (was supposed to be dropped)
Round 25 fixed all three. New §3-§7 match round 23 redo exactly:
§3 Failure Mode 1: Measurement Degeneracy
§4 Failure Mode 2: Low Intrinsic Credit-Direction Quality
§5 Intervention and Cross-Architecture Evidence
§6 Recommended FA Evaluation Protocol
§7 Discussion, Limits, Conclusion
Also added:
- In-line bibliography with 12 \bibitem entries (Paleka, O'Bray, Jordan
+ FA literature) — citations resolve correctly now
- Appendices A-G with actual prose content (not just headers)
- 7-pitfall catalog with descriptions
- Walk-back chain methodology paragraph
- 7-validation summary table
Compiles to 9 pages with figures 1+3 inline (existing PNGs) and figures
2/4/5 as placeholder text PDFs (TODO: regenerate). Tables 1/2/3 still
have TODO placeholders for numerical values.
Next: fill in tables 1-3 with existing JSON data, generate figures 2/4/5
from existing data, then consult codex per-section for prose filling.
|
|
User rejected the v1 draft as a 'running-ledger experiment report' (a
flat, sequential log of experiments). Rounds 22 and 23 redid the outline
following the E&D-genre prescription.
Saving v1 as v1_rejected.tex for reference. New main.tex will be
written from the round 24 LaTeX skeleton (codex offered to provide it),
section by section, with a codex check on each section's prose.
|
|
Compiled with tectonic (the only LaTeX engine on this server). Two
fixes needed:
1. Pass [numbers,compress] to natbib via PassOptionsToPackage so the
numerical bibliography style works
2. Use bibstyle 'abbrvnat' instead of 'plain' (compatible with natbib)
Result: 10-page PDF, ~7.5 content pages (well under 9-page E&D limit),
references on pages 8-9, appendices A-D on pages 9-10.
PDF uploaded to broker as 1843506b_main.pdf for user review.
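
For reference, a minimal preamble sketch of the two fixes above. It
assumes the NeurIPS style file loads natbib itself, which is why the
options have to be passed before the style is loaded. The bibliography
file name 'references' is hypothetical and not taken from the repo.

```latex
% Sketch only: assumes neurips_2026.sty loads natbib internally.
\documentclass{article}

% Fix 1: hand the options to natbib BEFORE the style file loads it.
\PassOptionsToPackage{numbers,compress}{natbib}
\usepackage{neurips_2026}

\begin{document}

Body text with a numerical citation.% e.g. \citep{somekey}

% Fix 2: a natbib-compatible style instead of 'plain'.
\bibliographystyle{abbrvnat}
\bibliography{references} % hypothetical .bib file name

\end{document}
```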
|
|
The λ sweep is the strongest single piece of two-mode separation
evidence and doesn't require the early-epoch caveat. Added a new §5.4
with a table showing:
- λ=0: vanilla, both modes broken
- λ=1e-4: mode 1 ALLEVIATED (||h_L||=2.4e4, ||g||=6.3e-7), mode 2 NOT
  (cos -0.022, rho -0.004)
- λ=1e-2: mode 1 alleviated, mode 2 partially (cos +0.16, rho +0.09)
- λ=1e-1: slightly over-constrained (cos +0.13, rho +0.07)
The two modes have different intervention thresholds. §5.4 is now the
killer evidence; the early-epoch disambiguation in §5.3 becomes
supporting. Updated section summary to 'five validations'.
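
A possible layout for the §5.4 table, as a sketch. The column choice,
caption wording, and the 'n/r' marker (not reported in this message)
are assumptions; only the values quoted above are filled in.

```latex
% Sketch of the lambda-sweep table; layout and caption are assumptions.
\begin{table}[t]
  \centering
  \caption{Penalty sweep over $\lambda$. Entries marked n/r are not
           reported in the commit message this sketch is based on.}
  \begin{tabular}{lcccc}
    \hline
    $\lambda$ & $\|h_L\|$ & $\|g\|$ & cos & $\rho$ \\
    \hline
    $0$ (vanilla) & n/r & n/r & n/r & n/r \\
    $10^{-4}$ & $2.4\times10^{4}$ & $6.3\times10^{-7}$ & $-0.022$ & $-0.004$ \\
    $10^{-2}$ & n/r & n/r & $+0.16$ & $+0.09$ \\
    $10^{-1}$ & n/r & n/r & $+0.13$ & $+0.07$ \\
    \hline
  \end{tabular}
\end{table}
```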
|
|
Title: 'Beyond Accuracy and Alignment: A Diagnostic Evaluation Protocol
for Feedback Alignment'
Structure (per round 21 prescription):
Abstract: 'broken because conflated' framing, 2 distinct modes named
§1 Introduction: discovery hook -> 2-mode framing -> contribution
§2 Related work
§3 Audit (the field-standard pair walks back nothing)
§4 The diagnostic protocol (4 diagnostics, calibrated thresholds,
decision-utility ablation, cross-architecture validation)
§5 Two distinct failure modes (mechanism, penalty rescue, direct
cosine measurement, hypothesis-disambiguation, capacity-cost control)
§6 Limitations
§7 Broader impacts
§8 Conclusion
Appendices: reproducibility, 7-pitfalls catalog, walk-back chain (4 steps),
all 6 validations of the two-mode separation
Includes 4 result tables and ~10 references, structured as an eandd-track
double-blind submission. 760 lines of LaTeX, balanced environments
verified. Ready for compilation on a system with pdflatex.
Template: paper/neurips_2026.{sty,tex}, downloaded from official
NeurIPS 2026 source. checklist.tex also unzipped.
|