faeval.git

author	YurenHao0426 <Blackhao0426@gmail.com>	2026-04-08 10:24:24 -0500
committer	YurenHao0426 <Blackhao0426@gmail.com>	2026-04-08 10:24:24 -0500
commit	ebd0d1410492bfff1dc96db4cbb9bbbe97e7afe6 (patch)
tree	e47997a27ca319acf0b2d5beac3587358a383780 /results/confirmatory/clean_sparsity/synth_dfa_s123_a0.5_L8.json
parent	9c5dcd36d1c53073e6b42c2c85a0c47f2d3229c2 (diff)

Bib fix: correct titles for 3 E&D model papers (Paleka/O'Bray/Jordan)

Previous bibitems had paraphrased/invented titles for the 3 E&D-methodology exemplar papers cited in §1 and §7. The correct titles are:

- Paleka et al. ICLR 2026: 'Pitfalls in Evaluating Language Model Forecasters' (not 'Pitfalls in evaluating model behavior: measurement, reporting, and interpretability failures')
- O'Bray et al. ICLR 2022: 'Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions' (not 'Evaluation beyond leaderboard metrics: methodology matters')
- Jordan et al. ICML 2020: 'Evaluating the Performance of Reinforcement Learning Algorithms' (not 'Evaluating machine learning: tests, cases, and expectations'). Also corrected first author 'Matt' -> 'Scott M.'

Verified against codex round 23 memory, which recorded the correct titles from the OpenReview/ICML URLs. Previous bibitems were hallucinated titles from earlier rounds and would have been a factual bug in the bibliography.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Diffstat (limited to 'results/confirmatory/clean_sparsity/synth_dfa_s123_a0.5_L8.json')

0 files changed, 0 insertions, 0 deletions

