Diffstat (limited to 'runs/20250910/baseline_eval/summary.md')
| -rw-r--r-- | runs/20250910/baseline_eval/summary.md | 12 |
1 files changed, 12 insertions, 0 deletions
diff --git a/runs/20250910/baseline_eval/summary.md b/runs/20250910/baseline_eval/summary.md
new file mode 100644
index 0000000..0fb4b1a
--- /dev/null
+++ b/runs/20250910/baseline_eval/summary.md
@@ -0,0 +1,12 @@
+# Baseline Summary
+- Generated: 2025-09-10T12:39:36
+
+## Bias
+- **CTF-gap**: 0.000173 ± 0.000180 (coverage=1.00)
+- **JSD_swap**: 0.058773 ± 0.035938
+- **CrowS ΔlogP** (anti−stereo): 1.335026 ± 2.238039
+- **Wino Acc**: 0.200 ± 0.351
+
+## Main
+- **MATH EM**: 0.200 ± 0.351
+- **PPL**: 30.86
