author Yuren Hao <yurenh2@timan108.cs.illinois.edu> 2025-09-10 12:41:28 -0500
committer Yuren Hao <yurenh2@timan108.cs.illinois.edu> 2025-09-10 12:41:28 -0500
commit 9d5f2379ac25b4b58e2600544f61172dbb15b67a
tree 17a945ad194f50523c9ef25011cc13db22285bce /runs/20250910/baseline_eval/summary.md
parent 5bfd92f6c28530482a765252a4497cfedacad25a
fix ctf
Diffstat (limited to 'runs/20250910/baseline_eval/summary.md')
-rw-r--r-- runs/20250910/baseline_eval/summary.md 12
1 files changed, 12 insertions, 0 deletions
diff --git a/runs/20250910/baseline_eval/summary.md b/runs/20250910/baseline_eval/summary.md
new file mode 100644
index 0000000..0fb4b1a
--- /dev/null
+++ b/runs/20250910/baseline_eval/summary.md
@@ -0,0 +1,12 @@
+# Baseline Summary
+- Generated: 2025-09-10T12:39:36
+
+## Bias
+- **CTF-gap**: 0.000173 ± 0.000180 (coverage=1.00)
+- **JSD_swap**: 0.058773 ± 0.035938
+- **CrowS ΔlogP** (anti−stereo): 1.335026 ± 2.238039
+- **Wino Acc**: 0.200 ± 0.351
+
+## Main
+- **MATH EM**: 0.200 ± 0.351
+- **PPL**: 30.86
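
The metric bullets in this summary share one shape, `- **name**: mean ± std`, with the `± std` part missing for PPL. To compare runs programmatically, something like the following sketch could read them back into numbers. The function name `parse_summary` and the regular expression are illustrative assumptions, not part of the repository; the pattern tolerates a leading `+` so it also works on diff output such as the hunk above.

```python
import re

# Hypothetical helper, not taken from the repository.
# Matches bullets such as "- **CTF-gap**: 0.000173 ± 0.000180 (coverage=1.00)".
# Text between the closing "**" and the colon, e.g. " (anti−stereo)", is skipped.
METRIC_RE = re.compile(
    r"^\+?- \*\*(?P<name>[^*]+)\*\*[^:]*:\s*"
    r"(?P<mean>-?\d+(?:\.\d+)?)"
    r"(?:\s*±\s*(?P<std>\d+(?:\.\d+)?))?"
)


def parse_summary(text):
    """Return {metric name: (mean, std or None)} for each bold metric bullet."""
    metrics = {}
    for line in text.splitlines():
        match = METRIC_RE.match(line.strip())
        if match is None:
            # Headings, blank lines, and the "Generated" bullet are ignored.
            continue
        std = match.group("std")
        metrics[match.group("name")] = (
            float(match.group("mean")),
            float(std) if std is not None else None,
        )
    return metrics
```

Trailing annotations such as `(coverage=1.00)` are deliberately left unparsed here; a fuller reader would need to decide how to represent them.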