| author | Will DePue <williamd@openai.com> | 2026-03-19 14:31:28 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2026-03-19 14:31:28 -0700 |
| commit | cfa5726b25f16f5330fd1d0b5343a5f28a5b6d11 (patch) | |
| tree | 918d222f81e0e44c823cabdff0985ab3be4ae136 | |
| parent | 555669e8330472143139c2f82bba15baab1a5e0d (diff) | |
Update README.md
| -rw-r--r-- | README.md | 1 |
1 files changed, 1 insertions, 0 deletions
@@ -30,6 +30,7 @@ Happy training!
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
+| Int6 quant | 1.1574 | samuellarson | Quantizing to int6 allows scaling to 21.8M params via 3x MLP | 2026-03-19 | [info](records/track_10min_16mb/2026-03-19_WarmdownQuantization/README.md) |
 | Muon WD + 10 layer | 1.1748 | notapplica | Includes prev. wins + Spectral embed init + resid mix | 2026-03-19 | [info](records/track_10min_16mb/2026-03-19_SlidingWindow_FP16Emb_10L_MuonWD_OvertoneInit/README.md) |
 | Sliding Window Eval | 1.1925 | Matthew Li | Sliding window evaluation at stride=64, increasing context for eval | 2026-03-19 | [info](records/track_10min_16mb/2026-03-19_SlidingWindowEval/README.md) |
 | 4k seq length| 1.2014 | Spokane Way | 4k seq length + better hypers | 2026-03-19 | [info](records/track_10min_16mb/2026-03-18_LongContextSeq2048/README.md) |
