| author | Will DePue <williamd@openai.com> | 2026-03-19 15:32:27 -0700 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2026-03-19 15:32:27 -0700 |
| commit | 5e29bfd388b5416dee31c1d5079eebf4ee5c310d | |
| tree | be5983a9e35b1bd49b07ada0e144c7802574ca18 | |
| parent | bd2463aaa21fec47f643c593d86d4dd385d474e9 | |
Update README.md
| -rw-r--r-- | README.md | 1 |
1 files changed, 1 insertions, 0 deletions
@@ -32,6 +32,7 @@ Happy training!
 |-----|------:|--------|---------|------|------|
 | Muon WD + 10 layer | 1.1748 | notapplica | Includes prev. wins + Spectral embed init + resid mix | 2026-03-19 | [info](records/track_10min_16mb/2026-03-19_SlidingWindow_FP16Emb_10L_MuonWD_OvertoneInit/README.md) |
 | Sliding Window Eval | 1.1925 | Matthew Li | Sliding window evaluation at stride=64, increasing context for eval | 2026-03-19 | [info](records/track_10min_16mb/2026-03-19_SlidingWindowEval/README.md) |
+| Lora TTT | 1.1928 | samacqua | Test-time training with LORAs | 2026-03-19 | [info](records/track_10min_16mb/2026-03-17_LoRA_TTT/README.md) |
 | 4k seq length| 1.2014 | Spokane Way | 4k seq length + better hypers | 2026-03-19 | [info](records/track_10min_16mb/2026-03-18_LongContextSeq2048/README.md) |
 | 2048 seq length | 1.206 | Spokane Way | 2048 seq length (train + val) | 2026-03-18 | [info](records/track_10min_16mb/2026-03-18_LongContextSeq2048/README.md) |
 | int6 mixed precision | 1.2147 | Nan Liu | 10 layers, mixed int8/int6 | 2026-03-18 | [info](records/track_10min_16mb/2026-03-19_10L_MixedPrecision/README.md) |
