path: root/records
Age | Commit message | Author
2026-03-19 | New SOTA attempt (#52) | spokane-way
Co-authored-by: spokane-way <spokane@way>
2026-03-19 | Fix: score final partial window in sliding window eval (#124) | Matthew Li
The window_starts filter dropped windows shorter than stride, silently skipping up to (stride-1) tokens at the end of the validation set. Now includes all windows with >= 1 scoreable token, and clamps the score start for short final windows.
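The fix described above can be illustrated with a minimal sketch. This is not the repository's code: the function name `sliding_windows`, its parameters, and the `min_scoreable` knob are hypothetical, and it assumes that each window scores only the tokens no earlier window has scored. Setting `min_scoreable=stride` mimics a filter that drops short final windows; the default of 1 keeps every window that has at least one scoreable token.

```python
def sliding_windows(n_tokens, seq_len, stride, min_scoreable=1):
    """Return (win_start, win_end, score_start, score_end) tuples.

    All indices are absolute token positions with exclusive ends.
    Hypothetical helper for illustration, not the repository's function.
    """
    if not 0 < stride <= seq_len:
        raise ValueError("expected 0 < stride <= seq_len")
    spans = []
    scored_upto = 0  # tokens [0, scored_upto) are already scored
    for win_start in range(0, n_tokens, stride):
        win_end = min(win_start + seq_len, n_tokens)
        # Clamp the score start so it never falls before the window
        # and never re-scores tokens covered by an earlier window.
        score_start = max(scored_upto, win_start)
        if win_end - score_start < min_scoreable:
            break  # too few new tokens; later windows add none either
        spans.append((win_start, win_end, score_start, win_end))
        scored_upto = win_end
    return spans


def count_scored(spans):
    """Total number of scored tokens across all windows."""
    return sum(end - start for _, _, start, end in spans)
```

With 11 tokens, `seq_len=4`, and `stride=2`, the default call scores all 11 tokens, while `min_scoreable=2` stops after 10 and silently skips the last one, which is the kind of tail loss the commit message describes.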
2026-03-19 | Add record: Sliding Window Eval (stride=64), val_bpb=1.1925 (#50) | Matthew Li
2026-03-19 | SOTA attempt (val_bpb=1.2064) (#49) | spokane-way
* SOTA attempt
* Improve score on SXM

Co-authored-by: spokane-way <spokane@way>
2026-03-19 | fp16 tied embedding + lr/warmdown tuning - val_bpb 1.2197 (#42) | Renier Velazco
keep tok_emb.weight in fp16 during int8 export (kills the quant gap), shrink MLP hidden to 992 to fit under 16MB, bump warmdown to 3600 and matrix LR to 0.06. tested on 8xH100 SXM (2 seeds) and 8xH200 SXM (3 seeds).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
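The export idea in this commit, quantizing most tensors to int8 while leaving `tok_emb.weight` in fp16, might look roughly like the following dependency-free sketch. It assumes symmetric per-tensor scaling and weights stored as flat lists of floats. The helper names `quantize_int8` and `export_weights` are hypothetical, and the repository presumably works on framework tensors instead.

```python
import struct

# Tensor names exempt from int8 quantization (name from the commit message).
KEEP_FP16 = {"tok_emb.weight"}


def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, quantized


def export_weights(state):
    """Map each tensor name to a (kind, scale, payload_bytes) tuple.

    `state` maps names to flat lists of floats (an assumption of this sketch).
    """
    out = {}
    for name, values in state.items():
        if name in KEEP_FP16:
            # IEEE half precision: 2 bytes per value, no scale needed.
            payload = struct.pack(f"<{len(values)}e", *values)
            out[name] = ("fp16", 1.0, payload)
        else:
            scale, quantized = quantize_int8(values)
            # Signed 8-bit integers: 1 byte per value.
            payload = struct.pack(f"<{len(quantized)}b", *quantized)
            out[name] = ("int8", scale, payload)
    return out
```

The exempt tensor costs two bytes per value instead of one, which is presumably why the same commit shrinks the MLP hidden size to stay under the 16MB limit.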
2026-03-18 | Launch snapshot | Will DePue