parameter-golf.git
main
Ubuntu
Age | Commit message | Author
29 hours | Record: 10L Mixed Precision: val_bpb=1.2147 (10 layers + int6 middle layers) ... | Nan Liu
29 hours | Update README.md | Will DePue
30 hours | Update README.md | Will DePue
30 hours | Update README.md | Will DePue
30 hours | Update README.md | Will DePue
30 hours | Update README.md | Will DePue
30 hours | Int6 + MLP 3x + sliding window: val_bpb=1.1574 (#61) | Sam Larson
30 hours | Update README.md | Will DePue
31 hours | Record: Sliding Window + FP16 Embed + 10L + Muon WD + Overtone Init (val_bpb=... | notapplica
31 hours | Update README.md | Will DePue
31 hours | Update README.md | Will DePue
31 hours | Update README.md | Will DePue
31 hours | New SOTA attempt (#52) | spokane-way
31 hours | Update README.md | Will DePue
31 hours | Fix: score final partial window in sliding window eval (#124) | Matthew Li
34 hours | Add record: Sliding Window Eval (stride=64), val_bpb=1.1925 (#50) | Matthew Li
34 hours | Update README.md | Will DePue
34 hours | SOTA attempt (val_bpb=1.2064) (#49) | spokane-way
34 hours | clarify torch version | Alex
34 hours | Update README.md (#105) | Will DePue
34 hours | fp16 tied embedding + lr/warmdown tuning — val_bpb 1.2197 (#42) | Renier Velazco
35 hours | Merge pull request #100 from sandsevenone/mlx_eager_eval | Will DePue
35 hours | Add MLX_EAGER_EVAL flag to further reduce memory pressure by force-evaluating... | sandrone
2 days | Update README.md | Will DePue
2 days | Update train_gpt_mlx.py | Will DePue
2 days | Update train_gpt.py | Will DePue
2 days | Merge pull request #35 from openai/0hq-patch-1 | Will DePue
2 days | Update README.md | Will DePue
2 days | Merge pull request #32 from yhn112/fix-mlx-eval-memory-growth | Will DePue
2 days | Merge pull request #9 from oof-baroomf/patch-1 | Will DePue
2 days | Merge pull request #18 from berniwal/main | Will DePue
2 days | Log MLX validation progress | Michael Diskin
2 days | Fix MLX validation loss accumulation | Michael Diskin
2 days | match timing to main script to exclude eval timing | bernhardwalser
2 days | Update README typo | Dhruv Saini
2 days | Remove scripts | Will DePue
2 days | Update README.md | Will DePue
2 days | Launch snapshot | Will DePue