Age | Commit message | Author
30 hours | Update README.md | Will DePue
30 hours | Update README.md | Will DePue
30 hours | Update README.md | Will DePue
30 hours | Update README.md | Will DePue
30 hours | Int6 + MLP 3x + sliding window: val_bpb=1.1574 (#61) | Sam Larson
30 hours | Update README.md | Will DePue
31 hours | Record: Sliding Window + FP16 Embed + 10L + Muon WD + Overtone Init (val_bpb=... | notapplica
31 hours | Update README.md | Will DePue
31 hours | Update README.md | Will DePue
31 hours | Update README.md | Will DePue
31 hours | New SOTA attempt (#52) | spokane-way
31 hours | Update README.md | Will DePue
31 hours | Fix: score final partial window in sliding window eval (#124) | Matthew Li
34 hours | Add record: Sliding Window Eval (stride=64), val_bpb=1.1925 (#50) | Matthew Li
34 hours | Update README.md | Will DePue
34 hours | SOTA attempt (val_bpb=1.2064) (#49) | spokane-way
34 hours | clarify torch version | Alex
34 hours | Update README.md (#105) | Will DePue
34 hours | fp16 tied embedding + lr/warmdown tuning — val_bpb 1.2197 (#42) | Renier Velazco
35 hours | Merge pull request #100 from sandsevenone/mlx_eager_eval | Will DePue
35 hours | Add MLX_EAGER_EVAL flag to further reduce memory pressure by force-evaluating... | sandrone
2 days | Update README.md | Will DePue
2 days | Update train_gpt_mlx.py | Will DePue
2 days | Update train_gpt.py | Will DePue
2 days | Merge pull request #35 from openai/0hq-patch-1 | Will DePue
2 days | Update README.md | Will DePue
2 days | Merge pull request #32 from yhn112/fix-mlx-eval-memory-growth | Will DePue
2 days | Merge pull request #9 from oof-baroomf/patch-1 | Will DePue
2 days | Merge pull request #18 from berniwal/main | Will DePue
2 days | Log MLX validation progress | Michael Diskin
2 days | Fix MLX validation loss accumulation | Michael Diskin
2 days | match timing to main script to exclude eval timing | bernhardwalser
2 days | Update README typo | Dhruv Saini
2 days | Remove scripts | Will DePue
2 days | Update README.md | Will DePue
2 days | Launch snapshot | Will DePue