Age | Commit message | Author
21 hours | Update README.md [HEAD] [main] | Alex Zhao
22 hours | Update README.md | Will DePue
22 hours | commit ttt record (#77) | Sam Acquaviva
22 hours | Record: 10L Mixed Precision: val_bpb=1.2147 (10 layers + int6 middle layers) ... | Nan Liu
22 hours | Update README.md | Will DePue
22 hours | Update README.md | Will DePue
22 hours | Update README.md | Will DePue
23 hours | Update README.md | Will DePue
23 hours | Update README.md | Will DePue
23 hours | Int6 + MLP 3x + sliding window: val_bpb=1.1574 (#61) | Sam Larson
23 hours | Update README.md | Will DePue
23 hours | Record: Sliding Window + FP16 Embed + 10L + Muon WD + Overtone Init (val_bpb=... | notapplica
23 hours | Update README.md | Will DePue
23 hours | Update README.md | Will DePue
23 hours | Update README.md | Will DePue
23 hours | New SOTA attempt (#52) | spokane-way
23 hours | Update README.md | Will DePue
23 hours | Fix: score final partial window in sliding window eval (#124) | Matthew Li
27 hours | Add record: Sliding Window Eval (stride=64), val_bpb=1.1925 (#50) | Matthew Li
27 hours | Update README.md | Will DePue
27 hours | SOTA attempt (val_bpb=1.2064) (#49) | spokane-way
27 hours | clarify torch version | Alex
27 hours | Update README.md (#105) | Will DePue
27 hours | fp16 tied embedding + lr/warmdown tuning — val_bpb 1.2197 (#42) | Renier Velazco
27 hours | Merge pull request #100 from sandsevenone/mlx_eager_eval | Will DePue
28 hours | Add MLX_EAGER_EVAL flag to further reduce memory pressure by force-evaluating... | sandrone
44 hours | Update README.md | Will DePue
45 hours | Update train_gpt_mlx.py | Will DePue
45 hours | Update train_gpt.py | Will DePue
45 hours | Merge pull request #35 from openai/0hq-patch-1 | Will DePue
45 hours | Update README.md | Will DePue
45 hours | Merge pull request #32 from yhn112/fix-mlx-eval-memory-growth | Will DePue
45 hours | Merge pull request #9 from oof-baroomf/patch-1 | Will DePue
45 hours | Merge pull request #18 from berniwal/main | Will DePue
45 hours | Log MLX validation progress | Michael Diskin
46 hours | Fix MLX validation loss accumulation | Michael Diskin
47 hours | match timing to main script to exclude eval timing | bernhardwalser
2 days | Update README typo | Dhruv Saini
2 days | Remove scripts | Will DePue
2 days | Update README.md | Will DePue
2 days | Launch snapshot | Will DePue