| Age | Commit message | Author | |
|---|---|---|---|
| 36 hours | Update README.md (#105) | Will DePue | |
| 36 hours | fp16 tied embedding + lr/warmdown tuning — val_bpb 1.2197 (#42) | Renier Velazco | |
| | keep tok_emb.weight in fp16 during int8 export (kills the quant gap), shrink MLP hidden to 992 to fit under 16MB, bump warmdown to 3600 and matrix LR to 0.06. tested on 8xH100 SXM (2 seeds) and 8xH200 SXM (3 seeds). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> | | |
| 36 hours | Merge pull request #100 from sandsevenone/mlx_eager_eval | Will DePue | |
| | Use eager mx.eval() so the train script runs on 16GB Mac devices | | |
| 37 hours | Add MLX_EAGER_EVAL flag to further reduce memory pressure by force-evaluating the graph after each sub-batch step | sandrone | |
| 2 days | Update README.md | Will DePue | |
| 2 days | Update train_gpt_mlx.py | Will DePue | |
| 2 days | Update train_gpt.py | Will DePue | |
| 2 days | Merge pull request #35 from openai/0hq-patch-1 | Will DePue | |
| | Update README.md | | |
| 2 days | Update README.md | Will DePue | |
| 2 days | Merge pull request #32 from yhn112/fix-mlx-eval-memory-growth | Will DePue | |
| | Fix MLX multi-batch validation memory growth | | |
| 2 days | Merge pull request #9 from oof-baroomf/patch-1 | Will DePue | |
| | Update README typo | | |
| 2 days | Merge pull request #18 from berniwal/main | Will DePue | |
| | MLX Timing Mismatch with Main Script | | |
| 2 days | Log MLX validation progress | Michael Diskin | |
| 2 days | Fix MLX validation loss accumulation | Michael Diskin | |
| 2 days | match timing to main script to exclude eval timing | bernhardwalser | |
| 2 days | Update README typo | Dhruv Saini | |
| 2 days | Remove scripts | Will DePue | |
| 3 days | Update README.md | Will DePue | |
| 3 days | Launch snapshot | Will DePue | |
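The fp16 tied-embedding commit (#42) keeps `tok_emb.weight` in fp16 while the rest of the weights go to int8, closing the quantization gap on the tied output head. A rough sketch of that idea is below; the `quantize_int8` / `export_weights` helpers and the symmetric per-tensor scheme are illustrative assumptions, not the repo's actual exporter.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor int8 quantization (an assumed scheme):
    # scale maps the largest magnitude to 127.
    max_abs = float(np.abs(w).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def export_weights(weights, keep_fp16=("tok_emb.weight",)):
    # int8 for most tensors; fp16 for the tied embedding, whose
    # quantization error would otherwise hit both input and output.
    out = {}
    for name, w in weights.items():
        if name in keep_fp16:
            out[name] = w.astype(np.float16)
        else:
            out[name] = quantize_int8(w)
    return out
```

The per-tensor rounding error of each int8 weight is bounded by half a scale step, while the embedding stays at fp16 precision.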
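The MLX_EAGER_EVAL commit works because MLX builds a lazy graph: forcing evaluation with `mx.eval()` after each sub-batch materializes results early so intermediate buffers can be freed, capping peak memory on 16GB machines. A minimal sketch of that gating pattern follows; `run_sub_batches`, `step_fn`, and `force_eval` are stand-ins (in the real script `force_eval` would be `mx.eval`), not the actual `train_gpt_mlx.py` wiring.

```python
import os

# Opt-in flag from the commit: MLX_EAGER_EVAL=1 forces graph
# evaluation after every sub-batch step.
EAGER_EVAL = os.environ.get("MLX_EAGER_EVAL", "0") == "1"

def run_sub_batches(sub_batches, step_fn, force_eval, eager=None):
    """Run step_fn over sub-batches; when eager, call force_eval
    (a stand-in for mx.eval) on each result so the lazy graph is
    materialized before the next sub-batch instead of at the end."""
    if eager is None:
        eager = EAGER_EVAL
    losses = []
    for sb in sub_batches:
        loss = step_fn(sb)
        if eager:
            force_eval(loss)  # trade a little speed for bounded memory
        losses.append(loss)
    return losses
```

The trade-off is the one the log's timing commits hint at: eager evaluation gives up some graph-fusion opportunity in exchange for a lower, flatter memory profile.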
