<feed xmlns='http://www.w3.org/2005/Atom'>
<title>parameter-golf.git/records/track_10min_16mb/2026-03-19_WarmdownQuantization, branch main</title>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/parameter-golf.git/'/>
<entry>
<title>Int6 + MLP 3x + sliding window: val_bpb=1.1574 (#61)</title>
<updated>2026-03-19T21:28:57+00:00</updated>
<author>
<name>Sam Larson</name>
<email>166414725+saml212@users.noreply.github.com</email>
</author>
<published>2026-03-19T21:28:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/parameter-golf.git/commit/?id=555669e8330472143139c2f82bba15baab1a5e0d'/>
<id>555669e8330472143139c2f82bba15baab1a5e0d</id>
<content type='text'>
* Warmdown-quantization co-optimization, val_bpb=1.2154

Novel finding: aggressive LR decay (WARMDOWN_ITERS=20000) reduces the int8 quantization
penalty from 0.014 to 0.005 BPB. This is combined with FP16 tied embeddings and moderate
NTK-RoPE extrapolation (eval@1408).

The README contains the full warmdown sweep across 10 values and a detailed analysis.

* breakthrough: 1.1574 BPB via int6 + MLP 3x + sliding window stride=256

---------

Co-authored-by: Sam Larson &lt;saml212@users.noreply.github.com&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Warmdown-quantization co-optimization, val_bpb=1.2154

Novel finding: aggressive LR decay (WARMDOWN_ITERS=20000) reduces the int8 quantization
penalty from 0.014 to 0.005 BPB. This is combined with FP16 tied embeddings and moderate
NTK-RoPE extrapolation (eval@1408).

The README contains the full warmdown sweep across 10 values and a detailed analysis.

* breakthrough: 1.1574 BPB via int6 + MLP 3x + sliding window stride=256

---------

Co-authored-by: Sam Larson &lt;saml212@users.noreply.github.com&gt;</pre>
</div>
</content>
</entry>
</feed>
