model,step,full_exact,full_token_acc,lm_loss,dyn_sample_exact,lambda1_all,mean8_all,tail4_all,pos_count_all TRM baseline best,58590,0.8686309456825256,0.9508475661277772,0.1155559569597244,0.875,0.02823458132615997,0.013457294571722192,0.0075273313675370546,7.841796875 TRM multi4 best,35805,0.8964653611183167,0.9604493975639344,0.0945562794804573,0.900390625,0.020381716455975862,0.0065844104191477015,0.001402141885882835,3.841796875 TRM multi4 final,65100,0.8350536823272705,0.9350366592407228,0.1547555029392242,0.82421875,0.03232463403946895,0.018508151198432188,0.013372940185377047,8.0