| author | blackhao <13851610112@163.com> | 2025-08-23 13:35:13 -0500 |
|---|---|---|
| committer | blackhao <13851610112@163.com> | 2025-08-23 13:35:13 -0500 |
| commit | 4f81a87ef95b190450ed5202bfa725dbb0a539f4 | |
| tree | 875f5966cdaaa526d85ff49a13cd6bf27ab4a723 /Group-Entropy-Equalization/README.md | |
| parent | ad3e216afd066375219ef8b3928ef4096237fbf6 | |
init
Diffstat (limited to 'Group-Entropy-Equalization/README.md')
| -rw-r--r-- | Group-Entropy-Equalization/README.md | 90 |
1 file changed, 90 insertions, 0 deletions
diff --git a/Group-Entropy-Equalization/README.md b/Group-Entropy-Equalization/README.md
new file mode 100644
index 0000000..804af95
--- /dev/null
+++ b/Group-Entropy-Equalization/README.md
@@ -0,0 +1,90 @@

# One-shot Entropy Minimization

[Paper (arXiv:2505.20282)](https://arxiv.org/abs/2505.20282)
[Model (Hugging Face)](https://huggingface.co/zgao3186/qwen25math7b-one-shot-em/)
[Project page (Notion)](https://www.notion.so/One-shot-Entropy-Minimization-202606db813b80639773f850f39246a5)

### Installation

```bash
conda create -n one-shot-em python=3.10 -y
conda activate one-shot-em
pip install -r requirements.txt
```

---

### Reproducing One-shot EM Training (SOTA)

```bash
accelerate launch train.py \
  --model_name Qwen2.5-Math-7B \
  --model_path /path/to/Qwen2.5-Math-7B \
  --train_data dataset/1shot_rlvr/pi1_r1280.parquet \
  --effective_batch 64 \
  --micro_batch_size 2 \
  --temperature 0.5 \
  --learning_rate 2e-5 \
  --max_steps 50 \
  --log_steps 1 \
  --save_steps 1 \
  --run_name one_shot \
  --wandb_project one-shot-em
```

---

### Reproducing Multi-shot EM Training

```bash
accelerate launch train.py \
  --model_name Qwen2.5-Math-7B \
  --model_path /path/to/Qwen2.5-Math-7B \
  --train_data dataset/numina/numina_00.parquet \
  --effective_batch 64 \
  --micro_batch_size 2 \
  --temperature 0.5 \
  --learning_rate 2e-5 \
  --max_steps 50 \
  --log_steps 1 \
  --save_steps 1 \
  --run_name multi_shot \
  --wandb_project one-shot-em
```

---

### Evaluation

```bash
cd Qwen2.5-Eval/evaluation
bash sh/eval_all_math.sh
```

---

### Acknowledgements

Our dataset references and builds upon the following open-source contributions:

- [NuminaMath-CoT](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT)
- [DeepScaler](https://github.com/agentica-project/deepscaler)
- [One-shot RLVR](https://github.com/ypwang61/One-Shot-RLVR/) – for data selection strategies
- [Qwen2.5-Eval](https://github.com/QwenLM/Qwen2.5-Math/) – for evaluation benchmarks

We sincerely thank the authors and maintainers of these projects for their excellent contributions to the research community!

---

### Citation

```
@misc{gao2025oneshotentropyminimization,
  title={One-shot Entropy Minimization},
  author={Zitian Gao and Lynx Chen and Haoming Luo and Joey Zhou and Bryan Dai},
  year={2025},
  eprint={2505.20282},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.20282},
}
```
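---

### Appendix: Entropy-Minimization Objective (Sketch)

For orientation before reading `train.py`, the snippet below is a minimal sketch of the training objective named in the title: the per-token entropy of the model's next-token distribution over its own sampled responses, averaged and minimized directly as the loss. The function name, tensor shapes, and masking convention are illustrative assumptions rather than the repository's actual API; the temperature default simply mirrors the `--temperature 0.5` flag used above.

```python
import torch
import torch.nn.functional as F


def mean_token_entropy(logits: torch.Tensor,
                       response_mask: torch.Tensor,
                       temperature: float = 0.5) -> torch.Tensor:
    """Average per-token entropy of the model's next-token distribution.

    logits:        (batch, seq_len, vocab_size) scores at each generated position
    response_mask: (batch, seq_len), 1 for response tokens, 0 for prompt/padding
    """
    log_probs = F.log_softmax(logits / temperature, dim=-1)
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(dim=-1)               # (batch, seq_len)
    # Average only over generated (response) tokens.
    return (entropy * response_mask).sum() / response_mask.sum().clamp(min=1)


# Hypothetical usage inside a training step (names are assumptions):
#   out = model(input_ids)                                   # e.g. a causal LM
#   loss = mean_token_entropy(out.logits, response_mask)
#   loss.backward()
```

Because the loss depends only on the model's own predictions, it requires no reference answers or reward signal; the commands above differ mainly in whether the training data provides a single prompt (one-shot) or a larger pool of prompts (multi-shot).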