author    blackhao <13851610112@163.com>    2025-08-23 13:35:13 -0500
committer blackhao <13851610112@163.com>    2025-08-23 13:35:13 -0500
commit    4f81a87ef95b190450ed5202bfa725dbb0a539f4 (patch)
tree      875f5966cdaaa526d85ff49a13cd6bf27ab4a723 /Group-Entropy-Equalization/README.md
parent    ad3e216afd066375219ef8b3928ef4096237fbf6 (diff)
init
Diffstat (limited to 'Group-Entropy-Equalization/README.md')
-rw-r--r--    Group-Entropy-Equalization/README.md    90
1 file changed, 90 insertions, 0 deletions
diff --git a/Group-Entropy-Equalization/README.md b/Group-Entropy-Equalization/README.md
new file mode 100644
index 0000000..804af95
--- /dev/null
+++ b/Group-Entropy-Equalization/README.md
@@ -0,0 +1,90 @@
+# One-shot Entropy Minimization
+
+[![paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2505.20282)
+[![Model](https://img.shields.io/badge/Models/Dataset-fcd022?style=for-the-badge&logo=huggingface&logoColor=000)](https://huggingface.co/zgao3186/qwen25math7b-one-shot-em/)
+[![Notion](https://img.shields.io/badge/Site-000000.svg?style=for-the-badge&logo=notion&logoColor=white)](https://www.notion.so/One-shot-Entropy-Minimization-202606db813b80639773f850f39246a5)
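+
+Code accompanying the paper *One-shot Entropy Minimization* (arXiv:2505.20282): post-training a reasoning LLM by minimizing the entropy of its own sampled outputs, using as little as a single unlabeled prompt. As a rough sketch (see the paper for the exact formulation), the objective is the average token-level entropy of the model's next-token distribution along its own generations:
+
+```math
+\mathcal{L}_{\mathrm{EM}}(\theta) = \mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}\left[\frac{1}{|y|}\sum_{t=1}^{|y|} H\big(\pi_\theta(\cdot \mid x, y_{<t})\big)\right], \qquad H(p) = -\sum_{v} p(v)\log p(v)
+```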
+
+### Installation
+
+```bash
+conda create -n one-shot-em python=3.10 -y
+conda activate one-shot-em
+pip install -r requirements.txt
+```
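+
+If the install succeeded, a quick sanity check (assuming PyTorch is pulled in by `requirements.txt`) confirms the environment sees your GPU:
+
+```bash
+# Should print the torch version and True on a CUDA machine.
+python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
+```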
+
+---
+
+### Reproducing One-shot EM Training (SOTA)
+
+```bash
+accelerate launch train.py \
+ --model_name Qwen2.5-Math-7B \
+ --model_path /path/to/Qwen2.5-Math-7B \
+ --train_data dataset/1shot_rlvr/pi1_r1280.parquet \
+ --effective_batch 64 \
+ --micro_batch_size 2 \
+ --temperature 0.5 \
+ --learning_rate 2e-5 \
+ --max_steps 50 \
+ --log_steps 1 \
+ --save_steps 1 \
+ --run_name one_shot \
+ --wandb_project one-shot-em
+```
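+
+`accelerate launch` reads the process topology from your saved `accelerate config`. If you have not run that yet, the standard launcher flags can be passed inline (they go to `accelerate`, not to `train.py`); for example, a hypothetical 8-GPU bf16 launch:
+
+```bash
+# --num_processes / --mixed_precision are accelerate-launch flags;
+# every train.py flag stays exactly as in the command above.
+accelerate launch --num_processes 8 --mixed_precision bf16 train.py \
+    --model_name Qwen2.5-Math-7B \
+    --model_path /path/to/Qwen2.5-Math-7B \
+    --train_data dataset/1shot_rlvr/pi1_r1280.parquet \
+    --effective_batch 64 \
+    --micro_batch_size 2 \
+    --temperature 0.5 \
+    --learning_rate 2e-5 \
+    --max_steps 50 \
+    --log_steps 1 \
+    --save_steps 1 \
+    --run_name one_shot \
+    --wandb_project one-shot-em
+```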
+
+---
+
+### Reproducing Multi-shot EM Training
+
+```bash
+accelerate launch train.py \
+ --model_name Qwen2.5-Math-7B \
+ --model_path /path/to/Qwen2.5-Math-7B \
+ --train_data dataset/numina/numina_00.parquet \
+ --effective_batch 64 \
+ --micro_batch_size 2 \
+ --temperature 0.5 \
+ --learning_rate 2e-5 \
+ --max_steps 50 \
+ --log_steps 1 \
+ --save_steps 1 \
+ --run_name multi_shot \
+ --wandb_project one-shot-em
+```
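+
+Relative to the one-shot command, only `--train_data` and `--run_name` change. To peek at what a training parquet contains before launching (this assumes `pandas` and `pyarrow` are available in the environment):
+
+```bash
+python - <<'EOF'
+import pandas as pd
+
+# Inspect the shard used above: row count, schema, and one sample record.
+df = pd.read_parquet("dataset/numina/numina_00.parquet")
+print(df.shape)
+print(df.columns.tolist())
+print(df.head(1))
+EOF
+```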
+
+---
+
+### Evaluation
+
+```bash
+cd Qwen2.5-Eval/evaluation
+bash sh/eval_all_math.sh
+```
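+
+The harness is adapted from Qwen2.5-Math's evaluation suite. To evaluate a checkpoint trained above, point the script at your model path; the variable name below is hypothetical, so check `sh/eval_all_math.sh` for the one it actually uses:
+
+```bash
+# Hypothetical invocation: pin one GPU and point the script at a checkpoint.
+# MODEL_NAME_OR_PATH is an assumed variable name -- verify it in sh/eval_all_math.sh.
+cd Qwen2.5-Eval/evaluation
+CUDA_VISIBLE_DEVICES=0 MODEL_NAME_OR_PATH=/path/to/your/checkpoint bash sh/eval_all_math.sh
+```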
+
+---
+
+### Acknowledgements
+
+Our data and evaluation setup reference and build upon the following open-source contributions:
+
+- [NuminaMath-CoT](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT)
+- [DeepScaler](https://github.com/agentica-project/deepscaler)
+- [One-shot RLVR](https://github.com/ypwang61/One-Shot-RLVR/) – for data selection strategies
+- [Qwen2.5-Eval](https://github.com/QwenLM/Qwen2.5-Math/) – for evaluation benchmarks
+
+We sincerely thank the authors and maintainers of these projects for their excellent contributions to the research community!
+
+
+---
+
+### Citation
+```bibtex
+@misc{gao2025oneshotentropyminimization,
+ title={One-shot Entropy Minimization},
+ author={Zitian Gao and Lynx Chen and Haoming Luo and Joey Zhou and Bryan Dai},
+ year={2025},
+ eprint={2505.20282},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2505.20282},
+}
+```