| field | value | date |
|---|---|---|
| author | blackhao <13851610112@163.com> | 2025-08-23 14:17:47 -0500 |
| committer | blackhao <13851610112@163.com> | 2025-08-23 14:17:47 -0500 |
| commit | 8b2cf4b226de17227aa95abb637d1410bc2e57e3 | |
| tree | 6301c78a7c98b590f4a14487e4547f561389075d /Group-Entropy-Equalization/README.md | |
| parent | f21f7dd85365b10505bbd1cfa28f6a8648ba1b7e | |
feat(gee): add GEE objective and flags; add groups/gender.json; docs for Colab and GEE
Diffstat (limited to 'Group-Entropy-Equalization/README.md')
| -rw-r--r-- | Group-Entropy-Equalization/README.md | 29 |
1 files changed, 29 insertions, 0 deletions
diff --git a/Group-Entropy-Equalization/README.md b/Group-Entropy-Equalization/README.md
index 33bd020..0ea2010 100644
--- a/Group-Entropy-Equalization/README.md
+++ b/Group-Entropy-Equalization/README.md
@@ -57,6 +57,35 @@ Checkpoints are saved under `checkpoints/<model>/<run_name>/`.
 
 ---
 
+### Group-wise Entropy Equalization (GEE)
+
+GEE balances sensitive groups by:
+- Group mass parity (push group probability mass toward target pi)
+- Group entropy equalization (normalize and equalize per-group entropy)
+- Optional anchors to keep global token-entropy and sensitive-union mass close to baseline
+
+Default groups file: `groups/gender.json`.
+
+Run on Colab (example):
+
+```bash
+!python train.py \
+  --model_name Qwen2.5-1.5B \
+  --model_path Qwen/Qwen2.5-1.5B \
+  --train_data dataset/1shot_rlvr/pi1_r1280.parquet \
+  --effective_batch 4 --micro_batch_size 1 \
+  --temperature 0.5 --learning_rate 2e-5 --sample_temp 0.5 \
+  --max_steps 15 --log_steps 1 --save_steps 5 \
+  --run_name colab_gee15 --wandb_project one-shot-em \
+  --no_deepspeed --mixed_precision no \
+  --gee_enable --gee_groups_path groups/gender.json \
+  --gee_alpha 1.0 --gee_beta 0.3 --gee_lambda 0.0 --gee_gamma 0.0 --gee_tau 1e-3 --gee_top_m 50
+```
+
+You can customize groups and target proportions in the JSON.
+
+---
+
 ### Reproducing One-shot EM Training (SOTA)
 
 ```bash
