author Yuren Hao <yurenh2@illinois.edu> 2026-04-08 22:11:29 -0500
committer Yuren Hao <yurenh2@illinois.edu> 2026-04-08 22:11:29 -0500
commit 5eb005d3724e2470f476995c3ebbcb67e8b04ca5
tree 76590ab46200c74210fff3d5e5b5fa158e408e05
parent 185404c245b360df5bc8398f9331c481c16f01f8
README badges: add PutnamGAP mirror + GAP framework cross-link badges
Diffstat (limited to 'README.md')
-rw-r--r-- README.md | 2
1 files changed, 2 insertions, 0 deletions
diff --git a/README.md b/README.md
index a64a3f1..d29250c 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,8 @@
[![arXiv](https://img.shields.io/badge/arXiv-2508.08833-b31b1b.svg)](https://arxiv.org/abs/2508.08833)
[![Hugging Face](https://img.shields.io/badge/🤗_Dataset-PutnamGAP-yellow.svg)](https://huggingface.co/datasets/blackhao0426/PutnamGAP)
+[![Dataset Mirror](https://img.shields.io/badge/GitHub-PutnamGAP_mirror-blue?logo=github)](https://github.com/YurenHao0426/PutnamGAP)
+[![GAP Code](https://img.shields.io/badge/GitHub-GAP_framework-181717?logo=github)](https://github.com/YurenHao0426/GAP)
[![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)
**GAP** (*Generalization-and-Perturbation*) is an automatable evaluation framework for stress-testing the **robustness of LLM mathematical reasoning** under semantically equivalent transformations of advanced math problems. It partitions equivalence-preserving transformations into two qualitatively different families — **surface renaming** and **kernel parameter resampling** — and provides paired-evaluation, mechanism-sensitive analyses that prior perturbation benchmarks cannot.