2025-12-26 02:38:46,860 - INFO - Loaded dataset: gpqa
2025-12-26 02:38:46,861 - INFO - Loaded dataset: aime
2025-12-26 02:38:46,861 - INFO - Loaded dataset: math-hard
2025-12-26 02:38:46,861 - INFO - Loaded dataset: humaneval
2025-12-26 02:38:46,872 - INFO - Loaded 100 profiles from ../data/complex_profiles_v2/profiles_100.jsonl
2025-12-26 02:38:46,872 - INFO - Running method: vanilla
`torch_dtype` is deprecated! Use `dtype` instead!
Loading checkpoint shards: 100%|██████████| 4/4 [00:21<00:00,  5.37s/it]
Loading checkpoint shards: 100%|██████████| 5/5 [00:15<00:00,  3.07s/it]
2025-12-26 02:39:34,147 - INFO -   Profile 1/30
Generating train split: 100%|██████████| 90/90 [00:00<00:00, 1116.85 examples/s]
/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/awq/__init__.py:21: DeprecationWarning: 
I have left this message as the final dev message to help you transition.

Important Notice:
- AutoAWQ is officially deprecated and will no longer be maintained.
- The last tested configuration used Torch 2.6.0 and Transformers 4.51.3.
- If future versions of Transformers break AutoAWQ compatibility, please report the issue to the Transformers project.

Alternative:
- AutoAWQ has been adopted by the vLLM Project: https://github.com/vllm-project/llm-compressor

For further inquiries, feel free to reach out:
- X: https://x.com/casper_hansen_
- LinkedIn: https://www.linkedin.com/in/casper-hansen-804005170/

  warnings.warn(_FINAL_DEV_MESSAGE, category=DeprecationWarning, stacklevel=1)
Loading checkpoint shards: 100%|██████████| 9/9 [01:10<00:00,  7.83s/it]
2025-12-26 02:40:51,255 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,261 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,266 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,271 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,276 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,281 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,286 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,291 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,296 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,301 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,306 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,310 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,315 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,320 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,326 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,330 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,335 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,341 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,345 - WARNING - User agent failed to respond at turn 0
2025-12-26 02:40:51,350 - WARNING - User agent failed to respond at turn 0
Generating train split: 2304 examples [00:00, 29493.27 examples/s]
Generating test split: 1324 examples [00:00, 28145.98 examples/s]
Traceback (most recent call last):
  File "/projects/bfqt/users/yurenh2/ml-projects/personalization-user-model/collaborativeagents/scripts/run_experiments.py", line 623, in <module>
    main()
  File "/projects/bfqt/users/yurenh2/ml-projects/personalization-user-model/collaborativeagents/scripts/run_experiments.py", line 608, in main
    analysis = runner.run_all()
               ^^^^^^^^^^^^^^^^
  File "/projects/bfqt/users/yurenh2/ml-projects/personalization-user-model/collaborativeagents/scripts/run_experiments.py", line 414, in run_all
    results = self.run_method(method)
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/projects/bfqt/users/yurenh2/ml-projects/personalization-user-model/collaborativeagents/scripts/run_experiments.py", line 367, in run_method
    samples = dataset.get_testset()
              ^^^^^^^^^^^^^^^^^^^^^
  File "/projects/bfqt/users/yurenh2/ml-projects/personalization-user-model/collaborativeagents/datasets_extended.py", line 71, in get_testset
    self._test_data = self._load_data("test")[:self.eval_size]
                      ^^^^^^^^^^^^^^^^^^^^^^^
  File "/projects/bfqt/users/yurenh2/ml-projects/personalization-user-model/collaborativeagents/datasets_extended.py", line 153, in _load_data
    solution=item["answer"],
             ~~~~^^^^^^^^^^
KeyError: 'answer'