logs/grpo_reflection_15498033.err



/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/transformers/utils/hub.py:110: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
(APIServer pid=1914169) The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
(APIServer pid=1914169)
Parse safetensors files:   0%|          | 0/9 [00:00<?, ?it/s]
Parse safetensors files:  11%|█         | 1/9 [00:00<00:00,  8.22it/s]
Parse safetensors files: 100%|██████████| 9/9 [00:00<00:00, 49.13it/s]
/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/transformers/utils/hub.py:110: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/transformers/utils/hub.py:110: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/transformers/utils/hub.py:110: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards:   0% Completed | 0/9 [00:00<?, ?it/s]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards:  11% Completed | 1/9 [00:01<00:13,  1.67s/it]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards:  22% Completed | 2/9 [00:03<00:11,  1.71s/it]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards:  33% Completed | 3/9 [00:04<00:08,  1.48s/it]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards:  44% Completed | 4/9 [00:06<00:07,  1.57s/it]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards:  56% Completed | 5/9 [00:08<00:06,  1.63s/it]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards:  67% Completed | 6/9 [00:09<00:04,  1.67s/it]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards:  78% Completed | 7/9 [00:11<00:03,  1.65s/it]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards:  89% Completed | 8/9 [00:12<00:01,  1.31s/it]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards: 100% Completed | 9/9 [00:13<00:00,  1.43s/it]
(Worker_TP0 pid=1914626)
Loading safetensors checkpoint shards: 100% Completed | 9/9 [00:13<00:00,  1.52s/it]
(Worker_TP0 pid=1914626)
(Worker_TP0 pid=1914626)
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   0%|          | 0/51 [00:00<?, ?it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   2%|▏         | 1/51 [00:00<00:19,  2.53it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   4%|▍         | 2/51 [00:00<00:15,  3.14it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   6%|▌         | 3/51 [00:00<00:13,  3.56it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):   8%|▊         | 4/51 [00:01<00:12,  3.82it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  10%|▉         | 5/51 [00:01<00:11,  4.10it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  12%|█▏        | 6/51 [00:01<00:10,  4.27it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  14%|█▎        | 7/51 [00:01<00:10,  4.30it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  16%|█▌        | 8/51 [00:01<00:09,  4.44it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  18%|█▊        | 9/51 [00:02<00:08,  4.67it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  20%|█▉        | 10/51 [00:02<00:08,  4.80it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  22%|██▏       | 11/51 [00:02<00:08,  4.92it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  24%|██▎       | 12/51 [00:02<00:07,  5.03it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  25%|██▌       | 13/51 [00:02<00:07,  5.28it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  27%|██▋       | 14/51 [00:03<00:06,  5.43it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  29%|██▉       | 15/51 [00:03<00:06,  5.56it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  31%|███▏      | 16/51 [00:03<00:06,  5.69it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  33%|███▎      | 17/51 [00:03<00:05,  5.99it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  35%|███▌      | 18/51 [00:03<00:05,  6.10it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  37%|███▋      | 19/51 [00:03<00:05,  6.25it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  39%|███▉      | 20/51 [00:04<00:04,  6.35it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  41%|████      | 21/51 [00:04<00:04,  6.47it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  43%|████▎     | 22/51 [00:04<00:04,  6.53it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  45%|████▌     | 23/51 [00:04<00:04,  6.62it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  47%|████▋     | 24/51 [00:04<00:04,  6.69it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  49%|████▉     | 25/51 [00:04<00:03,  7.02it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  51%|█████     | 26/51 [00:04<00:03,  7.10it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  53%|█████▎    | 27/51 [00:05<00:03,  7.26it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  55%|█████▍    | 28/51 [00:05<00:03,  7.37it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  57%|█████▋    | 29/51 [00:05<00:02,  7.52it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  59%|█████▉    | 30/51 [00:05<00:02,  7.62it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  61%|██████    | 31/51 [00:05<00:02,  7.73it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  63%|██████▎   | 32/51 [00:05<00:02,  7.82it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  65%|██████▍   | 33/51 [00:05<00:02,  8.22it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  67%|██████▋   | 34/51 [00:05<00:02,  8.23it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  69%|██████▊   | 35/51 [00:06<00:01,  8.42it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  71%|███████   | 36/51 [00:06<00:01,  8.57it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  73%|███████▎  | 37/51 [00:06<00:01,  8.67it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  75%|███████▍  | 38/51 [00:06<00:01,  8.84it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  76%|███████▋  | 39/51 [00:06<00:01,  9.01it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  78%|███████▊  | 40/51 [00:06<00:01,  9.02it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  82%|████████▏ | 42/51 [00:06<00:00,  9.99it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  86%|████████▋ | 44/51 [00:06<00:00, 10.51it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  90%|█████████ | 46/51 [00:07<00:00, 10.89it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  94%|█████████▍| 48/51 [00:07<00:00, 11.13it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE):  98%|█████████▊| 50/51 [00:07<00:00, 11.33it/s]
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE): 100%|██████████| 51/51 [00:07<00:00,  6.80it/s]
(Worker_TP0 pid=1914626)
Capturing CUDA graphs (decode, FULL):   0%|          | 0/35 [00:00<?, ?it/s]
Capturing CUDA graphs (decode, FULL):   3%|▎         | 1/35 [00:00<00:06,  4.97it/s]
Capturing CUDA graphs (decode, FULL):   6%|▌         | 2/35 [00:00<00:05,  5.80it/s]
Capturing CUDA graphs (decode, FULL):   9%|▊         | 3/35 [00:00<00:05,  6.25it/s]
Capturing CUDA graphs (decode, FULL):  11%|█▏        | 4/35 [00:00<00:04,  6.49it/s]
Capturing CUDA graphs (decode, FULL):  14%|█▍        | 5/35 [00:00<00:04,  6.70it/s]
Capturing CUDA graphs (decode, FULL):  17%|█▋        | 6/35 [00:00<00:04,  6.84it/s]
Capturing CUDA graphs (decode, FULL):  20%|██        | 7/35 [00:01<00:04,  6.97it/s]
Capturing CUDA graphs (decode, FULL):  23%|██▎       | 8/35 [00:01<00:03,  7.07it/s]
Capturing CUDA graphs (decode, FULL):  26%|██▌       | 9/35 [00:01<00:03,  7.46it/s]
Capturing CUDA graphs (decode, FULL):  29%|██▊       | 10/35 [00:01<00:03,  7.55it/s]
Capturing CUDA graphs (decode, FULL):  31%|███▏      | 11/35 [00:01<00:03,  7.73it/s]
Capturing CUDA graphs (decode, FULL):  34%|███▍      | 12/35 [00:01<00:02,  7.86it/s]
Capturing CUDA graphs (decode, FULL):  37%|███▋      | 13/35 [00:01<00:02,  8.05it/s]
Capturing CUDA graphs (decode, FULL):  40%|████      | 14/35 [00:01<00:02,  8.17it/s]
Capturing CUDA graphs (decode, FULL):  43%|████▎     | 15/35 [00:02<00:02,  8.31it/s]
Capturing CUDA graphs (decode, FULL):  46%|████▌     | 16/35 [00:02<00:02,  8.43it/s]
Capturing CUDA graphs (decode, FULL):  51%|█████▏    | 18/35 [00:02<00:01,  9.01it/s]
Capturing CUDA graphs (decode, FULL):  54%|█████▍    | 19/35 [00:02<00:01,  9.19it/s]
Capturing CUDA graphs (decode, FULL):  57%|█████▋    | 20/35 [00:02<00:01,  9.35it/s]
Capturing CUDA graphs (decode, FULL):  63%|██████▎   | 22/35 [00:02<00:01,  9.70it/s]
Capturing CUDA graphs (decode, FULL):  69%|██████▊   | 24/35 [00:02<00:01, 10.01it/s]
Capturing CUDA graphs (decode, FULL):  74%|███████▍  | 26/35 [00:03<00:00, 10.69it/s]
Capturing CUDA graphs (decode, FULL):  80%|████████  | 28/35 [00:03<00:00, 11.15it/s]
Capturing CUDA graphs (decode, FULL):  86%|████████▌ | 30/35 [00:03<00:00, 11.50it/s]
Capturing CUDA graphs (decode, FULL):  91%|█████████▏| 32/35 [00:03<00:00, 11.79it/s]
Capturing CUDA graphs (decode, FULL):  97%|█████████▋| 34/35 [00:03<00:00, 12.00it/s]
Capturing CUDA graphs (decode, FULL): 100%|██████████| 35/35 [00:03<00:00,  9.12it/s]
(APIServer pid=1914169) INFO:     Started server process [1914169]
(APIServer pid=1914169) INFO:     Waiting for application startup.
(APIServer pid=1914169) INFO:     Application startup complete.
/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/transformers/utils/hub.py:110: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/trl/import_utils.py:91: UserWarning: TRL currently only supports vLLM version `0.10.2`. You have version 0.13.0 installed. We recommend to install this version to avoid compatibility issues.
  warnings.warn(
/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/trl/import_utils.py:91: UserWarning: TRL currently only supports vLLM version `0.10.2`. You have version 0.13.0 installed. We recommend to install this version to avoid compatibility issues.
  warnings.warn(
Traceback (most recent call last):
  File "/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/trl/import_utils.py", line 156, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/u/yurenh2/miniforge3/envs/eval/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/trl/trainer/grpo_trainer.py", line 85, in <module>
    from vllm.sampling_params import GuidedDecodingParams
ImportError: cannot import name 'GuidedDecodingParams' from 'vllm.sampling_params' (/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/vllm/sampling_params.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/projects/bfqt/users/yurenh2/ml-projects/personalization-user-model/collaborativeagents/training/train_grpo.py", line 21, in <module>
    from trl import GRPOConfig, GRPOTrainer
  File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
  File "/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/trl/import_utils.py", line 147, in __getattr__
    value = getattr(module, name)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/trl/import_utils.py", line 146, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/trl/import_utils.py", line 158, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import trl.trainer.grpo_trainer because of the following error (look up to see its traceback):
cannot import name 'GuidedDecodingParams' from 'vllm.sampling_params' (/u/yurenh2/miniforge3/envs/eval/lib/python3.11/site-packages/vllm/sampling_params.py)