I encountered this issue while running the simple math example in SDK mode. What's strange is that the same example ran successfully a few days ago without any modifications, but now it throws the error below.
(TaskRunner pid=99857) ERROR:2026-03-30 02:33:06,181:[f4c7d89f-fa83-48c0-a370-324084196b05:9:3] Rollout failed: Traceback (most recent call last):
(TaskRunner pid=99857) File "/workspace/rllm/rllm/engine/agent_sdk_engine.py", line 180, in _execute_with_exception_handling
(TaskRunner pid=99857) output, session_uid = await loop.run_in_executor(self.executor, bound_func)
(TaskRunner pid=99857) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=99857) File "/usr/lib/python3.12/concurrent/futures/thread.py", line 59, in run
(TaskRunner pid=99857) result = self.fn(*self.args, **self.kwargs)
(TaskRunner pid=99857) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=99857) File "/workspace/rllm/rllm/sdk/session/base.py", line 64, in wrapped_sync
(TaskRunner pid=99857) output = agent_func(*args, **kwargs)
(TaskRunner pid=99857) ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=99857) File "/workspace/rllm/examples/sdk/simple_math/train_hendrycks_math.py", line 25, in rollout
(TaskRunner pid=99857) response = client.chat.completions.create(
(TaskRunner pid=99857) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=99857) File "/usr/local/lib/python3.12/dist-packages/openai/_utils/_utils.py", line 286, in wrapper
(TaskRunner pid=99857) return func(*args, **kwargs)
(TaskRunner pid=99857) ^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=99857) File "/usr/local/lib/python3.12/dist-packages/openai/resources/chat/completions/completions.py", line 1211, in create
(TaskRunner pid=99857) return self._post(
(TaskRunner pid=99857) ^^^^^^^^^^^
(TaskRunner pid=99857) File "/usr/local/lib/python3.12/dist-packages/openai/_base_client.py", line 1297, in post
(TaskRunner pid=99857) return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
(TaskRunner pid=99857) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=99857) File "/workspace/rllm/rllm/sdk/chat/openai.py", line 398, in request
(TaskRunner pid=99857) response = OpenAI.request(temp_client, cast_to, options, stream=stream, stream_cls=stream_cls)
(TaskRunner pid=99857) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=99857) File "/usr/local/lib/python3.12/dist-packages/openai/_base_client.py", line 1070, in request
(TaskRunner pid=99857) raise self._make_status_error_from_response(err.response) from None
(TaskRunner pid=99857) openai.RateLimitError: Error code: 429 - {'error': {'message': "No deployments available for selected model, Try again in 5 seconds. Passed model=/home/shl/models/DeepSeek-R1-Distill-Qwen-1.5B. pre-call-checks=False, cooldown_list=[('verl-replica-0', {'exception_received': 'litellm.InternalServerError: InternalServerError: Hosted_vllmException - Connection error.', 'status_code': '500', 'timestamp': 1774863182.0375342, 'cooldown_time': 5}), ('verl-replica-1', {'exception_received': 'litellm.InternalServerError: InternalServerError: Hosted_vllmException - Connection error.', 'status_code': '500', 'timestamp': 1774863182.7071717, 'cooldown_time': 5}), ('verl-replica-2', {'exception_received': 'litellm.InternalServerError: InternalServerError: Hosted_vllmException - Connection error.', 'status_code': '500', 'timestamp': 1774863183.0246875, 'cooldown_time': 5}), ('verl-replica-3', {'exception_received': 'litellm.InternalServerError: InternalServerError: Hosted_vllmException - Connection error.', 'status_code': '500', 'timestamp': 1774863182.2435002, 'cooldown_time': 5}), ('verl-replica-4', {'exception_received': 'litellm.InternalServerError: InternalServerError: Hosted_vllmException - Connection error.', 'status_code': '500', 'timestamp': 1774863181.8245385, 'cooldown_time': 5}), ('verl-replica-5', {'exception_received': 'litellm.InternalServerError: InternalServerError: Hosted_vllmException - Connection error.', 'status_code': '500', 'timestamp': 1774863181.8098967, 'cooldown_time': 5}), ('verl-replica-6', {'exception_received': 'litellm.InternalServerError: InternalServerError: Hosted_vllmException - Connection error.', 'status_code': '500', 'timestamp': 1774863182.1531959, 'cooldown_time': 5}), ('verl-replica-7', {'exception_received': 'litellm.InternalServerError: InternalServerError: Hosted_vllmException - Connection error.', 'status_code': '500', 'timestamp': 1774863183.4203734, 'cooldown_time': 5})]", 'type': 'None', 'param': 'None', 
'code': '429'}}