Bug
Resolution: Duplicate
Major
Affects versions: 4.17.z, 4.16.z, 4.18.z
Description of problem:
OpenShift Lightspeed uses the deprecated max_tokens parameter, which leads to the following error:
[ols.app.endpoints.health:health.py:59] ERROR: LLM connection check failed with - Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
Version-Release number of selected component (if applicable):
OpenShift Lightspeed 1.0
Model: gpt-5-mini
How reproducible:
Steps to Reproduce:
1. Configure OpenShift Lightspeed with a model that rejects max_tokens (e.g. gpt-5-mini).
2. Wait for the LLM connection health check to run.
3. Observe the 400 unsupported_parameter error in the logs.
Actual results:
Requests using the model above fail with 400 errors.
Expected results:
The model above can be used without errors.
Additional info:
- The max_tokens parameter is deprecated; max_completion_tokens should be used instead, per https://platform.openai.com/docs/api-reference/chat/create.
- The AzureOpenAI provider code still uses max_tokens: https://github.com/openshift/lightspeed-service/blob/main/ols/src/llms/providers/azure_openai.py#L74
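A minimal sketch of the rename, assuming a hypothetical helper (normalize_token_param is not part of lightspeed-service; only the model name comes from this report): a provider could map the deprecated key to the new one before building the request.

```python
# Hypothetical helper (not in lightspeed-service): rename the deprecated
# max_tokens key to max_completion_tokens before sending a chat request.
def normalize_token_param(params: dict) -> dict:
    """Return a copy of params with max_tokens renamed to max_completion_tokens."""
    fixed = dict(params)
    if "max_tokens" in fixed and "max_completion_tokens" not in fixed:
        fixed["max_completion_tokens"] = fixed.pop("max_tokens")
    return fixed

# Example: the parameter set that currently triggers the 400 error.
params = {"model": "gpt-5-mini", "max_tokens": 512}
print(normalize_token_param(params))
# → {'model': 'gpt-5-mini', 'max_completion_tokens': 512}
```

Applying such a mapping in the Azure OpenAI provider would keep older models working while satisfying models that only accept max_completion_tokens.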