-
Task
-
Resolution: Unresolved
-
Major
Task Description (Required)
In RHDHPAI-513, we recently upgraded the Ollama image used on our team cluster to v0.5.5 (https://github.com/redhat-ai-dev/ollama-ubi/pull/7). However, it responds to requests much more slowly than previous versions, slowly enough to suggest that GPU acceleration may not be in use.
We should investigate the cause of the slowdown and try to fix it. For the time being, the instance on the team cluster has been downgraded to v0.4.7.
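One possible starting point for the investigation is to compare decode throughput and GPU offload between v0.4.7 and v0.5.5. The sketch below is only an illustration, not an agreed procedure. It assumes the Ollama HTTP API is reachable at a base URL we supply, and it relies on the `eval_count` and `eval_duration` fields (nanoseconds) of a non-streaming `/api/generate` response and the `size` and `size_vram` fields reported by `/api/ps`. The base URL, model name, and prompt are hypothetical placeholders.

```python
import json
import urllib.request
from typing import Optional

# Assumptions: default Ollama port and a placeholder model name.
# Replace both with the values used on the team cluster.
OLLAMA_URL = "http://localhost:11434"
MODEL = "llama3.2:latest"


def tokens_per_second(generate_response: dict) -> float:
    """Decode throughput from a non-streaming /api/generate response.

    eval_duration is reported in nanoseconds.
    """
    eval_count = generate_response["eval_count"]
    eval_duration_ns = generate_response["eval_duration"]
    if eval_duration_ns <= 0:
        return 0.0
    return eval_count / (eval_duration_ns / 1e9)


def vram_fraction(ps_response: dict, model: str) -> Optional[float]:
    """Fraction of a loaded model resident in GPU memory, per /api/ps.

    Returns None if the model is not loaded or reports no size.
    A value near 0.0 suggests the model is running on CPU.
    """
    for entry in ps_response.get("models", []):
        if entry.get("name") == model:
            size = entry.get("size", 0)
            return entry.get("size_vram", 0) / size if size else None
    return None


def _request_json(url: str, payload: Optional[dict] = None) -> dict:
    """GET when payload is None, otherwise POST the payload as JSON."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)


def measure(base_url: str = OLLAMA_URL, model: str = MODEL) -> dict:
    """Run one short generation and report throughput and GPU offload."""
    gen = _request_json(
        f"{base_url}/api/generate",
        {"model": model, "prompt": "Say hello.", "stream": False},
    )
    ps = _request_json(f"{base_url}/api/ps")
    return {
        "tokens_per_second": tokens_per_second(gen),
        "vram_fraction": vram_fraction(ps, model),
    }
```

Calling `measure()` against each image version with the same model and prompt would give two comparable data points. A large drop in tokens per second together with a `vram_fraction` near zero on v0.5.5 would be consistent with the model running on CPU, though the container logs would still need to be checked to confirm why.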