Bug | Resolution: Done | Normal | RHODS_1.26.0_GA | 2 | False | None | False | Testable | No | 1.28.0 | No | No | Pending | None | ML Serving Sprint 1.28, ML Serving Sprint 1.29
Description of problem:
An existing installation of RHODS had its `odh-model-controller` pods break completely after upgrading to OCP 4.13. The root cause appears to be the clusterrole in use (`manager-role`): it is a generic example name that RHACM also uses, which may have caused the conflict during the RHODS upgrade.
RHODS should use a uniquely named clusterrole to avoid further collisions.
More details: https://redhat-internal.slack.com/archives/C03UGJY6Z1A/p1684747070780669
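For triage, a minimal set of checks (a sketch; it assumes the default RHODS applications namespace `redhat-ods-applications`, adjust if the install uses a different one):

```
# Confirm the symptom: odh-model-controller pods restarting / crash-looping
oc get pods -n redhat-ods-applications | grep odh-model-controller

# Inspect the shared clusterrole; labels/annotations usually reveal which
# operator last applied it
oc get clusterrole manager-role -o yaml | grep -A 10 -E 'labels:|annotations:'

# List the bindings that reference the role, to see which components depend on it
oc get clusterrolebinding -o wide | grep -w manager-role
```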
Prerequisites (if any, like setup, operators/versions):
RHODS + RHACM (or any other project that uses the same clusterrole)
Steps to Reproduce:
Unclear, but an upgrade to RHACM likely overrode the clusterrole that the `odh-model-controller` pods are using.
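One way to check whether another operator overrode the role is to look at the server-side field managers recorded on the object (a sketch; field-manager names vary by operator, so this only supports the theory, it does not prove it):

```
# Show which controllers ("field managers") have written to the clusterrole;
# an RHACM-owned manager appearing alongside the RHODS one would point to a collision
oc get clusterrole manager-role -o yaml --show-managed-fields \
  | grep -E 'manager:|operation:|time:'
```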
Actual results:
`odh-model-controller` pods keep restarting
Expected results:
RHODS components are unaffected by other projects that ship a clusterrole with the same name
Reproducibility (Always/Intermittent/Only Once):
Happened only once that we know of, but it can likely happen again.
Build Details:
Workaround:
Additional info: