-
Bug
-
Resolution: Obsolete
-
Normal
Description of problem:
A reconciliation error was encountered during the RHODS 1.33 to 2.4 upgrade in two clusters (a disconnected cluster and the PSI QE cluster). The operator pod logs show:
2023-11-23T09:45:37Z ERROR Reconciler error {"controller": "datasciencecluster", "controllerGroup": "datasciencecluster.opendatahub.io", "controllerKind": "DataScienceCluster", "DataScienceCluster": {"name":"default-dsc"}, "namespace": "", "name": "default-dsc", "reconcileID": "0c1a32ca-7ffd-4310-8259-f6baabf3c868", "error": "1 error occurred:\n\t* Deployment.apps \"rhods-prometheus-operator\" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{\"app.kubernetes.io/part-of\":\"model-mesh\", \"app.opendatahub.io/model-mesh\":\"true\", \"k8s-app\":\"rhods-prometheus-operator\"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable\n\n"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/remote-source/operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/remote-source/operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/remote-source/operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235
2023-11-23T09:45:37Z DEBUG events DataScienceCluster instance default-dsc created, but have some failures in component 1 error occurred:
	* Deployment.apps "rhods-prometheus-operator" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/part-of":"model-mesh", "app.opendatahub.io/model-mesh":"true", "k8s-app":"rhods-prometheus-operator"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
Attempts to reproduce the issue in three other clusters have been unsuccessful, and we do not currently know what triggered it in the two affected environments.
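For anyone triaging a recurrence: the API server rejects any update that changes spec.selector on an existing apps/v1 Deployment, and the "Invalid value" in the message is the selector the operator tried to apply. The following is a minimal sketch, not operator code, of how the live and desired selectors could be compared to see which labels differ. The `desired` labels are copied from the error message; the `live` labels are purely hypothetical, since the selector on the pre-upgrade Deployment was not captured in this ticket.

```python
def selector_drift(live, desired):
    """Return {label: (live_value, desired_value)} for every matchLabels
    entry that differs between the two selectors."""
    keys = sorted(set(live) | set(desired))
    return {
        key: (live.get(key), desired.get(key))
        for key in keys
        if live.get(key) != desired.get(key)
    }


# Selector the operator tried to apply, copied from the error message.
desired = {
    "app.kubernetes.io/part-of": "model-mesh",
    "app.opendatahub.io/model-mesh": "true",
    "k8s-app": "rhods-prometheus-operator",
}

# HYPOTHETICAL selector of the existing Deployment (not taken from the ticket).
live = {"k8s-app": "rhods-prometheus-operator"}

drift = selector_drift(live, desired)
for key, (old, new) in drift.items():
    print(f"{key}: live={old!r} desired={new!r}")
```

Any non-empty result means an in-place update would be rejected with "field is immutable"; the Deployment would have to be deleted and recreated for the new selector to take effect.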
Prerequisites (if any, like setup, operators/versions):
Upgrading from RHODS 1.33 to RHODS 2.x
Steps to Reproduce
Unknown
Actual results:
Reconciliation error appears in the operator pod logs / DSC conditions and the rhods-prometheus-operator deployment is not upgraded correctly.
Expected results:
No reconciliation error, deployment is upgraded correctly.
Reproducibility (Always/Intermittent/Only Once):
Twice (specific clusters). Further attempts to reproduce have been unsuccessful.
Build Details:
RHODS 1.33 / RHODS 2.4 RC3
Workaround:
If the issue is encountered, disabling and then re-enabling the modelmesh component fixed it in the disconnected cluster; however, it is not clear why this worked.
In the PSI QE cluster, we confirmed that restarting the rhods operator pod fixed the issue.
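The disable/re-enable step can be sketched as two JSON merge patches against the DataScienceCluster. This is an illustration under assumptions, not a verified procedure: it assumes the component is exposed as `spec.components.modelmeshserving` and toggled through `managementState` (`Removed` / `Managed`), and the helper name `management_state_patch` is made up for this example.

```python
import json


def management_state_patch(component, state):
    """Build a JSON merge patch that sets one component's managementState.

    ASSUMPTION: the DSC exposes components under spec.components.<name>
    with a managementState field accepting "Managed" or "Removed".
    """
    if state not in ("Managed", "Removed"):
        raise ValueError(f"unexpected managementState: {state!r}")
    patch = {"spec": {"components": {component: {"managementState": state}}}}
    return json.dumps(patch)


# Disable, then re-enable, the (assumed) modelmeshserving component.
disable = management_state_patch("modelmeshserving", "Removed")
enable = management_state_patch("modelmeshserving", "Managed")

print(disable)
print(enable)
```

Each printed document could then be applied to default-dsc with something like `oc patch datasciencecluster default-dsc --type merge -p '<patch>'`, waiting for the component to be removed before applying the second patch.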
Additional info:
We have been unable to reproduce the issue again, so I'm setting this Jira to normal priority. If it were reproducible, it would be a blocker.