-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.15
-
None
-
False
-
Description of problem:
Reported via Slack: a customer's update from 4.15.21 to 4.16.8 wedged with Failing=True and this message:
Could not update customresourcedefinition "clusterserviceversions.operators.coreos.com" (649 of 903): the object is invalid, possibly due to local cluster configuration
The CVO log contained more information:
2024-09-03T14:11:19.672587865Z I0903 14:11:19.672572 1 sync_worker.go:1171] Update error 649 of 903: UpdatePayloadResourceInvalid Could not update customresourcedefinition "clusterserviceversions.operators.coreos.com" (649 of 903): the object is invalid, possibly due to local cluster configuration (*errors.StatusError: CustomResourceDefinition.apiextensions.k8s.io "clusterserviceversions.operators.coreos.com" is invalid: metadata.ownerReferences: Invalid value: []v1.OwnerReference{v1.OwnerReference{APIVersion:"config.openshift.io/v1", Kind:"ClusterServiceVersion", Name:"rhsso-operator.7.6.9-opr-002", UID:"00f0a902-a305-40bd-b277-2de22dca78ba", Controller:(*bool)(0xc1014fb039), BlockOwnerDeletion:(*bool)(nil)}, v1.OwnerReference{APIVersion:"config.openshift.io/v1", Kind:"ClusterVersion", Name:"version", UID:"6412f9f6-7ecf-4bfc-8277-813c9a4ef48d", Controller:(*bool)(0xc1014fb03a), BlockOwnerDeletion:(*bool)(nil)}}: Only one reference can have Controller set to true. Found "true" in references for ClusterServiceVersion/rhsso-operator.7.6.9-opr-002 and ClusterVersion/version)
The culprit was found to be a rogue controller ownerReference on the ClusterServiceVersion CRD:
$ oc --context mg get crd clusterserviceversions.operators.coreos.com -o yaml | yq .metadata.ownerReferences
- apiVersion: config.openshift.io/v1
  controller: true
  kind: ClusterServiceVersion
  name: rhsso-operator.7.6.9-opr-002
  uid: 00f0a902-a305-40bd-b277-2de22dca78ba
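The rejection quoted in the CVO log comes down to one rule: at most one ownerReference on an object may have controller: true. Below is a minimal Python sketch of that check, fed with the two references named in the log. The helper name controller_owners is hypothetical, and this only illustrates the rule; it is not the actual Kubernetes validation code.

```python
def controller_owners(owner_references):
    """Return "Kind/name" for every reference that sets controller: true."""
    return [
        f"{ref['kind']}/{ref['name']}"
        for ref in owner_references
        if ref.get("controller") is True
    ]


# The two references from the CVO log: the rogue CSV reference already on the
# CRD, plus the ClusterVersion reference that CVO tries to add.
refs = [
    {
        "apiVersion": "config.openshift.io/v1",
        "kind": "ClusterServiceVersion",
        "name": "rhsso-operator.7.6.9-opr-002",
        "uid": "00f0a902-a305-40bd-b277-2de22dca78ba",
        "controller": True,
    },
    {
        "apiVersion": "config.openshift.io/v1",
        "kind": "ClusterVersion",
        "name": "version",
        "uid": "6412f9f6-7ecf-4bfc-8277-813c9a4ef48d",
        "controller": True,
    },
]

conflicts = controller_owners(refs)
if len(conflicts) > 1:
    # Mirrors the condition the API server complained about.
    print("More than one controller reference:", ", ".join(conflicts))
```

With both references present the list has two entries, which is the state the API server refuses to accept.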
No matter what put it there, CVO should just stomp it instead of wedging on it.
Version-Release number of selected component (if applicable):
Update from 4.15.21 to 4.16.8 but likely master is affected too
How reproducible:
Haven't tried, likely deterministic
Steps to Reproduce:
1. Manually add an ownerReference with controller: true to a CRD owned by CVO (such as the CSV one). This likely does not need to happen during an update.
2. Eventually CVO should choke on it and report Failing=True.
Actual results:
Could not update customresourcedefinition "clusterserviceversions.operators.coreos.com" (649 of 903): the object is invalid, possibly due to local cluster configuration
Expected results:
CVO overwrites the manual change with whatever is in the payload
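One way to picture the expected behavior is the Python sketch below. It assumes "stomping" means dropping any existing controller reference that conflicts with the one required by the payload. The function name enforce_payload_owner is hypothetical, and this is not the CVO's actual merge logic; keeping non-controller references is a design choice made here for illustration.

```python
def enforce_payload_owner(existing, required):
    """Return owner references in which `required` is the only controller.

    Existing references that claim controller: true, or that duplicate the
    required reference's UID, are dropped; other references are kept.
    """
    kept = [
        ref
        for ref in existing
        if ref["uid"] != required["uid"] and not ref.get("controller")
    ]
    return kept + [required]


# Rogue reference found on the CRD (values taken from the report).
rogue = {
    "apiVersion": "config.openshift.io/v1",
    "kind": "ClusterServiceVersion",
    "name": "rhsso-operator.7.6.9-opr-002",
    "uid": "00f0a902-a305-40bd-b277-2de22dca78ba",
    "controller": True,
}

# Reference CVO wants on the CRD (values taken from the CVO log).
cluster_version = {
    "apiVersion": "config.openshift.io/v1",
    "kind": "ClusterVersion",
    "name": "version",
    "uid": "6412f9f6-7ecf-4bfc-8277-813c9a4ef48d",
    "controller": True,
}

result = enforce_payload_owner([rogue], cluster_version)
print([f"{ref['kind']}/{ref['name']}" for ref in result])
```

After this merge the list holds a single controller reference, so the update would no longer trip the "Only one reference can have Controller set to true" validation.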
- account is impacted by: OCPBUGS-39548 upgrade from 4.15.21 to 4.16.8 hang'd with "the object is invalid, possibly due to local cluster configuration" (Closed)