-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.21
-
Quality / Stability / Reliability
-
False
-
-
None
-
Low
Description of problem
While checking oc adm inspect clusteroperator output in CI, I noticed that many components install ClusterRoles and ClusterRoleBindings via release image manifests, but do not list them in their ClusterOperator's relatedObjects. The network ClusterOperator is one of these, as seen in https://amd64.ocp.releases.ci.openshift.org/ > 4-dev-preview > 4.21.0-ec.2 > aws-ovn-serial-1of2 > Artifacts > inspected ClusterOperators:
$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-serial-1of2/1980906989932253184/artifacts/e2e-aws-ovn-serial/gather-extra/artifacts/inspect/cluster-scoped-resources/config.openshift.io/clusteroperators/network.yaml | yaml2json | jq -c '.status.relatedObjects[]' | grep clusterrolebindings | sort
{"group":"rbac.authorization.k8s.io","name":"cloud-network-config-controller","resource":"clusterrolebindings"}
{"group":"rbac.authorization.k8s.io","name":"metrics-daemon-sa-rolebinding","resource":"clusterrolebindings"}
...
relatedObjects does not list a cluster-network-operator ClusterRoleBinding, even though the network operator's manifests request one. To make it easier to gather the resources relevant to the component, the ClusterOperator's relatedObjects should be expanded to reference that ClusterRoleBinding, and any other resources that might be relevant to debugging the component, as described in the ClusterOperator docs. Note that some inspect lookup is implicit as part of a namespace reference, but that will not pick up cluster-scoped resources such as ClusterRole.
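The check above can be scripted. The following is a minimal sketch, not part of any existing tooling: has_related_crb is a hypothetical helper name, and it assumes its input is one compact JSON relatedObjects entry per line, as produced by the jq -c '.status.relatedObjects[]' step in the command above.

```shell
# Hypothetical helper (name is an assumption, not an existing tool).
# Reads one compact JSON relatedObjects entry per line on stdin and
# succeeds only if a clusterrolebindings entry named $1 is present.
has_related_crb() {
  want="\"name\":\"$1\""
  found=1
  while IFS= read -r entry; do
    case "$entry" in
      *'"resource":"clusterrolebindings"'*)
        # Only entries for ClusterRoleBindings are compared by name.
        case "$entry" in
          *"$want"*) found=0 ;;
        esac
        ;;
    esac
  done
  return "$found"
}

# Example use against a live cluster (assumes oc and jq are installed):
#   oc get clusteroperator network -o json \
#     | jq -c '.status.relatedObjects[]' \
#     | has_related_crb cluster-network-operator || echo "missing from relatedObjects"
```

The helper matches on literal substrings rather than parsing JSON, so it relies on jq's compact output format and would need adjusting for pretty-printed input.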
Version-Release number of selected component
Seen in 4.21.0-ec.2 CI. Likely applies to many other versions, but I have not audited them.
How reproducible
Every time.
Steps to Reproduce
1. Install a cluster.
2. Inspect the ClusterOperator: oc adm inspect clusteroperator/network.
3. Check whether all the resources relevant to debugging that component are present in the output.
Actual results
$ ls inspect.local.*/cluster-scoped-resources/rbac.authorization.k8s.io/clusterrolebindings | sort | head -n2
cloud-network-config-controller.yaml
metrics-daemon-sa-rolebinding.yaml
Expected results
cluster-scoped-resources/rbac.authorization.k8s.io/clusterrolebindings/cluster-network-operator.yaml should be collected, along with any other cluster-scoped resources which would be useful for debugging the component.
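The expected result can be verified against an inspect output directory with a small check. This is a sketch only: inspect_has_crb is a hypothetical helper name, and the directory layout is assumed to match the paths shown in this report.

```shell
# Hypothetical helper (name is an assumption, not an existing tool).
# Succeeds if the inspect output directory $1 contains a collected
# ClusterRoleBinding manifest named $2.
inspect_has_crb() {
  [ -f "$1/cluster-scoped-resources/rbac.authorization.k8s.io/clusterrolebindings/$2.yaml" ]
}

# Example, assuming exactly one inspect.local.* directory exists
# in the current working directory:
#   inspect_has_crb inspect.local.* cluster-network-operator || echo "not collected"
```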
Additional info
In addition to expanding relatedObjects in your reconciled ClusterOperator status (likely Go code in your controller), you'll want to add the same entries to your ClusterOperator release image manifest. That way the CVO can put those entries in place even if your operator fails to install, which lets you debug questions like "why is my Go controller failing to update ClusterOperator status.relatedObjects?".
You may also want to grow a component-specific ClusterRole, instead of using cluster-admin.
- clones OCPBUGS-65469 cloud-controller-manager ClusterOperator relatedObjects missing ClusterRole
-
- POST
-