Description of problem:
The deployment fails with the following error when creating the kubevirt-hyperconverged object, after the Namespace, OperatorGroup, and Subscription objects have been created.
~~~
Error from server (InternalError): error when creating "hyper.yaml": Internal error occurred: failed calling webhook "validate-hco.kubevirt.io": Post "https://hco-webhook-service.openshift-cnv.svc:4343/validate-hco-kubevirt-io-v1beta1-hyperconverged?timeout=30s": service "hco-webhook-service" not found
~~~
The hco-webhook-service service is never created, and the OLM logs show the error "could not create service hco-webhook-service: object is being deleted: services "hco-webhook-service" already exists".
~~~
2022-01-28T06:21:39.901251731Z I0128 06:21:39.901171 1 event.go:282] Event(v1.ObjectReference
): type: 'Warning' reason: 'InstallComponentFailed' install strategy failed: could not create service hco-webhook-service: object is being deleted: services "hco-webhook-service" already exists
~~~
According to the audit logs, OLM created the service, deleted it, and then tried to create it again; the second create failed.
Create event:
~~~
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"029c1ae0-21aa-4d3d-80d8-ed0a06ac45bc","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/openshift-cnv/services","verb":"create","user":{"username":"system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount","uid":"1e38b88c-ddf2-426f-bf8c-692fce5cf4e9","groups":["system:serviceaccounts","system:serviceaccounts:openshift-operator-lifecycle-manager","system:authenticated"],"extra":{"authentication.kubernetes.io/pod-name":["olm-operator-56f69cbbbf-27t6s"],"authentication.kubernetes.io/pod-uid":["3b66a65b-d54f-487c-ac8c-96f94e21b933"]}},"sourceIPs":["10.30.1.5"],"userAgent":"olm/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":
,"responseStatus":{"metadata":{},"code":201},"requestReceivedTimestamp":"2022-01-28T06:21:31.931882Z","stageTimestamp":"2022-01-28T06:21:31.970348Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"olm-operator-binding-openshift-operator-lifecycle-manager\" of ClusterRole \"system:controller:operator-lifecycle-manager\" to ServiceAccount \"olm-operator-serviceaccount/openshift-operator-lifecycle-manager\""}}
~~~
Delete event, followed by the second create, which failed with 409 AlreadyExists:
~~~
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"e3cec41d-18b4-4edf-8801-33bcceec05f7","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/openshift-cnv/services/hco-webhook-service","verb":"delete","user":{"username":"system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount","uid":"1e38b88c-ddf2-426f-bf8c-692fce5cf4e9","groups":["system:serviceaccounts","system:serviceaccounts:openshift-operator-lifecycle-manager","system:authenticated"],"extra":{"authentication.kubernetes.io/pod-name":["olm-operator-56f69cbbbf-27t6s"],"authentication.kubernetes.io/pod-uid":["3b66a65b-d54f-487c-ac8c-96f94e21b933"]}},"sourceIPs":["10.30.1.5"],"userAgent":"olm/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":
,"responseStatus":{"metadata":{},"status":"Success","code":200},"requestReceivedTimestamp":"2022-01-28T06:21:38.743079Z","stageTimestamp":"2022-01-28T06:21:38.775489Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"olm-operator-binding-openshift-operator-lifecycle-manager\" of ClusterRole \"system:controller:operator-lifecycle-manager\" to ServiceAccount \"olm-operator-serviceaccount/openshift-operator-lifecycle-manager\""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"c03142c9-900a-4abb-bf4d-42e511d190c0","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/openshift-cnv/services","verb":"create","user":{"username":"system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount","uid":"1e38b88c-ddf2-426f-bf8c-692fce5cf4e9","groups":["system:serviceaccounts","system:serviceaccounts:openshift-operator-lifecycle-manager","system:authenticated"],"extra":{"authentication.kubernetes.io/pod-name":["olm-operator-56f69cbbbf-27t6s"],"authentication.kubernetes.io/pod-uid":["3b66a65b-d54f-487c-ac8c-96f94e21b933"]}},"sourceIPs":["10.30.1.5"],"userAgent":"olm/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":
{"resource":"services","namespace":"openshift-cnv","name":"hco-webhook-service","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","reason":"AlreadyExists","code":409}, <<<<<
"requestReceivedTimestamp":"2022-01-28T06:21:38.782905Z","stageTimestamp":"2022-01-28T06:21:39.144566Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"olm-operator-binding-openshift-operator-lifecycle-manager\" of ClusterRole \"system:controller:operator-lifecycle-manager\" to ServiceAccount \"olm-operator-serviceaccount/openshift-operator-lifecycle-manager\""}}
~~~
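The sequence above can be reconstructed from the audit log with a small script. The following is only a sketch for triage, not part of OLM or any OpenShift tooling: the function names `service_timeline` and `delete_to_create_gaps` are hypothetical, and `SAMPLE_EVENTS` is a trimmed copy of the three events quoted in this report (only the fields the script reads are kept; the first create has no `objectRef` because it is elided in the excerpt above).

```python
import json
from datetime import datetime

# Collection URI used by create requests; delete requests append the object name.
SERVICES_URI = "/api/v1/namespaces/openshift-cnv/services"

# Trimmed copies of the audit events quoted in this report (assumption: only
# the fields read below are kept; everything else is dropped for brevity).
SAMPLE_EVENTS = [
    '{"verb":"create","requestURI":"/api/v1/namespaces/openshift-cnv/services",'
    '"responseStatus":{"code":201},'
    '"requestReceivedTimestamp":"2022-01-28T06:21:31.931882Z",'
    '"stageTimestamp":"2022-01-28T06:21:31.970348Z"}',
    '{"verb":"delete","requestURI":"/api/v1/namespaces/openshift-cnv/services/hco-webhook-service",'
    '"responseStatus":{"status":"Success","code":200},'
    '"requestReceivedTimestamp":"2022-01-28T06:21:38.743079Z",'
    '"stageTimestamp":"2022-01-28T06:21:38.775489Z"}',
    '{"verb":"create","requestURI":"/api/v1/namespaces/openshift-cnv/services",'
    '"objectRef":{"resource":"services","namespace":"openshift-cnv","name":"hco-webhook-service","apiVersion":"v1"},'
    '"responseStatus":{"status":"Failure","reason":"AlreadyExists","code":409},'
    '"requestReceivedTimestamp":"2022-01-28T06:21:38.782905Z",'
    '"stageTimestamp":"2022-01-28T06:21:39.144566Z"}',
]


def parse_ts(value):
    """Parse an audit timestamp such as 2022-01-28T06:21:38.743079Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


def service_timeline(lines, name="hco-webhook-service"):
    """Return the service create/delete events, ordered by arrival time."""
    timeline = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        uri = event.get("requestURI", "")
        if uri not in (SERVICES_URI, f"{SERVICES_URI}/{name}"):
            continue
        # When objectRef carries a name, skip events for other services.
        ref_name = (event.get("objectRef") or {}).get("name")
        if ref_name is not None and ref_name != name:
            continue
        timeline.append({
            "verb": event["verb"],
            "code": event["responseStatus"]["code"],
            "received": parse_ts(event["requestReceivedTimestamp"]),
            "completed": parse_ts(event["stageTimestamp"]),
        })
    timeline.sort(key=lambda e: e["received"])
    return timeline


def delete_to_create_gaps(timeline):
    """Seconds between each delete response and the next create request."""
    gaps = []
    for prev, cur in zip(timeline, timeline[1:]):
        if prev["verb"] == "delete" and cur["verb"] == "create":
            gaps.append((cur["received"] - prev["completed"]).total_seconds())
    return gaps


timeline = service_timeline(SAMPLE_EVENTS)
for entry in timeline:
    print(entry["received"].time(), entry["verb"], entry["code"])
print("delete-to-create gaps (s):", delete_to_create_gaps(timeline))
```

With the timestamps quoted above, the second create request arrives roughly 7 ms after the delete response, which is consistent with the "object is being deleted" message in the OLM log. To run this against a full audit log, pass an open file object instead of `SAMPLE_EVENTS`; note that create requests whose `objectRef` is missing cannot be attributed to a specific service by this filter.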
This is a newly deployed cluster with no previous installation of OpenShift Virtualization. Also, the service can be created manually with the same spec after the failed deployment.
The other webhook services are also created without any issues.
Version-Release number of selected component (if applicable):
kubevirt-hyperconverged-operator.v4.8.4
omg get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.8.17 True False 2h2m Cluster version is 4.8.17
How reproducible:
Always reproducible in the customer environment.
Steps to Reproduce:
1. Follow the installation procedure at https://access.redhat.com/documentation/en-us/openshift_container_platform/4.8/html-single/openshift_virtualization/index#installing-virt-cli
2. Deployment fails while creating kubevirt-hyperconverged.
Actual results:
Installation of OpenShift Virtualization fails with the error: service "hco-webhook-service" not found
Expected results:
Installation of OpenShift Virtualization should succeed.
Additional info:
The original BZ contains additional information, but it cannot be used because it was closed by ERRATA and the OLM team no longer uses BZ to track issues.
- blocks: OCPBUGS-2535 Race condition deploying certs for managed webhooks (Closed)
- clones: OCPBUGS-2535 Race condition deploying certs for managed webhooks (Closed)
- duplicates: OCPBUGS-2535 Race condition deploying certs for managed webhooks (Closed)