Bug
Resolution: Unresolved
4.18
Production
Description of problem:
The production cluster is experiencing recurring errors related to Red Hat Advanced Cluster Management (RHACM) webhook components. Logs from the degraded_webhook pod indicate TLS certificate verification failures due to an expired certificate, as well as connection refused errors when attempting to reach the webhook services.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
I first attempted to run the openssl s_client command as instructed:
echo -n | openssl s_client -connect 172.30.57.87:443 -servername channels-apps-open-cluster-management-webhook-svc -showcerts
After deleting and recreating the pod (Step 2), I ran the command again, but the connection hangs and returns no output or certificate details.
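Because the s_client call hangs indefinitely, it may help to bound the probe so a silent endpoint is reported as a timeout instead of a stuck terminal. The following is only a sketch: the function name `probe_webhook_cert` is made up for illustration, and it assumes GNU `timeout` is on the PATH (on macOS it may be missing or installed under another name such as `gtimeout`).

```shell
# Hypothetical helper: run the same s_client probe, but give up after a time limit.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the limit is hit.
probe_webhook_cert() {
  local host=$1 servername=$2 limit=${3:-10}
  local out status
  # </dev/null plays the role of `echo -n |`: s_client sees EOF and exits after the handshake.
  out=$(timeout "$limit" openssl s_client -connect "$host:443" \
        -servername "$servername" -showcerts </dev/null 2>&1)
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "no TLS response from $host:443 within ${limit}s" >&2
    return 124
  fi
  printf '%s\n' "$out"
  return "$status"
}
```

For the service in this report the call would look like `probe_webhook_cert 172.30.57.87 channels-apps-open-cluster-management-webhook-svc 10`.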
Next, I deleted the specified pod:
e-ceccarelli-mac:~ e.ceccarelli$ oc delete pod multicluster-operators-channel-796547f649-9x5dh -n open-cluster-management
pod "multicluster-operators-channel-796547f649-9x5dh" deleted
The pod was recreated successfully. However, the logs of the new pod are still showing the same continuous errors:
...
2025/11/25 07:46:10 http: TLS handshake error from 10.128.0.41:51630: remote error: tls: bad certificate
2025/11/25 07:46:11 http: TLS handshake error from 10.128.0.41:51646: remote error: tls: bad certificate
2025/11/25 07:46:13 http: TLS handshake error from 10.128.0.41:40516: remote error: tls: bad certificate
...
2025/11/25 07:47:34 http: TLS handshake error from 10.128.0.41:35904: remote error: tls: bad certificate
...
These continuous tls: bad certificate errors confirm that the certificate problem was not resolved by recreating the pod.
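To see which client is producing the rejected handshakes, the log can be tallied per source address. This is a rough sketch, assuming each log entry is on a single line as `oc logs` emits it and that clients are IPv4 `address:port` pairs; `count_bad_cert_by_client` is a hypothetical helper name.

```shell
# Hypothetical helper: count "tls: bad certificate" handshake errors per client IP.
# Reads log text on stdin and prints "<ip> <count>" for each client seen.
count_bad_cert_by_client() {
  awk '
    /TLS handshake error from/ && /tls: bad certificate/ {
      for (i = 1; i < NF; i++) {
        if ($i == "from") {
          # The next field looks like "10.128.0.41:51630:"; keep only the address.
          split($(i + 1), parts, ":")
          counts[parts[1]]++
        }
      }
    }
    END { for (ip in counts) print ip, counts[ip] }
  '
}
```

It could be fed from the recreated pod, for example `oc logs <pod-name> -n open-cluster-management | count_bad_cert_by_client`, where `<pod-name>` is the new multicluster-operators-channel pod.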
Finally, I performed the backup and deletion of the Validating Webhook configuration:
e-ceccarelli-mac:~ e.ceccarelli$ oc get validatingwebhookconfigurations channels.apps.open.cluster.management.webhook.validator -o yaml > backup.webhook.yaml
e-ceccarelli-mac:~ e.ceccarelli$ oc delete validatingwebhookconfigurations channels.apps.open.cluster.management.webhook.validator
validatingwebhookconfiguration.admissionregistration.k8s.io "channels.apps.open.cluster.management.webhook.validator" deleted
Immediately after, I tried to run the verification command to check for the webhook's recreation and certificate date, but the command failed because the object was not found:
e-ceccarelli-mac:~ e.ceccarelli$ oc get validatingwebhookconfigurations channels.apps.open.cluster.management.webhook.validator -o json | jq -r '.webhooks[].clientConfig.caBundle' | base64 -d | openssl x509 -noout -dates
Error from server (NotFound): validatingwebhookconfigurations.admissionregistration.k8s.io "channels.apps.open.cluster.management.webhook.validator" not found
Could not find certificate from <stdin>
As you can see, the validatingwebhookconfigurations channels.apps.open.cluster.management.webhook.validator was not automatically recreated and is no longer present in the cluster.
The certificate issue continues to show up in the pod logs with the tls: bad certificate error, and additionally, the validatingwebhookconfigurations object is now missing from the cluster.
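The "Could not find certificate from <stdin>" message in the transcript is a side effect of piping the NotFound error into openssl. A guarded variant of the same check separates "object missing" from "certificate unreadable". This is a sketch only; `check_webhook_ca` is a hypothetical name, and the jq, base64 and openssl steps are the ones from the original command.

```shell
# Hypothetical helper: print caBundle validity dates for a ValidatingWebhookConfiguration,
# but stop early with a clear message if the object does not exist.
check_webhook_ca() {
  local name=$1
  local json bundle
  if ! json=$(oc get validatingwebhookconfigurations "$name" -o json 2>/dev/null); then
    echo "validatingwebhookconfiguration $name not found" >&2
    return 1
  fi
  # Decode each webhook's caBundle separately; empty bundles are skipped.
  printf '%s' "$json" \
    | jq -r '.webhooks[].clientConfig.caBundle // empty' \
    | while read -r bundle; do
        printf '%s' "$bundle" | base64 -d | openssl x509 -noout -dates
      done
}
```

Running `check_webhook_ca channels.apps.open.cluster.management.webhook.validator` in the state described above would return status 1 with the "not found" message instead of the openssl error.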
Actual results:
Expected results:
Additional info: