- Bug
- Resolution: Done
- Major
- None
- 4.11
- None
- None
- 3
- OTA 229, OTA 230, OTA 231
- 3
- Rejected
- False
I am using OCP 4.11.0:
    $ oc version
    Client Version: 4.10.25
    Server Version: 4.11.0
    Kubernetes Version: v1.24.0+9546431

I added my private CA certificate to the CA bundle, following the documentation section "Updating the CA bundle".
After that I can see an intermittent error:
    $ oc get kubeapiservers.operator.openshift.io cluster -o yaml
    …
    lastTransitionTime: "2022-08-27T20:32:39Z"
    message: "alertmanagerconfigs.monitoring.coreos.com: x509: certificate signed by unknown authority"
    reason: WebhookServiceConnectionError
    status: "True"
    type: CRDConversionWebhookConfigurationError
    …
In the Kubernetes audit logs, I can see that two controllers (cluster-version-operator and service-ca) are overwriting each other's changes to the alertmanagerconfigs.monitoring.coreos.com CRD:
    $ oc get crd alertmanagerconfigs.monitoring.coreos.com
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      …
      name: alertmanagerconfigs.monitoring.coreos.com
      …
    spec:
      conversion:
        strategy: Webhook
        webhook:
          clientConfig:
            caBundle: LS0tLS1CRUdJTi …
            service: …
The service-ca controller adds the caBundle field to the CRD. The cluster-version-operator removes it. This cycle repeats periodically.
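The flapping described above could be observed directly with a small polling loop. This is only a sketch: the function names `cabundle_state` and `watch_cabundle` are hypothetical, the 10-second interval is arbitrary, and it assumes a logged-in `oc` client with permission to read CRDs.

```shell
# Hypothetical helper: classify a caBundle value read from the CRD.
cabundle_state() {
  if [ -n "$1" ]; then
    echo "present"
  else
    echo "missing"
  fi
}

# Hypothetical helper: poll the CRD and print a timestamped line
# each time the caBundle field appears or disappears.
# Assumes `oc` is logged in; the 10-second interval is an arbitrary choice.
watch_cabundle() {
  crd="alertmanagerconfigs.monitoring.coreos.com"
  last=""
  while true; do
    value=$(oc get crd "$crd" \
      -o jsonpath='{.spec.conversion.webhook.clientConfig.caBundle}')
    state=$(cabundle_state "$value")
    if [ "$state" != "$last" ]; then
      echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) caBundle $state"
      last="$state"
    fi
    sleep 10
  done
}
```

Running `watch_cabundle` in a terminal should then print one line per transition, which can be lined up against the timestamps in the audit logs.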
I reviewed the CRD definition from inside the cluster-version-operator container:
    $ oc rsh -n openshift-cluster-version cluster-version-operator-796d5bc86b-52qjw
    $ cat /release-manifests/0000_50_cluster-monitoring-operator_00_0alertmanager-config-custom-resource-definition.yaml
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      annotations:
        controller-gen.kubebuilder.io/version: v0.8.0
        include.release.openshift.io/ibm-cloud-managed: "true"
        include.release.openshift.io/self-managed-high-availability: "true"
        include.release.openshift.io/single-node-developer: "true"
        service.beta.openshift.io/inject-cabundle: "true"
      creationTimestamp: null
      name: alertmanagerconfigs.monitoring.coreos.com
    spec:
      conversion:
        strategy: Webhook
        webhook:
          clientConfig:
            service:
              name: prometheus-operator-admission-webhook
              namespace: openshift-monitoring
              path: /convert
              port: 8443
          conversionReviewVersions:
          - v1beta1
          - v1alpha1
      group: monitoring.coreos.com
      names: …
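The definition above includes the webhook configuration fields (spec.conversion.webhook.clientConfig) but no caBundle. This is probably why the cluster-version-operator overwrites the change made by the service-ca controller: when it reconciles the CRD against this manifest, the injected caBundle is not part of the manifest's clientConfig and is removed.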
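To check the same condition on other release manifests, a rough text-based test along these lines might help. It is a sketch only: `manifest_webhook_summary` is a hypothetical helper, it matches on plain text rather than parsing YAML, and it assumes a local copy of the manifest file.

```shell
# Hypothetical helper: report whether a CRD manifest sets a webhook
# clientConfig and, if so, whether it also carries a caBundle.
# This is a plain-text match, not a YAML parse.
manifest_webhook_summary() {
  file="$1"
  if grep -q 'clientConfig:' "$file"; then
    if grep -q 'caBundle:' "$file"; then
      echo "clientConfig with caBundle"
    else
      echo "clientConfig without caBundle"
    fi
  else
    echo "no clientConfig"
  fi
}
```

For the manifest shown above, the expected result is "clientConfig without caBundle", which matches the suspected cause.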
Note that I filed a similar bug report here: https://issues.redhat.com/browse/PSAP-889