-
Bug
-
Resolution: Done
-
Major
-
OSSM 2.2.2
The federation controller uses multiple informers to fetch Kubernetes objects, but it does not check HasSynced() on all of them. This causes a race condition: an informer may be queried before it is ready, and object processing then fails.
Steps to reproduce for QE:
1. Deploy 2 service meshes.
2. Federate CA certificates and apply ServiceMeshPeers.
3. Restart istiod containers.
Expected result: each istiod starts its federation controllers successfully, without errors like those in the log below.
Original description:
The following error appears in istiod logs after istiod has been restarted:
# oc logs -f istiod-east-mesh-5bc7974588-xjxcg | grep -i "error processing"
2022-12-21T10:08:29.180591Z error federation Error processing remote-east-mesh-system/west-mesh (will retry): could not get root cert for mesh west-mesh: error getting configmap west-ca-root-cert in namespace remote-east-mesh-system: configmap "west-ca-root-cert" not found component=federation-discovery-controller
2022-12-21T10:08:29.186240Z error federation Error processing remote-east-mesh-system/west-mesh (will retry): could not get root cert for mesh west-mesh: error getting configmap west-ca-root-cert in namespace remote-east-mesh-system: configmap "west-ca-root-cert" not found component=federation-discovery-controller
2022-12-21T10:08:29.197127Z error federation Error processing remote-east-mesh-system/west-mesh (will retry): could not get root cert for mesh west-mesh: error getting configmap west-ca-root-cert in namespace remote-east-mesh-system: configmap "west-ca-root-cert" not found component=federation-discovery-controller
2022-12-21T10:08:29.217542Z error federation Error processing remote-east-mesh-system/west-mesh (will retry): could not get root cert for mesh west-mesh: error getting configmap west-ca-root-cert in namespace remote-east-mesh-system: configmap "west-ca-root-cert" not found component=federation-discovery-controller
2022-12-21T10:08:29.257752Z error federation Error processing remote-east-mesh-system/west-mesh (will retry): could not get root cert for mesh west-mesh: error getting configmap west-ca-root-cert in namespace remote-east-mesh-system: configmap "west-ca-root-cert" not found component=federation-discovery-controller
2022-12-21T10:08:29.337875Z error federation Error processing remote-east-mesh-system/west-mesh (giving up): could not get root cert for mesh west-mesh: error getting configmap west-ca-root-cert in namespace remote-east-mesh-system: configmap "west-ca-root-cert" not found component=federation-discovery-controller
Shortly after that, communication with the other ServiceMeshPeer is interrupted.
The configmap exists, and communication with the other peer had been working correctly.
# oc version
Client Version: 4.11.18
Kustomize Version: v4.5.4
Server Version: 4.11.20
Kubernetes Version: v1.24.6+5658434

# oc get smcp -A -o wide
NAMESPACE                 NAME        READY   STATUS            PROFILES      VERSION   AGE   IMAGE REGISTRY
remote-east-mesh-system   east-mesh   10/10   ComponentsReady   ["default"]   2.2.4     45h

# oc get csv
NAME                           DISPLAY                                          VERSION    REPLACES                     PHASE
elasticsearch-operator.5.5.5   OpenShift Elasticsearch Operator                 5.5.5                                   Succeeded
jaeger-operator.v1.39.0-3      Red Hat OpenShift distributed tracing platform   1.39.0-3   jaeger-operator.v1.34.1-5    Succeeded
kiali-operator.v1.57.3         Kiali Operator                                   1.57.3     kiali-operator.v1.48.3       Succeeded
servicemeshoperator.v2.3.0     Red Hat OpenShift Service Mesh                   2.3.0-0    servicemeshoperator.v2.2.3   Succeeded
Attaching the must-gather from the mesh where the issue happened.
- is incorporated by
-
OSSM-2488 Container release for Maistra 2.2.5
- Closed
- mentioned on