Optimistically update Kube Server and Client CA bundles

    • Quality / Stability / Reliability
    • False
    • Approved
    • Done
    • Bug Fix
      Before this release, two identical copies of the same controller updated the same Certificate Authority (CA) bundle in a `configmap`. The copies received different metadata inputs, overwrote each other's changes, and created duplicate events. With this release, the controllers use optimistic updating and server-side apply to avoid update events and handle update conflicts. As a result, metadata updates no longer trigger duplicate events, and the expected metadata is set correctly. (link:https://issues.redhat.com/browse/OCPBUGS-55217[OCPBUGS-55217])

      Component Readiness has found a potential regression in the following test:

      [sig-arch] events should not repeat pathologically

      Significant regression detected.
      Fisher's Exact probability of a regression: 100.00%.
      Test pass rate dropped from 100.00% to 92.74%.

      Sample (being evaluated) Release: 4.20
      Start Time: 2025-08-06T00:00:00Z
      End Time: 2025-08-13T16:00:00Z
      Success Rate: 92.74%
      Successes: 281
      Failures: 22
      Flakes: 0
      Base (historical) Release: 4.19
      Start Time: 2025-05-18T00:00:00Z
      End Time: 2025-06-17T23:59:59Z
      Success Rate: 100.00%
      Successes: 981
      Failures: 0
      Flakes: 0
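The figures above can be checked by hand. The following is a minimal sketch, assuming a one-sided Fisher's exact test on the 2x2 table of sample and base successes and failures. How Component Readiness actually derives its "probability of a regression" is not stated in this report, so treating it as one minus the p-value is an assumption, and `fisher_one_sided` is a hypothetical helper name.

```python
from math import comb


def fisher_one_sided(sample_fail, sample_pass, base_fail, base_pass):
    """P(sample failures >= observed) with all table margins held fixed."""
    n_sample = sample_fail + sample_pass
    n_total = n_sample + base_fail + base_pass
    total_fail = sample_fail + base_fail
    denom = comb(n_total, total_fail)
    upper = min(total_fail, n_sample)
    # Sum the hypergeometric probabilities of every table at least as extreme.
    num = sum(
        comb(n_sample, k) * comb(n_total - n_sample, total_fail - k)
        for k in range(sample_fail, upper + 1)
    )
    return num / denom


# Counts taken from the report: 4.20 sample versus 4.19 base.
pass_rate = 100 * 281 / (281 + 22)
p_value = fisher_one_sided(sample_fail=22, sample_pass=281,
                           base_fail=0, base_pass=981)
# Assumption: the reported figure is 1 - p, expressed as a percentage.
regression_probability = 100 * (1 - p_value)

print(f"pass rate: {pass_rate:.2f}%")
print(f"regression probability: {regression_probability:.2f}%")
```

Because the base release has zero failures, the only table at least as extreme as the observed one is the observed table itself, so the p-value reduces to a single hypergeometric term.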

      View the test details report for additional context.

      The first 5 failures I looked at all look to be the same cause:

      [sig-arch] events should not repeat pathologically 	0s
      {  1 events happened too frequently
      
      event happened 115 times, something is wrong: node/ip-10-0-117-116.ec2.internal hmsg/a4cafcb105 - reason/ConfigMapUpdateFailed Failed to update ConfigMap/csr-controller-ca -n openshift-config-managed: Operation cannot be fulfilled on configmaps "csr-controller-ca": the object has been modified; please apply your changes to the latest version and try again (12:43:31Z) result=reject }
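The kind of check that produces a failure like this can be pictured as counting identical events and flagging any that exceed a threshold. The sketch below is illustrative only and is not the actual test implementation: the event key shape, the `pathological_events` name, and the choice of "strictly more than the limit" are assumptions. The limit of 20 comes from this report.

```python
from collections import Counter

MAX_IDENTICAL_EVENTS = 20  # limit stated in this report


def pathological_events(events, limit=MAX_IDENTICAL_EVENTS):
    """Return {event_key: count} for events seen more than `limit` times.

    Assumption: two events are "identical" when their keys compare equal.
    """
    counts = Counter(events)
    return {key: n for key, n in counts.items() if n > limit}


# Hypothetical event keys: (namespace, reason, object).
failed_update = ("openshift-config-managed", "ConfigMapUpdateFailed",
                 "ConfigMap/csr-controller-ca")
unrelated = ("openshift-config-managed", "SomeOtherReason", "ConfigMap/other")

observed = [failed_update] * 115 + [unrelated] * 3
offenders = pathological_events(observed)
for key, count in offenders.items():
    print(f"event happened {count} times: {key}")
```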
      

      This test is intended to catch excessive event spamming and load on etcd. The limit is 20 identical events, and here we're seeing hundreds. The cause is different copies of the cert rotation controller overwriting each other's changes.
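The conflict in the event message, and the optimistic-update approach, can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the controller's code: `Store` is a hypothetical in-memory stand-in for the API server, `ensure_bundle` is a hypothetical helper, and server-side apply is not modeled at all.

```python
from dataclasses import dataclass, field


class Conflict(Exception):
    """Raised when a write carries a stale resource version."""


@dataclass
class Store:
    """Toy stand-in for an API server holding one ConfigMap (hypothetical)."""
    data: dict = field(default_factory=dict)
    resource_version: int = 1
    writes: int = 0

    def get(self):
        return dict(self.data), self.resource_version

    def update(self, data, resource_version):
        # Optimistic concurrency: reject writes based on an outdated read.
        if resource_version != self.resource_version:
            raise Conflict("the object has been modified")
        self.data = dict(data)
        self.resource_version += 1
        self.writes += 1


def ensure_bundle(store, desired, max_attempts=5):
    """Re-read, skip no-op writes, retry on conflict. Returns True if it wrote."""
    for _ in range(max_attempts):
        current, version = store.get()
        if current == desired:
            return False  # nothing to change, so no update and no event
        try:
            store.update(desired, version)
            return True
        except Conflict:
            continue  # another writer got there first; re-read and retry
    raise Conflict("gave up after repeated conflicts")


store = Store({"ca-bundle.crt": "old"})
desired = {"ca-bundle.crt": "new"}

_, stale = store.get()         # both copies read the same version
store.update(desired, stale)   # the first copy's write succeeds
try:
    store.update(desired, stale)   # the second copy reuses the stale version
    conflicted = False
except Conflict:
    conflicted = True

wrote = ensure_bundle(store, desired)  # re-reads first, finds nothing to do
print(conflicted, wrote, store.writes)
```

In this model the second blind write fails in the same way as the reported `ConfigMapUpdateFailed` event, while the re-read-and-compare path finishes without a write.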

      Filed by: dgoodwin@redhat.com

              vrutkovs@redhat.com Vadim Rutkovsky (Inactive)
              Rahul Gangwar