OpenShift Bugs / OCPBUGS-61039

uneven distribution of kube api traffic


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Affects Version/s: 4.18.0
    • Fix Version/s: 4.18.z, 4.19.z, 4.20.0
    • Component/s: kube-apiserver
    • Quality / Stability / Reliability
    • Severity: Important
    • Release Note Type: Enhancement
    • Release Note Text:

      Previously, kube-apiserver was configured with `goaway-chance` set to 0, meaning kube-apiserver never sent GOAWAY frames, connections were never reset, and clients kept reusing the same kube-apiserver instance.
      Now, to rebalance connections between kube-apiserver instances, the apiserver is configured to send a GOAWAY frame with a probability of 1% per request. This lets connections be reset and re-established against a different kube-apiserver instance, improving platform stability and avoiding cache bloat on individual kube-apiserver instances.

      This change has no effect on single-node installations.
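      For reference, the behavior described in the release note corresponds to the upstream kube-apiserver `--goaway-chance` flag. Below is a minimal Go sketch of that mechanism, assuming a plain net/http server rather than the real apiserver filter chain; the wrapper name `withProbabilisticGoaway`, the listen address, and the certificate paths are illustrative, not taken from the product.

      // A handler wrapper that, with the given probability per request, asks
      // Go's HTTP/2 server to send a GOAWAY frame and drain the connection,
      // so the client reconnects (through the load balancer, and possibly to
      // a different kube-apiserver instance).
      package main

      import (
          "fmt"
          "math/rand"
          "net/http"
      )

      func withProbabilisticGoaway(next http.Handler, chance float64) http.Handler {
          return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
              if r.Proto == "HTTP/2.0" && rand.Float64() < chance {
                  // For HTTP/2 responses, Go's server turns this header into a
                  // GOAWAY frame plus a graceful connection shutdown; in-flight
                  // requests on the connection are allowed to finish first.
                  w.Header().Set("Connection", "close")
              }
              next.ServeHTTP(w, r)
          })
      }

      func main() {
          api := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
              fmt.Fprintln(w, "ok")
          })

          srv := &http.Server{
              Addr:    ":8443",                            // illustrative listen address
              Handler: withProbabilisticGoaway(api, 0.01), // 0.01 = the 1% chance from the release note
          }
          // TLS is required for HTTP/2 with net/http; the cert/key paths are placeholders.
          _ = srv.ListenAndServeTLS("tls.crt", "tls.key")
      }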

      This is a clone of issue OCPBUGS-60121. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-43521. The following is the description of the original issue:

      When master nodes or kube-apiservers are taken offline (for a MachineConfig update, revision rollout, etc.), a single kube-apiserver ends up holding the majority of the long-lived and live connections.
      After all three masters are back online, a single kube-apiserver continues to receive the majority of live API connections, causing that master node's CPU to hit 100%.

      Restarting the kube-apiserver pod resolves the issue.

      The expectation is that once all three masters are up, the live API connections would be rebalanced across the three master nodes.

      Looking for assistance in determining why the live connections are not evenly distributed between the kube-apiservers once quorum is re-established.
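
      To see why a 1% per-request GOAWAY chance should restore balance once all three masters are back, the small Go simulation below models the skewed state described above. It is an illustration only, not data from the affected cluster, and it assumes the load balancer picks one of the three kube-apiservers uniformly at random whenever a connection is re-established.

      package main

      import (
          "fmt"
          "math/rand"
      )

      func main() {
          const (
              servers      = 3
              connections  = 3000
              goawayChance = 0.01 // 1% per request, as in the release note
              rounds       = 1000 // each round = one request on every connection
          )

          // Start from the failure mode in the report: everything pinned to
          // apiserver 0 after the other masters were offline.
          conns := make([]int, connections)
          counts := make([]int, servers)
          counts[0] = connections

          for round := 1; round <= rounds; round++ {
              for i := range conns {
                  if rand.Float64() < goawayChance {
                      // GOAWAY: the client reconnects and the load balancer
                      // picks a backend at random.
                      counts[conns[i]]--
                      conns[i] = rand.Intn(servers)
                      counts[conns[i]]++
                  }
              }
              if round%200 == 0 {
                  fmt.Printf("after %4d requests per connection: %v\n", round, counts)
              }
          }
      }

      With a 1% chance, a connection is expected to survive roughly 100 requests before being reset, so an initially one-sided distribution decays toward an even split of roughly connections/servers per instance after a few hundred requests per connection.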

              Assignee: Vadim Rutkovsky (vrutkovs@redhat.com)
              Reporter: Daniel Seals (rhn-support-dseals)
              QA Contact: Ke Wang
              Votes: 0
              Watchers: 6
