  1. OpenShift Bugs
  2. OCPBUGS-24223

Deletion bug with respect to Kube-Proxy health checks


    • Low
    • No
    • CLOUD Sprint 246, CLOUD Sprint 247, CLOUD Sprint 248, CLOUD Sprint 249, CLOUD Sprint 250, CLOUD Sprint 251
    • 6
    • False

      The kube-proxy health checks were first added upstream in https://github.com/kubernetes/cloud-provider-aws/pull/622 and pulled into OCP through https://github.com/openshift/cloud-provider-aws/pull/47.

      Steps to reproduce:

      • Create several NLBs (say five, NLB_NAME_1 through NLB_NAME_5). This results in a single set of kubernetes.io/rule/nlb/health=${NLB_NAME_1} security group rules, one per AZ, against port 10256.
      • Delete NLB_NAME_1.
      • All of the kubernetes.io/rule/nlb/health=${NLB_NAME_1} security group rules against port 10256 are deleted in every AZ, so the remaining NLBs start failing their health checks.
      • We observed that the security group rules reappear after about 6 minutes. The impact was limited because the load balancer fails open, as described in the AWS documentation:

      "If a target group contains only unhealthy registered targets, the load balancer routes requests to all those targets, regardless of their health status. This means that if all targets fail health checks at the same time in all enabled Availability Zones, the load balancer fails open. The effect of the fail-open is to allow traffic to all targets in all enabled Availability Zones, regardless of their health status, based on the load balancing algorithm."

      https://docs.aws.amazon.com/elasticloadbalancing/latest/application/target-group-health-checks.html 
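      The scenario above can be illustrated with a small toy model. This is only a sketch of the observed behavior, not the cloud-provider-aws implementation: the function names (add_nlb, delete_nlb, routable_targets), the zone and node names, and the assumption that a shared rule keeps the description of the first NLB that created it are all hypothetical.

```python
HEALTH_PORT = 10256  # kube-proxy health check port named in the report


def add_nlb(rules, nlb_name, zones):
    """Toy model: reuse an existing health rule for a zone instead of adding one."""
    for zone in zones:
        key = (zone, HEALTH_PORT)
        # Assumption: the rule keeps the description of the NLB that created it.
        rules.setdefault(key, f"kubernetes.io/rule/nlb/health={nlb_name}")


def delete_nlb(rules, nlb_name):
    """Toy model of the buggy cleanup: drop every rule described with this NLB."""
    description = f"kubernetes.io/rule/nlb/health={nlb_name}"
    for key in [k for k, d in rules.items() if d == description]:
        del rules[key]


def routable_targets(health):
    """Fail open: if no target is healthy, route to all of them."""
    healthy = [target for target, ok in health.items() if ok]
    return healthy if healthy else list(health)


rules = {}
zones = ["az-a", "az-b"]  # hypothetical Availability Zones
for i in range(1, 6):
    add_nlb(rules, f"NLB_NAME_{i}", zones)

delete_nlb(rules, "NLB_NAME_1")

# With the shared rules gone, every node fails the health check
# for the four NLBs that still exist.
health = {node: bool(rules) for node in ["node-1", "node-2"]}
print(len(rules), routable_targets(health))
```

      In this model, deleting NLB_NAME_1 leaves no health check rules at all, yet routable_targets still returns every node, which mirrors why the outage was not more visible while the rules were missing.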

            rh-ee-tbarberb Theo Barber-Bany
            mshen.openshift Michael Shen
            Huali Liu Huali Liu