OCPBUGS-56011 (OpenShift Bugs)

HCP operator should set Shared ClusterServiceLoadBalancerHealthProbeMode


    • Quality / Stability / Reliability
    • False
    • None
    • Important
    • Yes
    • Approved
    • None
    • Proposed
    • Bug Fix
    • *Cause*: The AWS Cloud Provider no longer sets the default ping target "HTTP:10256/healthz" for AWS Load Balancers. For Services of type LoadBalancer running on AWS, the Load Balancer object created in AWS now has a ping target that points at the Service's node port, for example "TCP:32518".
      *Consequence*: Health probes for cluster-wide services are effectively non-functional, and these services experience downtime during upgrades.
      *Fix*: Set the ClusterServiceLoadBalancerHealthProbeMode property of the cloud config to "Shared". This cloud config is passed to the AWS Cloud Provider.
      *Result*: Load Balancers in AWS have the correct health check ping target, HTTP:10256/healthz, which points to the kube-proxy health endpoint running on each node.

      As shown in this run of the OpenShift conformance tests, the test "Cluster scoped load balancer healthcheck port and path should be 10256/healthz" fails:

      {  fail [github.com/openshift/origin/test/extended/cloud_controller_manager/ccm.go:125]: Expected
          <string>: TCP:31611
      to equal
          <string>: HTTP:10256/healthz
      Ginkgo exit error 1: exit with code 1} 

      In AWS, LoadBalancer services are expected to create AWS LoadBalancers with:

      Ping protocol: HTTP

      Ping port: 10256

      Ping path: healthz

      These values match the kube-proxy health endpoint running on each node. This behavior was introduced in https://github.com/openshift/cluster-cloud-controller-manager-operator/pull/383 by setting the ClusterServiceLoadBalancerHealthProbeMode config element to Shared. The change was also backported to 4.19 but not to 4.18: version 4.18 can't use this setting, so it sets the protocol and port in a different way.
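
      The expected and observed ping targets quoted above can be compared mechanically. The following is a minimal Go sketch, assuming the target strings follow the "PROTOCOL:PORT[/PATH]" shape seen in the test output. The pingTarget type and both helper functions are hypothetical names made up for this illustration; they are not taken from the origin test suite or from any AWS SDK.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pingTarget is a hypothetical helper type for this sketch.
type pingTarget struct {
	Protocol string
	Port     int
	Path     string // empty when the target has no path, e.g. "TCP:31611"
}

// parsePingTarget splits a string such as "HTTP:10256/healthz" or "TCP:31611"
// into protocol, port, and optional path.
func parsePingTarget(s string) (pingTarget, error) {
	proto, rest, ok := strings.Cut(s, ":")
	if !ok || proto == "" {
		return pingTarget{}, fmt.Errorf("malformed ping target %q", s)
	}
	portStr, path, hasPath := strings.Cut(rest, "/")
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return pingTarget{}, fmt.Errorf("malformed port in %q: %w", s, err)
	}
	if hasPath {
		path = "/" + path
	}
	return pingTarget{Protocol: proto, Port: port, Path: path}, nil
}

// usesKubeProxyHealthz reports whether the target is HTTP:10256/healthz,
// the value the conformance test expects.
func usesKubeProxyHealthz(t pingTarget) bool {
	return t.Protocol == "HTTP" && t.Port == 10256 && t.Path == "/healthz"
}

func main() {
	// The two values from the failing test: expected and observed.
	for _, s := range []string{"HTTP:10256/healthz", "TCP:31611"} {
		t, err := parsePingTarget(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> kube-proxy healthz: %v\n", s, usesKubeProxyHealthz(t))
	}
}
```

      A TCP target against a node port only shows that the port accepts connections, whereas the HTTP target queries the kube-proxy health endpoint on port 10256.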

      HostedControlPlane doesn't configure the probe mode, as can be seen here (CPO v2) and here (CPO v1), so it uses the default mode, ServiceNodePort. The ClusterServiceLoadBalancerHealthProbeMode config element should be set here.
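
      As a rough illustration of the requested change, the Go sketch below adds the setting to an INI-style cloud config. It assumes the setting lives in a [Global] section as a "key = value" line, which is how the AWS cloud provider config is commonly laid out. The helper names and the sample Zone/VPC values are hypothetical, and this is not the control plane operator's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

const probeModeKey = "ClusterServiceLoadBalancerHealthProbeMode"

// isProbeModeLine reports whether a trimmed line assigns the probe mode key.
// Keys are compared case-insensitively as a precaution.
func isProbeModeLine(trimmed string) bool {
	key, _, ok := strings.Cut(trimmed, "=")
	return ok && strings.EqualFold(strings.TrimSpace(key), probeModeKey)
}

// withSharedProbeMode returns conf with the probe mode set to Shared inside
// the [Global] section, adding the section or the key when missing.
// Hypothetical helper for illustration only.
func withSharedProbeMode(conf string) string {
	setting := probeModeKey + " = Shared"
	if strings.TrimSpace(conf) == "" {
		return "[Global]\n" + setting + "\n"
	}
	var out []string
	inGlobal, sawGlobal, done := false, false, false
	for _, line := range strings.Split(strings.TrimRight(conf, "\n"), "\n") {
		trimmed := strings.TrimSpace(line)
		if strings.HasPrefix(trimmed, "[") {
			// Leaving [Global] without having seen the key: add it first.
			if inGlobal && !done {
				out = append(out, setting)
				done = true
			}
			inGlobal = strings.EqualFold(trimmed, "[Global]")
			sawGlobal = sawGlobal || inGlobal
		} else if inGlobal && isProbeModeLine(trimmed) {
			// Replace an existing value; drop any later duplicates.
			if !done {
				out = append(out, setting)
				done = true
			}
			continue
		}
		out = append(out, line)
	}
	if !done {
		if !sawGlobal {
			out = append(out, "[Global]")
		}
		out = append(out, setting)
	}
	return strings.Join(out, "\n") + "\n"
}

func main() {
	// Sample input with made-up values.
	conf := "[Global]\nZone = us-east-1a\nVPC = vpc-123\n"
	fmt.Print(withSharedProbeMode(conf))
}
```

      The helper is idempotent, so running it over a config that already contains the setting leaves the text unchanged. In practice the value would more likely be written directly into whatever template renders the cloud config.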

      Link to slack discussion

              mgencur@redhat.com Martin Gencur