Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: 4.20.0
Affects Version/s: 4.19.0, 4.20.0
Component/s: HyperShift
Labels:

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
Important
Regression:
Yes

Target Backport Versions:

4.19.0
Target Version:

4.20.0
Release Blocker:
Approved
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
Done
Release Note Type:
Bug Fix
Release Note Text:

Before this update, the AWS Cloud Provider did not set the default ping target, `HTTP:10256/healthz`, for the AWS load balancer. For services of the `LoadBalancer` type that run on AWS, the load balancer object created in AWS had a ping target of `TCP:32518`. As a consequence, the health probes for cluster-wide services were non-functional, and during upgrades, those services were down. With this release, the `ClusterServiceLoadBalancerHealthProbeMode` property of the cloud configuration is set to `Shared`. This cloud configuration is passed to the AWS Cloud Provider. As a result, the load balancers in AWS have the correct health check ping target, `HTTP:10256/healthz`, which points to the `kube-proxy` health endpoints that are running on nodes.


Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

As shown in this run of OpenShift conformance tests, the test "Cluster scoped load balancer healthcheck port and path should be 10256/healthz" fails:

{  fail [github.com/openshift/origin/test/extended/cloud_controller_manager/ccm.go:125]: Expected
    <string>: TCP:31611
to equal
    <string>: HTTP:10256/healthz
Ginkgo exit error 1: exit with code 1}

In AWS, LoadBalancer services are expected to create AWS LoadBalancers with:

Ping protocol: HTTP

Ping port: 10256

Ping path: healthz
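
The comparison that the failing test makes can be sketched as follows. This is a minimal illustration, not the code from ccm.go: the `HealthCheckTarget` type and the `ParseTarget` and `IsSharedKubeProxyProbe` helpers are hypothetical names introduced here, and the sketch assumes the ping target is always formatted as `PROTOCOL:PORT[/PATH]`, as in the two values quoted in the test output.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// HealthCheckTarget is a hypothetical helper type for this sketch; it is not
// part of any AWS or OpenShift API.
type HealthCheckTarget struct {
	Protocol string
	Port     int
	Path     string // empty for targets without a path, such as TCP:31611
}

// ParseTarget splits a ping target such as "HTTP:10256/healthz" or
// "TCP:31611" into protocol, port, and optional path.
func ParseTarget(s string) (HealthCheckTarget, error) {
	proto, rest, ok := strings.Cut(s, ":")
	if !ok || proto == "" {
		return HealthCheckTarget{}, fmt.Errorf("target %q: missing protocol", s)
	}
	portStr, path, hasPath := strings.Cut(rest, "/")
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return HealthCheckTarget{}, fmt.Errorf("target %q: bad port: %w", s, err)
	}
	if hasPath {
		path = "/" + path
	}
	return HealthCheckTarget{Protocol: proto, Port: port, Path: path}, nil
}

// IsSharedKubeProxyProbe reports whether the target matches the values the
// ticket expects: HTTP, port 10256, path /healthz.
func IsSharedKubeProxyProbe(target HealthCheckTarget) bool {
	return target.Protocol == "HTTP" && target.Port == 10256 && target.Path == "/healthz"
}

func main() {
	for _, s := range []string{"HTTP:10256/healthz", "TCP:31611"} {
		target, err := ParseTarget(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> shared kube-proxy probe: %v\n", s, IsSharedKubeProxyProbe(target))
	}
}
```

Note that the sketch keeps the leading slash on the path, so it compares against `/healthz` rather than the bare `healthz` listed above.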

These values match the kube-proxy running on each node. This change was introduced in https://github.com/openshift/cluster-cloud-controller-manager-operator/pull/383 by setting the ClusterServiceLoadBalancerHealthProbeMode config element to Shared. The change was also backported to 4.19 but not to 4.18. Version 4.18 can't use this flag, so it sets the protocol and port in a different way.

HostedControlPlane doesn't configure the probe mode, as can be seen here (CPO v2) and here (CPO v1), so it uses the default mode, ServiceNodePort. The ClusterServiceLoadBalancerHealthProbeMode config element should be set here.
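
As a rough illustration of the requested change, the sketch below forces the probe mode to Shared in an INI-style cloud configuration. It is not the code from openshift/hypershift#6099. It assumes the key lives in the `[Global]` section, as in the upstream AWS cloud provider configuration, and the `WithSharedProbeMode` function and the `Zone` value in the example are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	probeModeKey    = "ClusterServiceLoadBalancerHealthProbeMode"
	probeModeShared = "Shared"
)

// isProbeModeKey reports whether an INI line assigns the probe mode key.
func isProbeModeKey(line string) bool {
	key, _, ok := strings.Cut(line, "=")
	return ok && strings.EqualFold(strings.TrimSpace(key), probeModeKey)
}

// WithSharedProbeMode (hypothetical helper) returns cloudConf with the probe
// mode set to Shared directly under the [Global] header. Any existing value
// for the key in [Global] is dropped, and a [Global] section is appended if
// the input has none.
func WithSharedProbeMode(cloudConf string) string {
	want := probeModeKey + " = " + probeModeShared

	var lines []string
	if s := strings.TrimRight(cloudConf, "\n"); s != "" {
		lines = strings.Split(s, "\n")
	}

	var out []string
	inGlobal, written := false, false
	for _, line := range lines {
		trimmed := strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(trimmed, "["):
			// Section header: track whether we are inside [Global].
			inGlobal = strings.EqualFold(trimmed, "[Global]")
			out = append(out, line)
			if inGlobal && !written {
				out = append(out, want)
				written = true
			}
		case inGlobal && isProbeModeKey(trimmed):
			// Drop the old value; the Shared setting was written above.
		default:
			out = append(out, line)
		}
	}
	if !written {
		out = append(out, "[Global]", want)
	}
	return strings.Join(out, "\n") + "\n"
}

func main() {
	// Illustrative input only; the Zone value is made up for this example.
	fmt.Print(WithSharedProbeMode("[Global]\nZone = us-east-1a\n"))
}
```

Writing the key immediately after the section header keeps the function idempotent: running it on its own output returns the same text.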

Link to slack discussion

blocks

OCPBUGS-59101 HCP operator should set Shared ClusterServiceLoadBalancerHealthProbeMode

Closed

is blocked by

CNTRLPLANE-1110 Impact: HCP operator should set Shared ClusterServiceLoadBalancerHealthProbeMode

Closed

is cloned by

OCPBUGS-59101 HCP operator should set Shared ClusterServiceLoadBalancerHealthProbeMode

Closed

relates to

OCPBUGS-62226 Cilium: LoadBalancer services unreachable

Verified

links to

openshift/hypershift#6099: OCPBUGS-56011: Configure ClusterServiceLoadBalancerHealthProbeMode a…

Assignee: Martin Gencur

Reporter: Martin Gencur

Need Info From: None

Contributors: None

QA Contact: Martin Gencur

Doc Contact: None

Votes: 0

Watchers: 9

Created: 2025/04/30 10:35 AM

Updated: 2025/10/21 4:26 AM

Resolved: 2025/10/21 4:26 AM
