When using the OSD "Edit Cluster Ingress" feature to change the default application router from public to private or vice versa, the external AWS load balancer is removed and replaced by the cloud-ingress-operator.
After this happens, the new external load balancer's health checks never receive a successful response from the backend nodes, and all nodes are marked out-of-service.
Cluster operators depending on *.apps.CLUSTERNAME.devshift.org begin to fail, initially with DNS errors (which is expected), but then with EOF messages when attempting to reach the routes associated with their health checks, e.g.:
OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.chcollin-mjtj.cvgo.s1.devshift.org/healthz": EOF
This consistently degrades the authentication, console, and ingress (via ingress-canary) operators.
Logs from the `ovnkube-node-*` pods for the instance show OVN properly updating the port for the endpoint health check to the new port in use by the AWS LB.
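For reference, this is roughly how I pulled those logs. The namespace, label, and container name are the standard OVN-Kubernetes ones; the grep pattern is just illustrative, and the whole sketch is guarded so it is a no-op without cluster access:

```shell
# Sketch: collecting the ovnkube-node logs referenced above.
# Assumes the standard OVN-Kubernetes namespace/labels; adjust as needed.
NS=openshift-ovn-kubernetes

if command -v oc >/dev/null 2>&1; then
  for POD in $(oc -n "$NS" get pods -l app=ovnkube-node -o name); do
    # Look for the health-check port update messages.
    oc -n "$NS" logs "$POD" -c ovnkube-node | grep -i health
  done
fi
```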
The EndpointSlices for the endpoint are updated/replaced, but with no change in configuration as far as I can tell; they are simply recreated.
The service backing the router-default pods has the proper HealthCheckNodePort configuration, matching the new AWS LB.
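The check I ran looked roughly like this (the service and namespace names are the OSD/OpenShift defaults; the sketch is guarded so it only does anything against a live cluster):

```shell
# Sketch: verifying the healthCheckNodePort on the router-default service,
# assuming the standard openshift-ingress namespace.
NAMESPACE=openshift-ingress
SERVICE=router-default

if command -v oc >/dev/null 2>&1; then
  # The node port Kubernetes allocates for LB health checks; compare this
  # against the port the AWS LB target group is actually probing.
  oc -n "$NAMESPACE" get svc "$SERVICE" \
    -o jsonpath='{.spec.healthCheckNodePort}{"\n"}'
fi
```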
Curling the service via CLUSTER_IP:NODE_PORT_HEALTH_CHECK/healthz results in a connection timeout.
Curling the local health check for HAProxy within the router-default pod via `localhost:1936/healthz` returns an OK response, as expected.
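The two curl checks above can be sketched as follows. The IP and port values here are hypothetical placeholders; read the real ones from the router-default service, and note the commands that need a cluster are shown as comments:

```shell
# Hypothetical values -- substitute the service's real clusterIP and
# spec.healthCheckNodePort.
CLUSTER_IP="${CLUSTER_IP:-172.30.0.10}"
HC_PORT="${HC_PORT:-31234}"

# 1) External-style check (this is the one that times out in the broken
#    state):
#      curl -m 5 "http://${CLUSTER_IP}:${HC_PORT}/healthz"
#
# 2) Local HAProxy check from inside a router pod (this one still
#    returns OK):
#      oc -n openshift-ingress exec deploy/router-default -- \
#        curl -s localhost:1936/healthz

echo "checking http://${CLUSTER_IP}:${HC_PORT}/healthz"
```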
After rolling the router-default pods manually with `oc rollout restart deployment router-default -n openshift-ingress`, or simply deleting the pods, the cluster heals: the AWS LB sees the backend infra nodes in service again, and the cluster operators depending on the *.apps.CLUSTERNAME.devshift.org domain recover on their own as well.
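For anyone hitting the same thing, the manual remediation is just the standard rollout commands (guarded here so the sketch is a no-op without a cluster):

```shell
# Sketch of the manual workaround: roll the router pods and wait for the
# replacements to become ready.
NS=openshift-ingress
DEPLOYMENT=router-default

if command -v oc >/dev/null 2>&1; then
  oc -n "$NS" rollout restart "deployment/$DEPLOYMENT"
  oc -n "$NS" rollout status "deployment/$DEPLOYMENT" --timeout=5m
fi
```

Once the new pods are up, the AWS LB should mark the infra nodes healthy again within a few health-check intervals.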
I'm unsure whether this should go to network-ovn or network-multus (or some other component), so I'm starting here. Please redirect me if necessary.