-
Bug
-
Resolution: Done
-
Blocker
-
ACM 2.7.0
-
False
-
None
-
False
-
OCPSTRAT-618 - [GA] Self-managed Hosted Control Planes support for BM using the Agent Provider
-
-
-
AI-30
-
No
Description of the problem:
As part of the ecosystem QE testing on hypershift clusters on real baremetals, we tried to deploy several times with several environments an hypershift cluster with 1 or 2 workers.
while the cluster is created successfully, we never succeeded to arrive to a place where the ingress operator is functioning well, although we set up all the necessary dns entries (A records) - the route to the dns is never found, even if we can ping to it:
$ oc get co NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE console 4.12.9 False False False 119m RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.registry.ocp-edge4.lab.eng.tlv2.redhat.com): Get "https://console-openshift-console.apps.registry.ocp-edge4.lab.eng.tlv2.redhat.com": dial tcp 10.46.29.134:443: connect: no route to host csi-snapshot-controller 4.12.9 True False False 3h16m dns 4.12.9 True False False 118m image-registry 4.12.9 True False False 119m ingress 4.12.9 True False True 9m59s The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
Same error is shown when we are trying to deploy a cluster with metalLB and use one of the loadbalancer ips as the api point.
I'm opening this incident report to track after this effort and see if there might be something we are missing here.
I'm attaching a must-gather logs to see if something in the way ingress work on BM is different from kubevirt for example - when this is working as expected.
also, on libvirt it is working as well, so our thoughts is that the reason for the issue is that the workers and the hub are not in the exact same network - and if that is by design, or expected, will appreciate any comments. tnx.
Version-Release number of selected component:
- advanced-cluster-management.v2.7.3 - 2.7.3-DOWNSTREAM-2023-03-22-19-15-09
- 4.12.0-0.nightly-2023-03-21-173554{}
How reproducible:
100%
Steps to reproduce:
1. Deploy hub with 3 masters on real BM
2. Install ACM and hypershift operator
3. Deploy hypershift cluster with 1 or 2 real bm workers
must-gather: https://drive.google.com/drive/folders/1NH0Q5nmARQ5c9P-qTKd211p-38OiNsNY?usp=sharing
- links to
- mentioned in
-
Page Loading...