OpenShift Bugs / OCPBUGS-23192

[Azure][RHEL Worker] LB service is pending after scaling up rhel worker

    • Quality / Stability / Reliability
    • False
    • Important
    • Yes
    • Rejected

      Description of problem:

      After installing an Azure cluster, scaling up two RHEL workers, and then creating a custom ingresscontroller, the LB service is stuck in Pending status and reports the error below:
      
      Error syncing load balancer: failed to ensure load balancer: failed to map VM Name to NodeName: VM Name ci-op-mk6yqx4d-eed09-5rvxk-rhel-2

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-11-11-054014

      How reproducible:

      100%

      Steps to Reproduce:

      1. Use workflow cucushift-installer-rehearse-azure-ipi-proxy-workers-rhcos-rhel8 to create the cluster, and ensure the new RHEL workers have joined.
      
      2. Create a custom ingresscontroller, e.g.:
      
      BASE_DOMAIN=$(oc get dns.config cluster -o jsonpath='{.spec.baseDomain}')
      
      # Assumption: the manifest is applied through a heredoc so that $BASE_DOMAIN is expanded by the shell.
      oc apply -f - <<EOF
      kind: IngressController
      apiVersion: operator.openshift.io/v1
      metadata:
        name: extlb
        namespace: openshift-ingress-operator
      spec:
        domain: extlb.$BASE_DOMAIN
        replicas: 1
        endpointPublishingStrategy:
          loadBalancer:
            scope: External
          type: LoadBalancerService
      EOF
      
      3. Check the LB service status.
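
For step 3, a small filter can make the check repeatable. The sketch below is only an illustration: the function name `pending_lb_services` is hypothetical, and it assumes the default column layout of `oc get svc` (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, ...).

```shell
# Hypothetical helper: print the names of LoadBalancer services whose
# EXTERNAL-IP column is still <pending>.
# Reads the default tabular output of `oc get svc` on stdin.
pending_lb_services() {
  # NR > 1 skips the header row; $2 is TYPE and $4 is EXTERNAL-IP.
  awk 'NR > 1 && $2 == "LoadBalancer" && $4 == "<pending>" { print $1 }'
}

# Usage against a live cluster (not executed here):
#   oc -n openshift-ingress get svc | pending_lb_services
```

An empty result would mean every LoadBalancer service in the namespace has been assigned an external IP.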
       
      

      Actual results:

      1. RHEL worker nodes are added
      
      $ oc get node
      NAME                                                     STATUS   ROLES                  AGE    VERSION
      ci-op-mk6yqx4d-eed09-5rvxk-master-0                      Ready    control-plane,master   105m   v1.27.6+b49f9d1
      ci-op-mk6yqx4d-eed09-5rvxk-master-1                      Ready    control-plane,master   105m   v1.27.6+b49f9d1
      ci-op-mk6yqx4d-eed09-5rvxk-master-2                      Ready    control-plane,master   105m   v1.27.6+b49f9d1
      ci-op-mk6yqx4d-eed09-5rvxk-rhel-1                        Ready    worker                 44m    v1.27.6+b49f9d1
      ci-op-mk6yqx4d-eed09-5rvxk-rhel-2                        Ready    worker                 44m    v1.27.6+b49f9d1
      ci-op-mk6yqx4d-eed09-5rvxk-worker-northcentralus-ct6zl   Ready    worker                 90m    v1.27.6+b49f9d1
      ci-op-mk6yqx4d-eed09-5rvxk-worker-northcentralus-dzgtx   Ready    worker                 90m    v1.27.6+b49f9d1
      ci-op-mk6yqx4d-eed09-5rvxk-worker-northcentralus-h67d6   Ready    worker                 87m    v1.27.6+b49f9d1
      
      3. The LB service status
      $ oc -n openshift-ingress get svc
      NAME                      TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)                      AGE
      router-default            LoadBalancer   172.30.27.27     20.98.51.124   80:30330/TCP,443:30958/TCP   93m
      router-extlb              LoadBalancer   172.30.129.142   <pending>      80:32027/TCP,443:32591/TCP   29s
      
      $ oc -n openshift-ingress describe svc router-extlb
      <......>
        Warning  SyncLoadBalancerFailed  3m25s (x7 over 8m45s)  service-controller  Error syncing load balancer: failed to ensure load balancer: failed to map VM Name to NodeName: VM Name ci-op-mk6yqx4d-eed09-5rvxk-rhel-2
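
To compare the VM named in the event with the node list from `oc get node`, the name can be pulled out of the warning text. This is a minimal sketch, assuming the message keeps the exact wording shown above; the function name `vm_name_from_event` is hypothetical.

```shell
# Hypothetical helper: extract the VM name from a SyncLoadBalancerFailed
# event line of the form
#   "... failed to map VM Name to NodeName: VM Name <name>"
# Reads event text on stdin and prints one VM name per matching line.
vm_name_from_event() {
  sed -n 's/.*failed to map VM Name to NodeName: VM Name \([^ ]*\).*/\1/p'
}

# Usage against a live cluster (not executed here):
#   oc -n openshift-ingress describe svc router-extlb | vm_name_from_event
```

In this report the extracted name is one of the two RHEL workers that `oc get node` lists as Ready.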
      
      

      Expected results:

      The LB should be provisioned after the RHEL workers are added.

      Additional info:

      No issue occurs without scaling up RHEL worker nodes.

              rh-ee-nbrubake Nolan Brubaker
              rhn-support-hongli Hongan Li
              Zhaohua Sun Zhaohua Sun