OCPBUGS-10526

EgressIP doesn't work in GCP XPN cluster

Details

    • No
    • SDN Sprint 234
    • 1
    • Rejected
    • False

    Description

      Description of problem:

      
      

      Version-Release number of selected component (if applicable):

       4.13.0-0.nightly-2023-03-17-161027 
      
      

      How reproducible:

      Always
      
      

      Steps to Reproduce:

      1. Create a GCP XPN cluster with the flexy job template ipi-on-gcp/versioned-installer-xpn-ci, then run 'oc describe node'.
      
      2. Check the logs of the cloud-network-config-controller pods.
      
      

      Actual results:

      
       % oc get nodes
      NAME                                                          STATUS   ROLES                  AGE    VERSION
      huirwang-0309d-r85mj-master-0.c.openshift-qe.internal         Ready    control-plane,master   173m   v1.26.2+06e8c46
      huirwang-0309d-r85mj-master-1.c.openshift-qe.internal         Ready    control-plane,master   173m   v1.26.2+06e8c46
      huirwang-0309d-r85mj-master-2.c.openshift-qe.internal         Ready    control-plane,master   173m   v1.26.2+06e8c46
      huirwang-0309d-r85mj-worker-a-wsrls.c.openshift-qe.internal   Ready    worker                 162m   v1.26.2+06e8c46
      huirwang-0309d-r85mj-worker-b-5txgq.c.openshift-qe.internal   Ready    worker                 162m   v1.26.2+06e8c46
       In the `oc describe node` output, there are no egressIP-related annotations:
      % oc describe node huirwang-0309d-r85mj-worker-a-wsrls.c.openshift-qe.internal 
      Name:               huirwang-0309d-r85mj-worker-a-wsrls.c.openshift-qe.internal
      Roles:              worker
      Labels:             beta.kubernetes.io/arch=amd64
                          beta.kubernetes.io/instance-type=n2-standard-4
                          beta.kubernetes.io/os=linux
                          failure-domain.beta.kubernetes.io/region=us-central1
                          failure-domain.beta.kubernetes.io/zone=us-central1-a
                          kubernetes.io/arch=amd64
                          kubernetes.io/hostname=huirwang-0309d-r85mj-worker-a-wsrls.c.openshift-qe.internal
                          kubernetes.io/os=linux
                          machine.openshift.io/interruptible-instance=
                          node-role.kubernetes.io/worker=
                          node.kubernetes.io/instance-type=n2-standard-4
                          node.openshift.io/os_id=rhcos
                          topology.gke.io/zone=us-central1-a
                          topology.kubernetes.io/region=us-central1
                          topology.kubernetes.io/zone=us-central1-a
      Annotations:        csi.volume.kubernetes.io/nodeid:
                            {"pd.csi.storage.gke.io":"projects/openshift-qe/zones/us-central1-a/instances/huirwang-0309d-r85mj-worker-a-wsrls"}
                          k8s.ovn.org/host-addresses: ["10.0.32.117"]
                          k8s.ovn.org/l3-gateway-config:
                            {"default":{"mode":"shared","interface-id":"br-ex_huirwang-0309d-r85mj-worker-a-wsrls.c.openshift-qe.internal","mac-address":"42:01:0a:00:...
                          k8s.ovn.org/node-chassis-id: 7fb1870c-4315-4dcb-910c-0f45c71ad6d3
                          k8s.ovn.org/node-gateway-router-lrp-ifaddr: {"ipv4":"100.64.0.5/16"}
                          k8s.ovn.org/node-mgmt-port-mac-address: 16:52:e3:8c:13:e2
                          k8s.ovn.org/node-primary-ifaddr: {"ipv4":"10.0.32.117/32"}
                          k8s.ovn.org/node-subnets: {"default":["10.131.0.0/23"]}
                          machine.openshift.io/machine: openshift-machine-api/huirwang-0309d-r85mj-worker-a-wsrls
                          machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
                          machineconfiguration.openshift.io/currentConfig: rendered-worker-bec5065070ded51e002c566a9c5bd16a
                          machineconfiguration.openshift.io/desiredConfig: rendered-worker-bec5065070ded51e002c566a9c5bd16a
                          machineconfiguration.openshift.io/desiredDrain: uncordon-rendered-worker-bec5065070ded51e002c566a9c5bd16a
                          machineconfiguration.openshift.io/lastAppliedDrain: uncordon-rendered-worker-bec5065070ded51e002c566a9c5bd16a
                          machineconfiguration.openshift.io/reason: 
                          machineconfiguration.openshift.io/state: Done
                          volumes.kubernetes.io/controller-managed-attach-detach: true
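
      The missing annotation can also be checked across all nodes at once. The following is a minimal sketch, not part of the original report. It assumes the annotation key is `cloud.network.openshift.io/egress-ipconfig` (the key the cloud-network-config-controller is expected to write; it does not appear anywhere in the output above), and the helper names are hypothetical. The input is the parsed output of `oc get nodes -o json`.

```python
# Annotation key the cloud-network-config-controller is expected to set on a
# node once it has read that node's cloud network configuration.
# ASSUMPTION: this key is not shown in the report; it is taken from general
# OpenShift egress IP behaviour.
EGRESS_IPCONFIG_ANNOTATION = "cloud.network.openshift.io/egress-ipconfig"


def has_egress_ipconfig(node: dict) -> bool:
    """Return True if the node object carries a non-empty egress-ipconfig annotation."""
    annotations = node.get("metadata", {}).get("annotations") or {}
    return bool(annotations.get(EGRESS_IPCONFIG_ANNOTATION))


def nodes_missing_egress_ipconfig(node_list: dict) -> list:
    """Given a parsed NodeList, return the names of nodes lacking the annotation."""
    return [
        node["metadata"]["name"]
        for node in node_list.get("items", [])
        if not has_egress_ipconfig(node)
    ]
```

      To use it, load the JSON with `json.load` and pass the result to `nodes_missing_egress_ipconfig`. On the cluster in this report, it would presumably list every node, since no node shows the annotation.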
      
      
       % oc logs cloud-network-config-controller-5cd96d477d-2kmc9  -n openshift-cloud-network-config-controller  
      W0320 03:00:08.981493       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
      I0320 03:00:08.982280       1 leaderelection.go:248] attempting to acquire leader lease openshift-cloud-network-config-controller/cloud-network-config-controller-lock...
      E0320 03:00:38.982868       1 leaderelection.go:330] error retrieving resource lock openshift-cloud-network-config-controller/cloud-network-config-controller-lock: Get "https://api-int.huirwang-0309d.qe.gcp.devcluster.openshift.com:6443/api/v1/namespaces/openshift-cloud-network-config-controller/configmaps/cloud-network-config-controller-lock": dial tcp: lookup api-int.huirwang-0309d.qe.gcp.devcluster.openshift.com: i/o timeout
      E0320 03:01:23.863454       1 leaderelection.go:330] error retrieving resource lock openshift-cloud-network-config-controller/cloud-network-config-controller-lock: Get "https://api-int.huirwang-0309d.qe.gcp.devcluster.openshift.com:6443/api/v1/namespaces/openshift-cloud-network-config-controller/configmaps/cloud-network-config-controller-lock": dial tcp: lookup api-int.huirwang-0309d.qe.gcp.devcluster.openshift.com on 172.30.0.10:53: read udp 10.129.0.14:52109->172.30.0.10:53: read: connection refused
      I0320 03:02:19.249359       1 leaderelection.go:258] successfully acquired lease openshift-cloud-network-config-controller/cloud-network-config-controller-lock
      I0320 03:02:19.250662       1 controller.go:88] Starting node controller
      I0320 03:02:19.250681       1 controller.go:91] Waiting for informer caches to sync for node workqueue
      I0320 03:02:19.250693       1 controller.go:88] Starting secret controller
      I0320 03:02:19.250703       1 controller.go:91] Waiting for informer caches to sync for secret workqueue
      I0320 03:02:19.250709       1 controller.go:88] Starting cloud-private-ip-config controller
      I0320 03:02:19.250715       1 controller.go:91] Waiting for informer caches to sync for cloud-private-ip-config workqueue
      I0320 03:02:19.258642       1 controller.go:182] Assigning key: huirwang-0309d-r85mj-master-2.c.openshift-qe.internal to node workqueue
      I0320 03:02:19.258671       1 controller.go:182] Assigning key: huirwang-0309d-r85mj-master-1.c.openshift-qe.internal to node workqueue
      I0320 03:02:19.258682       1 controller.go:182] Assigning key: huirwang-0309d-r85mj-master-0.c.openshift-qe.internal to node workqueue
      I0320 03:02:19.351258       1 controller.go:96] Starting node workers
      I0320 03:02:19.351303       1 controller.go:102] Started node workers
      I0320 03:02:19.351298       1 controller.go:96] Starting secret workers
      I0320 03:02:19.351331       1 controller.go:102] Started secret workers
      I0320 03:02:19.351265       1 controller.go:96] Starting cloud-private-ip-config workers
      I0320 03:02:19.351508       1 controller.go:102] Started cloud-private-ip-config workers
      E0320 03:02:19.589704       1 controller.go:165] error syncing 'huirwang-0309d-r85mj-master-1.c.openshift-qe.internal': error retrieving the private IP configuration for node: huirwang-0309d-r85mj-master-1.c.openshift-qe.internal, err: error retrieving the network interface subnets, err: googleapi: Error 404: The resource 'projects/openshift-qe/regions/us-central1/subnetworks/installer-shared-vpc-subnet-1' was not found, notFound, requeuing in node workqueue
      E0320 03:02:19.615551       1 controller.go:165] error syncing 'huirwang-0309d-r85mj-master-0.c.openshift-qe.internal': error retrieving the private IP configuration for node: huirwang-0309d-r85mj-master-0.c.openshift-qe.internal, err: error retrieving the network interface subnets, err: googleapi: Error 404: The resource 'projects/openshift-qe/regions/us-central1/subnetworks/installer-shared-vpc-subnet-1' was not found, notFound, requeuing in node workqueue
      E0320 03:02:19.644628       1 controller.go:165] error syncing 'huirwang-0309d-r85mj-master-2.c.openshift-qe.internal': error retrieving the private IP configuration for node: huirwang-0309d-r85mj-master-2.c.openshift-qe.internal, err: error retrieving the network interface subnets, err: googleapi: Error 404: The resource 'projects/openshift-qe/regions/us-central1/subnetworks/installer-shared-vpc-subnet-1' was not found, notFound, requeuing in node workqueue
      E0320 03:02:19.774047       1 controller.go:165] error syncing 'huirwang-0309d-r85mj-master-0.c.openshift-qe.internal': error retrieving the private IP configuration for node: huirwang-0309d-r85mj-master-0.c.openshift-qe.internal, err: error retrieving the network interface subnets, err: googleapi: Error 404: The resource 'projects/openshift-qe/regions/us-central1/subnetworks/installer-shared-vpc-subnet-1' was not found, notFound, requeuing in node workqueue
      E0320 03:02:19.783309       1 controller.go:165] error syncing 'huirwang-0309d-r85mj-master-1.c.openshift-qe.internal': error retrieving the private IP configuration for node: huirwang-0309d-r85mj-master-1.c.openshift-qe.internal, err: error retrieving the network interface subnets, err: googleapi: Error 404: The resource 'projects/openshift-qe/regions/us-central1/subnetworks/installer-shared-vpc-subnet-1' was not found, notFound, requeuing in node workqueue
      E0320 03:02:19.816430       1 controller.go:165] error syncing 'huirwang-0309d-r85mj-master-2.c.openshift-qe.internal': error retrieving the private IP configuration for node: huirwang-0309d-r85mj-master-2.c.openshift-qe.internal, err: error retrieving the network interface subnets, err: googleapi: Error 404: The resource 'projects/openshift-qe/regions/us-central1/subnetworks/installer-shared-vpc-subnet-1' was not found, notFound, requeuing in node workqueue
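
      The 404 errors above look the subnet up under `projects/openshift-qe`, which is the project that owns the node instances (see the CSI node ID annotation). In a shared VPC (XPN) setup the subnet normally belongs to a separate host project, so one plausible reading is that the lookup used the instance's project rather than the subnet's own project. The following is a minimal sketch of deriving the project from the subnetwork URL instead. It is an illustration under that assumption, not the controller's actual code; `SubnetRef`, `parse_subnet_url`, and the host project name `example-host-project` are hypothetical.

```python
import re
from typing import NamedTuple


class SubnetRef(NamedTuple):
    project: str
    region: str
    name: str


# Matches the trailing path of a GCE subnetwork URL or partial resource path:
#   .../projects/<project>/regions/<region>/subnetworks/<name>
_SUBNET_RE = re.compile(
    r"projects/(?P<project>[^/]+)/regions/(?P<region>[^/]+)/subnetworks/(?P<name>[^/]+)$"
)


def parse_subnet_url(url: str) -> SubnetRef:
    """Split a subnetwork URL into project, region, and subnet name."""
    match = _SUBNET_RE.search(url)
    if match is None:
        raise ValueError(f"not a subnetwork URL: {url!r}")
    return SubnetRef(match["project"], match["region"], match["name"])


# Hypothetical URL as an interface in a shared VPC might report it: the
# project segment names the host project, not the project that owns the node.
example = parse_subnet_url(
    "https://www.googleapis.com/compute/v1/projects/example-host-project"
    "/regions/us-central1/subnetworks/installer-shared-vpc-subnet-1"
)
```

      With the project taken from the URL, a subnet lookup would target `example.project` rather than the node's project, which is the distinction the errors above appear to hinge on.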
      
      

      Expected results:

      EgressIP should work
      

      Additional info:

      It can be reproduced in 4.12 as well, so it is not a regression.
      
      

            People

              jluhrsen Jamo Luhrsen
              huirwang Huiran Wang
              Huiran Wang Huiran Wang