OpenShift Bugs / OCPBUGS-31259

ipfailover VIP not reachable on OSP IPI


      Description of problem:

      On OSP IPI clusters, an ipfailover deployment does not work: the ipfailover VIP is not included in the allowed_address_pairs of the node's port, so the VIP is unreachable.

      Version-Release number of selected component (if applicable):

      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.14.16   True        False         25m     Cluster version is 4.14.16

      How reproducible:

      100%

      Steps to Reproduce:

      1. Configure ipfailover as documented [1]:
      
      $ oc get deployment ipfailover-keepalived -o yaml|yq '.spec.template.spec.containers[0].env[]|select(.name=="OPENSHIFT_HA_VIRTUAL_IPS")' 
      name: OPENSHIFT_HA_VIRTUAL_IPS 
      value: 192.168.0.9
       
      2. Verify that one of the nodes becomes master:
      
      $ oc logs ipfailover-keepalived-78c9cdb6f6-bb7lh |tail -3
      Thu Mar 21 10:43:12 2024: VRRP_Script(chk_ipfailover) succeeded
      Thu Mar 21 10:43:12 2024: (ipfailover_VIP_1) Entering BACKUP STATE
      Thu Mar 21 10:43:16 2024: (ipfailover_VIP_1) Entering MASTER STATE   
      
      $ oc get pod ipfailover-keepalived-78c9cdb6f6-bb7lh  -o wide 
      NAME                                     READY   STATUS    RESTARTS   AGE   IP              NODE                             NOMINATED NODE   READINESS GATES
      ipfailover-keepalived-78c9cdb6f6-bb7lh   1/1     Running   0          50s   192.168.0.216   mycluster-5pq95-worker-0-r52d8   <none>           <none>
      
      3. Verify that the HA VIP address is not reachable:
      
      $ ping 192.168.0.9
      PING 192.168.0.9 (192.168.0.9) 56(84) bytes of data.
      From 192.168.0.70 icmp_seq=1 Destination Host Unreachable
      From 192.168.0.70 icmp_seq=2 Destination Host Unreachable
      From 192.168.0.70 icmp_seq=3 Destination Host Unreachable     
      
      [1] https://docs.openshift.com/container-platform/4.14/networking/configuring-ipfailover.html#nw-ipfailover-configuration_configuring-ipfailover
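The check in step 2 can be scripted. The following is a minimal sketch, assuming the keepalived log lines have the "Entering <STATE> STATE" form shown above; the function name `last_vrrp_state` is hypothetical and not part of any OpenShift tooling.

```shell
# Print the most recent VRRP state (for example MASTER or BACKUP) found in
# keepalived log text read from stdin. Prints nothing if no state
# transition appears in the input.
last_vrrp_state() {
    sed -n 's/.*Entering \([A-Z]*\) STATE.*/\1/p' | tail -n 1
}
```

For example: `oc logs ipfailover-keepalived-78c9cdb6f6-bb7lh | last_vrrp_state`.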

      Actual results:

      The VIP is assigned to the node hosting the master pod:
      
      $ oc debug node/mycluster-5pq95-worker-0-r52d8
      <...>
      sh-4.4# chroot /host
      sh-5.1# toolbox 
      <...>
      [root@mycluster-5pq95-worker-0-r52d8 /]# ip a show  br-ex
      6: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 7950 qdisc noqueue state UNKNOWN group default qlen 1000
          link/ether fa:16:3e:31:f9:cb brd ff:ff:ff:ff:ff:ff
          inet 192.168.0.216/24 brd 192.168.0.255 scope global dynamic noprefixroute br-ex
             valid_lft 41977sec preferred_lft 41977sec
          inet 169.254.169.2/29 brd 169.254.169.7 scope global br-ex
             valid_lft forever preferred_lft forever
          inet 192.168.0.9/32 scope global br-ex
             valid_lft forever preferred_lft forever
          inet6 fe80::f816:3eff:fe31:f9cb/64 scope link noprefixroute 
             valid_lft forever preferred_lft forever
      
      During the ping, the node receives the ARP requests and sends replies:
      
      [root@mycluster-5pq95-worker-0-r52d8 /]# tcpdump -nni br-ex arp
      dropped privs to tcpdump
      tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
      listening on br-ex, link-type EN10MB (Ethernet), snapshot length 262144 bytes
      10:50:10.710574 ARP, Request who-has 192.168.0.9 tell 192.168.0.70, length 28
      10:50:10.710589 ARP, Reply 192.168.0.9 is-at fa:16:3e:31:f9:cb, length 28
      10:50:11.735130 ARP, Request who-has 192.168.0.9 tell 192.168.0.70, length 28
      10:50:11.735141 ARP, Reply 192.168.0.9 is-at fa:16:3e:31:f9:cb, length 28
      10:50:12.759191 ARP, Request who-has 192.168.0.9 tell 192.168.0.70, length 28
      10:50:12.759202 ARP, Reply 192.168.0.9 is-at fa:16:3e:31:f9:cb, length 28
      
      On the client, no ARP replies arrive:
      
      $ sudo tcpdump -i eth1 -nn arp
      dropped privs to tcpdump
      tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
      listening on eth1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
      10:47:29.011901 ARP, Request who-has 192.168.0.9 tell 192.168.0.70, length 28
      10:47:30.037515 ARP, Request who-has 192.168.0.9 tell 192.168.0.70, length 28
      10:47:31.061572 ARP, Request who-has 192.168.0.9 tell 192.168.0.70, length 28 
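The two captures can be compared by counting ARP replies for the VIP on each side. This is a rough sketch that assumes tcpdump text output in the format shown above; `count_arp_replies` is a hypothetical helper name.

```shell
# Count ARP replies announcing the given IP in tcpdump text output read
# from stdin. The trailing " is-at " keeps 192.168.0.9 from also matching
# addresses such as 192.168.0.90.
count_arp_replies() {
    vip=$1
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable under "set -e" while still printing 0.
    grep -c -F "ARP, Reply ${vip} is-at " || true
}
```

With the captures in this report, the node-side output yields a non-zero count and the client-side output yields 0, which suggests the replies are dropped somewhere between the node and the client.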
      
      This happens because only the API and Ingress VIP addresses are configured as allowed_address_pairs on the node's port:
      
      $ openstack port list --server mycluster-5pq95-worker-0-r52d8
      +--------------------------------------+----------------------------------+-------------------+------------------------------------------------------------------------------+--------+
      | ID                                   | Name                             | MAC Address       | Fixed IP Addresses                                                           | Status |
      +--------------------------------------+----------------------------------+-------------------+------------------------------------------------------------------------------+--------+
      | 8b050c6f-62ff-41a0-b2c9-6ece8be20ace | mycluster-5pq95-worker-0-r52d8-0 | fa:16:3e:31:f9:cb | ip_address='192.168.0.216', subnet_id='a09f2564-4fb8-4d04-88b7-a4f01b2367b0' | ACTIVE |
      +--------------------------------------+----------------------------------+-------------------+------------------------------------------------------------------------------+--------+
      
      $ openstack port show 8b050c6f-62ff-41a0-b2c9-6ece8be20ace -c allowed_address_pairs
      +-----------------------+-----------------------------------------------------------+
      | Field                 | Value                                                     |
      +-----------------------+-----------------------------------------------------------+
      | allowed_address_pairs | ip_address='192.168.0.5', mac_address='fa:16:3e:31:f9:cb' |
      |                       | ip_address='192.168.0.7', mac_address='fa:16:3e:31:f9:cb' |
      +-----------------------+-----------------------------------------------------------+
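Whether a given VIP is already allowed on a port can be checked by searching this output. The sketch below assumes the `ip_address='...'` rendering shown in this report; other client versions may format the field differently, and `vip_allowed` is a hypothetical helper name.

```shell
# Succeed (exit 0) if the given IP appears as an allowed address pair in
# "openstack port show ... -c allowed_address_pairs" output read from stdin.
# Assumes entries are rendered as ip_address='<ip>', as in this report.
vip_allowed() {
    vip=$1
    grep -q -F "ip_address='${vip}'"
}
```

For example: `openstack port show 8b050c6f-62ff-41a0-b2c9-6ece8be20ace -c allowed_address_pairs | vip_allowed 192.168.0.9 || echo "VIP not allowed on this port"`.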

      Expected results:

      Once the failover VIP is added to the port's allowed_address_pairs, the VIP is reachable:
      
      
      $ openstack port set 8b050c6f-62ff-41a0-b2c9-6ece8be20ace --allowed-address ip-address=192.168.0.9,mac-address=fa:16:3e:31:f9:cb
      
      ...
      64 bytes from 192.168.0.9: icmp_seq=27 ttl=64 time=0.705 ms
      64 bytes from 192.168.0.9: icmp_seq=28 ttl=64 time=0.449 ms
      64 bytes from 192.168.0.9: icmp_seq=29 ttl=64 time=0.472 ms
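Because keepalived can move the VIP to another node, the same workaround probably has to be applied to the port of every node that may host an ipfailover pod. The following is a dry-run sketch that only prints the command for one port so it can be reviewed first; `print_allow_vip_cmd` is a hypothetical helper, and the port ID and MAC address must be looked up per node.

```shell
# Print (do not execute) the "openstack port set" command that adds a VIP
# to a port's allowed_address_pairs.
# Arguments: <port-id> <vip> <port-mac-address>
print_allow_vip_cmd() {
    port_id=$1
    vip=$2
    mac=$3
    printf 'openstack port set %s --allowed-address ip-address=%s,mac-address=%s\n' \
        "$port_id" "$vip" "$mac"
}
```

For example, `print_allow_vip_cmd 8b050c6f-62ff-41a0-b2c9-6ece8be20ace 192.168.0.9 fa:16:3e:31:f9:cb` prints the command used above.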
      

      Additional info:

          

            rhn-gps-mbooth Matthew Booth
            rhn-support-bverschu Bram Verschueren