  1. OpenShift Bugs
  2. OCPBUGS-9296

cni-podman0 interface hardcoded network range can overlap cluster node IPs


Details

    • Moderate
    • Unspecified

    Description

      OCP Version: 4.8.36

      Description of problem:
      A customer's production cluster, already installed and running, has OpenShift nodes with 10.88.x.x IP addresses. When they used podman to run a test container for an fio disk test, podman created a cni-podman0 interface and added a 10.88.0.0/16 route on the node. Because they ran the test on every control plane node, the control plane nodes lost connectivity to the infra nodes. Investigation showed that the interface's network range is hardcoded in /etc/cni/net.d/87-podman-bridge.conflist.

      The interface persisted after the container exited and was removed.
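
To confirm which range the podman bridge will claim on a given node, the subnet can be read straight from the conflist. The following is a minimal sketch, not a supported procedure: the helper name cni_subnets is hypothetical, and it assumes the file stores each range as a JSON "subnet" key with a quoted CIDR value.

```shell
# Print every "subnet" value found in a CNI conflist file.
# Hypothetical helper; assumes the range appears as "subnet": "<CIDR>".
cni_subnets() {
  grep -o '"subnet"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" \
    | sed 's/.*"\([^"]*\)"$/\1/'
}
```

On a node, after chroot /host, this would be called as: cni_subnets /etc/cni/net.d/87-podman-bridge.conflist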

      To reproduce:
      $ for master in $( oc get nodes -l node-role.kubernetes.io/master -oname | awk -F/ '{print $2}' ) ; do echo $master ; oc debug node/$master -- chroot /host podman run --volume /var/lib/etcd:/var/lib/etcd:Z quay.io/openshift-scale/etcd-perf &>/tmp/${master}_fio.lst ; done

      master-0.odile.redhat.com
      master-1.odile.redhat.com
      master-2.odile.redhat.com

      $ oc debug node/master-0.odile.redhat.com
      Starting pod/master-0odileredhatcom-debug ...
      To use host binaries, run `chroot /host`
      Pod IP: 10.0.93.98
      If you don't see a command prompt, try pressing enter.
      sh-4.4# chroot /host
      sh-4.4# netstat -rn
      Kernel IP routing table
      Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
      0.0.0.0         10.0.95.254     0.0.0.0         UG        0 0          0 br-ex
      10.0.88.0       0.0.0.0         255.255.248.0   U         0 0          0 br-ex
      10.88.0.0       0.0.0.0         255.255.0.0     U         0 0          0 cni-podman0
      10.128.0.0      0.0.0.0         255.255.254.0   U         0 0          0 ovn-k8s-mp0
      10.128.0.0      10.128.0.1      255.252.0.0     UG        0 0          0 ovn-k8s-mp0
      169.254.169.0   10.0.95.254     255.255.255.252 UG        0 0          0 br-ex
      169.254.169.3   10.128.0.1      255.255.255.255 UGH       0 0          0 ovn-k8s-mp0
      169.254.169.254 10.0.88.10      255.255.255.255 UGH       0 0          0 br-ex
      172.30.0.0      10.0.95.254     255.255.0.0     UG        0 0          0 br-ex
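
To spot the leftover route without reading the whole table, the netstat output can be filtered by interface. This is an illustrative sketch only: routes_on_iface is a hypothetical helper, and it assumes the eight-column `netstat -rn` layout shown above, with the interface name in the last column.

```shell
# Print "destination genmask" for every route bound to interface $1,
# reading `netstat -rn` output from stdin.
# Hypothetical helper; assumes the interface name is the last field.
routes_on_iface() {
  awk -v ifc="$1" '$NF == ifc { print $1, $3 }'
}
```

For example: netstat -rn | routes_on_iface cni-podman0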

      If a cluster has OCP nodes whose IP addresses fall within 10.88.0.0/16, any node on which the cni-podman0 interface is created loses connectivity to those nodes, because traffic for 10.88.0.0/16 is then routed to cni-podman0.
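
Whether a given node address collides with the podman range can be checked before running podman on a node. The sketch below is an assumption-laden illustration rather than a supported check: ip_to_int and ip_in_cidr are hypothetical helper names, IPv4 only, and the input is assumed to be a well-formed dotted quad without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  # Split on dots inside a subshell so the IFS change does not leak.
  (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  )
}

# Succeed (exit status 0) if address $1 lies inside CIDR $2.
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  prefix=${2#*/}
  # Build the netmask from the prefix length, clamped to 32 bits.
  mask=$(( (4294967295 << (32 - $prefix)) & 4294967295 ))
  [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
}
```

For example, `ip_in_cidr 10.88.1.5 10.88.0.0/16 && echo overlap` reports an overlap, while the Pod IP 10.0.93.98 from the debug session above does not match. Node addresses to test can be read from the INTERNAL-IP column of `oc get nodes -o wide`.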

      Additional info:
      I looked in the documentation for any statements or warnings about not using 10.88.0.0/16 for cluster nodes, but was not able to find anything. I was also not able to find whether changing the interface's configuration would be supported, or would persist, if altered by a customer on an OpenShift cluster.

      This BZ [0] seems related, although my customer did not encounter the issue during installation, only on an already running cluster when following the steps in [1] to run the fio test.

      [0] https://bugzilla.redhat.com/show_bug.cgi?id=1723798
      [1] https://access.redhat.com/solutions/4885641

          People

            tsweeney@redhat.com Tom Sweeney
            rhn-support-jacrawfo James Crawford (Inactive)