OSPRH-12918

BZ#2271738 Operation not supported when removing IP address RHEL9 + RHOSP17.1

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • None
    • python-pyroute2
    • False
    • False
    • Committed
    • Committed
    • Committed
    • None

      +++ This bug was initially created as a clone of Bug #2094986 +++

      Description of problem:
      When running the Neutron metadata agent from OpenStack on the latest kernel from Rawhide, pyroute2 fails to remove an IP address with "Operation not supported". Downgrading to the original kernel, or to the RHEL9 kernel, works fine.

      Version-Release number of selected component (if applicable):
      kernel-core-5.19.0-0.rc0.20220603git50fd82b3a9a9.11.fc37.x86_64
      openvswitch-2.17.0-4.fc37.x86_64
      python3-pyroute2-0.6.9-1.fc37.noarch

      How reproducible:
      all the time

      Steps to Reproduce:
      1. create an interface in its own namespace

      # in a fresh VM based on Fedora-Cloud-Base-Rawhide-20220605.n.0.x86_64.qcow2
      dnf install openvswitch python3-pyroute2
      systemctl start openvswitch
      ovs-vsctl add-br br0
      ovs-vsctl add-port br0 p1 -- set Interface p1 type=internal

      ip netns add ns
      ip link set p1 netns ns
      ip netns exec ns ip link set p1 up

      2. get its link local IPv6 IP
      ip netns exec ns ip a
      1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      5: p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
          link/ether 32:f5:52:cf:f8:97 brd ff:ff:ff:ff:ff:ff
          inet6 fe80::30f5:52ff:fecf:f897/64 scope link
             valid_lft forever preferred_lft forever

      3. Try to remove it with pyroute2 (inspired by Neutron: https://github.com/openstack/neutron/blob/master/neutron/privileged/agent/linux/ip_lib.py)

      import pyroute2
      import socket

      def get_iproute(namespace):
          # From iproute.py:
          # `IPRoute` -- RTNL API to the current network namespace
          # `NetNS` -- RTNL API to another network namespace
          if namespace:
              # do not try and create the namespace
              return pyroute2.NetNS(namespace, flags=0)
          else:
              return pyroute2.IPRoute()

      def get_link_id(device, namespace, raise_exception=True):
          with get_iproute(namespace) as ip:
              link_id = ip.link_lookup(ifname=device)
          return link_id[0]

      def _run_iproute_addr(command, device, namespace, **kwargs):
          with get_iproute(namespace) as ip:
              idx = get_link_id(device, namespace)
              return ip.addr(command, index=idx, **kwargs)

      _IP_VERSION_FAMILY_MAP = {4: socket.AF_INET, 6: socket.AF_INET6}

      def delete_ip_address(ip_version, ip, prefixlen, device, namespace):
          family = _IP_VERSION_FAMILY_MAP[ip_version]
          _run_iproute_addr("delete",
                            device,
                            namespace,
                            address=ip,
                            mask=prefixlen,
                            family=family)

      delete_ip_address(4, "fe80::30f5:52ff:fecf:f897", 64, "p1", "ns")

      Actual results:
      >>> delete_ip_address(4, "fe80::30f5:52ff:fecf:f897", 64, "p1", "ns")
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "<stdin>", line 3, in delete_ip_address
        File "<stdin>", line 4, in _run_iproute_addr
        File "/usr/lib/python3.10/site-packages/pr2modules/iproute/linux.py", line 1635, in addr
          ret = self.nlm_request(
        File "/usr/lib/python3.10/site-packages/pr2modules/netlink/nlsocket.py", line 403, in nlm_request
          return tuple(self._genlm_request(*argv, **kwarg))
        File "/usr/lib/python3.10/site-packages/pr2modules/netlink/nlsocket.py", line 908, in nlm_request
          for msg in self.get(
        File "/usr/lib/python3.10/site-packages/pr2modules/netlink/nlsocket.py", line 406, in get
          return tuple(self._genlm_get(*argv, **kwarg))
        File "/usr/lib/python3.10/site-packages/pr2modules/netlink/nlsocket.py", line 734, in get
          raise msg['header']['error']
      pr2modules.netlink.exceptions.NetlinkError: (95, 'Operation not supported')

      Expected results:
      same as what you have with the ip command:
      ip netns exec ns ip addr del fe80::30f5:52ff:fecf:f897/64 dev p1
      ip netns exec ns ip a
      1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      5: p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
          link/ether 32:f5:52:cf:f8:97 brd ff:ff:ff:ff:ff:ff
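
The `ip` invocation in the expected results can also be driven from Python. The following is only a sketch: the helper names `build_addr_del_command` and `delete_ip_address_cli` are hypothetical and appear nowhere in Neutron or pyroute2, and actually running the command needs root and an existing namespace.

```python
import subprocess


def build_addr_del_command(address, prefixlen, device, namespace=None):
    """Return the argv list for `ip addr del`, optionally wrapped in a netns."""
    cmd = ["ip", "addr", "del", f"{address}/{prefixlen}", "dev", device]
    if namespace:
        # Same form as the report: ip netns exec <ns> ip addr del ...
        cmd = ["ip", "netns", "exec", namespace] + cmd
    return cmd


def delete_ip_address_cli(address, prefixlen, device, namespace=None):
    """Run the command; raises CalledProcessError if `ip` exits non-zero."""
    subprocess.run(
        build_addr_del_command(address, prefixlen, device, namespace),
        check=True,
    )


# Only build and show the command here; nothing is executed.
print(" ".join(build_addr_del_command("fe80::30f5:52ff:fecf:f897", 64, "p1", "ns")))
```

Building the argument list separately from running it keeps the command easy to inspect before anything touches the namespace.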

      Additional info:

      — Additional comment from François Rigault on 2022-06-08 19:39:45 UTC —

      ... of course same error with the correct IP address family...

      >>> delete_ip_address(6, "fe80::30f5:52ff:fecf:f897", 64, "p1", "ns")
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "<stdin>", line 3, in delete_ip_address
        File "<stdin>", line 4, in _run_iproute_addr
        File "/usr/lib/python3.10/site-packages/pr2modules/iproute/linux.py", line 1635, in addr
          ret = self.nlm_request(
        File "/usr/lib/python3.10/site-packages/pr2modules/netlink/nlsocket.py", line 403, in nlm_request
          return tuple(self._genlm_request(*argv, **kwarg))
        File "/usr/lib/python3.10/site-packages/pr2modules/netlink/nlsocket.py", line 908, in nlm_request
          for msg in self.get(
        File "/usr/lib/python3.10/site-packages/pr2modules/netlink/nlsocket.py", line 406, in get
          return tuple(self._genlm_get(*argv, **kwarg))
        File "/usr/lib/python3.10/site-packages/pr2modules/netlink/nlsocket.py", line 734, in get
          raise msg['header']['error']
      pr2modules.netlink.exceptions.NetlinkError: (95, 'Operation not supported')

      If we move back to the Fedora 36 kernel:
      sudo dnf install --setopt releasever=36 --enablerepo fedora kernel-core-5.17.5-300.fc36

      then delete_ip_address(6, "fe80::30f5:52ff:fecf:f897", 64, "p1", "ns") works fine,
      and delete_ip_address(4, "fe80::30f5:52ff:fecf:f897", 64, "p1", "ns") (with the incorrect IP family) gives "Cannot assign requested address", not "Operation not supported".
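
The two failure modes above can be told apart by the numeric code in the error tuple. Here is a minimal sketch, assuming the exception exposes that number as a `code` attribute. `FakeNetlinkError` is a stand-in defined here so the snippet runs without pyroute2 installed. 95 is taken from the tracebacks; 99 is the Linux errno for "Cannot assign requested address" and does not appear as a number in this report.

```python
# Linux errno numbers for the two messages quoted in this report.
EOPNOTSUPP = 95      # 'Operation not supported'
EADDRNOTAVAIL = 99   # 'Cannot assign requested address'


class FakeNetlinkError(Exception):
    """Stand-in for a netlink error carrying (code, message); hypothetical."""

    def __init__(self, code, message):
        super().__init__(code, message)
        self.code = code


def explain_netlink_error(exc):
    """Map the error code to a hint based on what was observed in this report."""
    code = getattr(exc, "code", None)
    if code == EOPNOTSUPP:
        return "request rejected by the kernel: suspect an old pyroute2 on a new kernel"
    if code == EADDRNOTAVAIL:
        return "address not found: check the address and the IP family"
    return "unrecognised error code: %r" % (code,)


print(explain_netlink_error(FakeNetlinkError(95, "Operation not supported")))
print(explain_netlink_error(FakeNetlinkError(99, "Cannot assign requested address")))
```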

      — Additional comment from Peter V. Saveliev on 2022-06-08 22:15:35 UTC —

      Being investigated

      — Additional comment from François Rigault on 2022-06-08 22:37:59 UTC —

      I've been trying to bisect using kernels I found under https://bodhi.fedoraproject.org/updates/?packages=kernel&page=1

      last good version is 5.19.0-0.rc0.20220525gitfdaf9a5840ac.2.fc37.x86_64
      first bad version is 5.19.0-0.rc0.20220526gitbabf0bb978e3.4.fc37.x86_64

      last tested (bad) version: kernel-core-5.19.0-0.rc0.20220603git50fd82b3a9a9.11.fc37.x86_64

      — Additional comment from Peter V. Saveliev on 2022-06-08 22:45:58 UTC —

      Confirmed for pyroute2 0.6.9, looks like already fixed in 0.6.11, checking, ETA 09.06.22

      — Additional comment from Peter V. Saveliev on 2022-06-08 22:52:23 UTC —

      NB: kernel isn't the cause by itself — most probably the issue was in the NLA packing in pyroute2 prior to 0.6.10, full report tomorrow

      — Additional comment from Peter V. Saveliev on 2022-06-08 23:24:23 UTC —

      The fix commit: https://github.com/svinota/pyroute2/commit/1eb08312de30a083bcfddfaa9c1d5e124b6368df

      The issue was in the message flags during delete requests, affected not only addresses.

      Fixed in the upstream starting from 0.6.10-0.6.11
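
To illustrate how message flags on a delete request can change meaning between kernels, here is a sketch using the flag values from `linux/netlink.h`. The comment above only says the problem was "in the message flags during delete requests". Which flags were actually sent is not stated in this report, so the `legacy_delete_flags` value below is an assumption made for illustration.

```python
# Flag values from linux/netlink.h.
NLM_F_REQUEST = 0x01
NLM_F_ACK = 0x04

# Modifiers for NEW (create) requests.
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400

# Modifiers for DELETE requests; they reuse the same bit positions.
NLM_F_NONREC = 0x100
NLM_F_BULK = 0x200


def has_bulk_bit(flags):
    """True if a delete request with these flags would be read as a bulk delete."""
    return bool(flags & NLM_F_BULK)


# Assumption: a delete sent with create-style modifiers still attached.
legacy_delete_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL
# A delete carrying only the request and acknowledgement bits.
plain_delete_flags = NLM_F_REQUEST | NLM_F_ACK

print(hex(legacy_delete_flags), has_bulk_bit(legacy_delete_flags))
print(hex(plain_delete_flags), has_bulk_bit(plain_delete_flags))
```

Because `NLM_F_EXCL` and `NLM_F_BULK` share bit 0x200, a kernel that understands bulk deletes could read a stray create-style bit as a bulk-delete request and refuse it for object types that do not support one. That would match the "Operation not supported" seen here, though the thread does not confirm this mechanism.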

      — Additional comment from François Rigault on 2022-06-09 08:32:35 UTC —

      Indeed, in fact you already described this problem in the commit.

      So kernel 5.19 requires pyroute2 >= 0.6.10 (for some operations). In my case pyroute2 is provided in a container image, so it is not updated at the same pace as the kernel. I guess it's the burden of distro maintainers not to upgrade the kernel before fixing any dependency issues.

      For the purpose of this bz, maybe it can stay open until a 0.6.10 or 0.6.11 makes it into Rawhide. I'll be happy to test and close.

      Thank you for your work on pyroute2.

      — Additional comment from Peter V. Saveliev on 2022-06-09 09:41:04 UTC —

      I would recommend skipping 0.6.10 and getting any version >= 0.6.11 (whichever is available at the time), as 0.6.10 contains an issue in a component that has been deprecated for quite some time but is still used in some projects.

      Regarding containerized environments: I try to maintain decent backward compatibility, so in general it should be safe to use the latest pyroute2 version (be it a tagged release or the master branch, since it passes the CI) on any kernel in common use. I still hesitate to remove some kernel 2.x legacy from the code.

      Thank you for keeping an eye!
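
The version recommendation above can be turned into a simple gate, for example in a container image build check. This is a minimal sketch: `parse_version` and `has_delete_fix` are hypothetical helper names, and the parser only understands dotted numeric versions such as those quoted in this report.

```python
import re

# Minimum version recommended in the comment above (0.6.10 is skipped).
MIN_FIXED = (0, 6, 11)


def parse_version(text):
    """Turn '0.6.11' into (0, 6, 11); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_delete_fix(version_text):
    """True if the given pyroute2 version is at or above the recommended minimum."""
    return parse_version(version_text) >= MIN_FIXED


for candidate in ("0.6.9", "0.6.10", "0.6.11", "0.6.12"):
    print(candidate, has_delete_fix(candidate))
```

Comparing integer tuples rather than strings matters here: as plain strings, "0.6.9" would sort after "0.6.11".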

      — Additional comment from François Rigault on 2022-06-17 12:44:33 UTC —

      I verified it works well with 0.6.12 and risked a pr there
      https://src.fedoraproject.org/rpms/python-pyroute2/pull-request/13

      — Additional comment from Fedora Update System on 2022-06-19 23:24:47 UTC —

      FEDORA-2022-86b08d5138 has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2022-86b08d5138

      — Additional comment from Fedora Update System on 2022-06-19 23:24:48 UTC —

      FEDORA-2022-2bae1c6517 has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-2bae1c6517

      — Additional comment from Fedora Update System on 2022-06-20 04:28:29 UTC —

      FEDORA-2022-86b08d5138 has been pushed to the Fedora 35 testing repository.
      Soon you'll be able to install the update with the following command:
      `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-86b08d5138`
      You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-86b08d5138

      See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

      — Additional comment from Fedora Update System on 2022-06-20 04:38:29 UTC —

      FEDORA-2022-2bae1c6517 has been pushed to the Fedora 36 testing repository.
      Soon you'll be able to install the update with the following command:
      `sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-2bae1c6517`
      You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-2bae1c6517

      See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

      — Additional comment from Fedora Update System on 2022-06-28 01:32:46 UTC —

      FEDORA-2022-2bae1c6517 has been pushed to the Fedora 36 stable repository.
      If problem still persists, please make note of it in this bug report.

      — Additional comment from Fedora Update System on 2022-06-28 01:59:37 UTC —

      FEDORA-2022-86b08d5138 has been pushed to the Fedora 35 stable repository.
      If problem still persists, please make note of it in this bug report.

              Unassigned Unassigned
              jira-bugzilla-migration RH Bugzilla Integration
              rhos-dfg-networking-squad-neutron