OpenShift Virtualization / CNV-10157

[1926986] nmstate interprets interface names as float64 and subsequently crashes on state update


    • No

      Description of problem:

      This happens on OCP 4.5.16 with CNV 2.4.2 – the customer has a support agreement / exception for that specific version.

      See below for reproducer steps.

      Version-Release number of selected component (if applicable):

      How reproducible:

      Steps to Reproduce:
      1.
      2.
      3.

      Actual results:

      Expected results:

      Additional info:

      reproducer:

      =============================================

      [akaris@linux cnv]$ oc get clusterversion
      NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
      version 4.5.16 True False 12m Cluster version is 4.5.16

      [akaris@linux cnv]$ cat cnv.yaml
      apiVersion: v1
      kind: Namespace
      metadata:
        name: openshift-cnv
      ---
      apiVersion: operators.coreos.com/v1
      kind: OperatorGroup
      metadata:
        name: kubevirt-hyperconverged-group
        namespace: openshift-cnv
      spec:
        targetNamespaces:
          - openshift-cnv
      ---
      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        name: hco-operatorhub
        namespace: openshift-cnv
      spec:
        source: redhat-operators
        sourceNamespace: openshift-marketplace
        name: kubevirt-hyperconverged
        startingCSV: kubevirt-hyperconverged-operator.v2.4.2
        channel: "2.4"
        installPlanApproval: Manual

      [akaris@linux cnv]$ oc apply -f cnv.yaml

      Edit and approve the 2.4.2 installplan:

      [akaris@linux cnv]$ oc get installplan
      NAME CSV APPROVAL APPROVED
      install-8nl6r kubevirt-hyperconverged-operator.v2.4.2 Manual true
      install-zbqpc kubevirt-hyperconverged-operator.v2.4.3 Manual false

      [akaris@linux cnv]$ cat hyper.yaml
      apiVersion: hco.kubevirt.io/v1alpha1
      kind: HyperConverged
      metadata:
        name: kubevirt-hyperconverged
        namespace: openshift-cnv
      spec:
        BareMetalPlatform: false

      oc apply -f hyper.yaml

      =============================================

      Once that's done:

      I can reproduce the nmstate pod issue.

      The trick is to use an interface name that can be parsed as a float64 but that is not automatically quoted when the state is serialized for Kubernetes (at least, that is my interpretation).
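
A minimal Go sketch of the suspected mechanism, not taken from the kubernetes-nmstate source: when a document is decoded into a generic `map[string]interface{}`, an unquoted numeric scalar becomes a `float64`, so a later unchecked `.(string)` assertion panics. The sketch assumes the reported state ends up as JSON with the name emitted as a bare number; `decodeName` is a hypothetical helper used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeName unmarshals a JSON document into a generic map and returns the
// raw value stored under "name", as a schema-less decoder would see it.
// Hypothetical helper, not part of kubernetes-nmstate.
func decodeName(doc string) (interface{}, error) {
	var entry map[string]interface{}
	if err := json.Unmarshal([]byte(doc), &entry); err != nil {
		return nil, err
	}
	return entry["name"], nil
}

func main() {
	docs := []string{
		`{"name": "9999999999"}`, // quoted: decoded as a string
		`{"name": 60e+02}`,       // unquoted: decoded as a JSON number
	}
	for _, doc := range docs {
		v, err := decodeName(doc)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		// The comma-ok form reports the type mismatch instead of panicking.
		_, isString := v.(string)
		fmt.Printf("%-24s -> %T (string: %v)\n", doc, v, isString)
	}
}
```

With the single-value form `v.(string)` instead of the comma-ok form, the second document would raise the same kind of "interface conversion" panic that appears in the handler log.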

      Preparation - install iproute inside a toolbox and create virtual interfaces on a worker node:
      ~~~
      [akaris@linux cnv]$ oc debug node/ip-10-0-154-20.eu-west-1.compute.internal
      Starting pod/ip-10-0-154-20eu-west-1computeinternal-debug ...
      To use host binaries, run `chroot /host`
      chroot /host
      toolbox
      Pod IP: 10.0.154.20
      If you don't see a command prompt, try pressing enter.
      chroot /host
      sh-4.4# toolbox
      Trying to pull registry.redhat.io/rhel8/support-tools...
      Getting image source signatures
      Copying blob cca21acb641a done
      Copying blob 5ee83610639d done
      Copying blob d9e72d058dc5 done
      Copying config be1f7079a9 done
      Writing manifest to image destination
      Storing signatures
      be1f7079a938a4ab5c1f8b4c7d2dc82b8c60598bb1e248438ced576829f96389
      Spawning a container 'toolbox-' with image 'registry.redhat.io/rhel8/support-tools'
      Detected RUN label in the container image. Using that as the default...
      command: podman run -it --name toolbox --privileged --ipc=host --net=host --pid=host -e HOST=/host -e NAME=toolbox -e IMAGE=registry.redhat.io/rhel8/support-tools:latest -v /run:/run -v /var/log:/var/log -v /etc/machine-id:/etc/machine-id -v /etc/localtime:/etc/localtime -v /:/host registry.redhat.io/rhel8/support-tools:latest
      [root@ip-10-0-154-20 /]#
      [root@ip-10-0-154-20 /]#
      [root@ip-10-0-154-20 /]#
      [root@ip-10-0-154-20 /]#
      [root@ip-10-0-154-20 /]# ip link
      bash: ip: command not found
      [root@ip-10-0-154-20 /]# yum install iproute -y
      Updating Subscription Management repositories.
      Unable to read consumer identity
      Subscription Manager is operating in container mode.

      This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.

      Red Hat Universal Base Image 8 (RPMs) - BaseOS 929 kB/s | 772 kB 00:00
      Red Hat Universal Base Image 8 (RPMs) - AppStream 16 MB/s | 4.9 MB 00:00
      Red Hat Universal Base Image 8 (RPMs) - CodeReady Builder 87 kB/s | 13 kB 00:00
      Dependencies resolved.
      ===============================================================================================================================================================================================================================================
      Package Architecture Version Repository Size
      ===============================================================================================================================================================================================================================================
      Installing:
      iproute x86_64 5.3.0-5.el8 ubi-8-baseos 665 k
      Installing dependencies:
      libmnl x86_64 1.0.4-6.el8 ubi-8-baseos 30 k

      Transaction Summary
      ===============================================================================================================================================================================================================================================
      Install 2 Packages

      Total download size: 696 k
      Installed size: 1.9 M
      Downloading Packages:
      (1/2): libmnl-1.0.4-6.el8.x86_64.rpm 364 kB/s | 30 kB 00:00
      (2/2): iproute-5.3.0-5.el8.x86_64.rpm 5.3 MB/s | 665 kB 00:00
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      Total 5.4 MB/s | 696 kB 00:00
      Running transaction check
      Transaction check succeeded.
      Running transaction test
      Transaction test succeeded.
      Running transaction
      Preparing : 1/1
      Installing : libmnl-1.0.4-6.el8.x86_64 1/2
      Running scriptlet: libmnl-1.0.4-6.el8.x86_64 1/2
      Installing : iproute-5.3.0-5.el8.x86_64 2/2
      Running scriptlet: iproute-5.3.0-5.el8.x86_64 2/2
      Verifying : libmnl-1.0.4-6.el8.x86_64 1/2
      Verifying : iproute-5.3.0-5.el8.x86_64 2/2
      Installed products updated.

      Installed:
      iproute-5.3.0-5.el8.x86_64 libmnl-1.0.4-6.el8.x86_64

      Complete!
      ~~~

      For example, names such as 9999999999 or 1111111111111.1 will be quoted correctly and will not reproduce the issue:
      ~~~
      [root@ip-10-0-154-20 /]# ip link add 9999999999 type dummy
      [root@ip-10-0-154-20 /]# ip link ls dev 9999999999
      62: 9999999999: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/ether d6:c5:3a:a6:ca:c7 brd ff:ff:ff:ff:ff:ff
      [root@ip-10-0-154-20 /]# ip link ls dev 187e15e9860b329
      41: 187e15e9860b329@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8901 qdisc noqueue master ovs-system state UP mode DEFAULT group default
      link/ether 32:ae:0d:64:5f:48 brd ff:ff:ff:ff:ff:ff link-netns 06e2e636-db29-4869-9b16-a0820be3a1a4
      [root@ip-10-0-154-20 /]# ip link add 1111111111111.1 type veth peer 2222222222222.2
      ~~~

      Delete the nmstate-handler pod so that the node's NodeNetworkState (nns) object is repopulated:
      ~~~
      oc delete pod nmstate-handler-xs692
      ~~~

      These will be correctly quoted and the nns will be created - the pod will not crash:
      ~~~
      [akaris@linux cnv]$ oc get nns ip-10-0-154-20.eu-west-1.compute.internal -o yaml | grep name
      f:name: {}
      name: ip-10-0-154-20.eu-west-1.compute.internal
      name: ip-10-0-154-20.eu-west-1.compute.internal
      name: "1111111111111.1"
      name: 187e15e9860b329
      name: 193c943317dbfc9
      name: "2222222222222.2"
      name: 269419b1e64320a
      name: 27b6490791282c7
      name: 2914b714aa5ebbd
      name: 32178eafa95aeea
      name: 489ffd44abee7ea
      name: 50c9e0168038261
      name: 5bd487dc7312dc3
      name: 8db85e8da9bd00c
      name: 9647a3f85ae9cc3
      name: "9999999999"
      name: 9e09e0053f0c541
      name: br-int
      name: br-local
      name: c9eadb630efa062
      name: cfe3b6bd93f3fe3
      name: d1f43b7c0f0f61d
      name: de72896b9df785d
      name: e642867142e7d39
      name: ens3
      name: f6c91e659cddfe8
      name: genev_sys_6081
      name: lo
      name: ovn-k8s-gw0
      name: ovn-k8s-mp0
      ~~~

      However, 60e+02 is a valid float64 literal; it is parsed as a number rather than a string and reproduces the issue:
      ~~~
      [root@ip-10-0-154-20 /]# ip link add 60e+02 type dummy
      [root@ip-10-0-154-20 /]# ip link ls | grep 60
      5: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN mode DEFAULT group default qlen 1000
      link/ether 76:4c:b7:c2:54:62 brd ff:ff:ff:ff:ff:ff link-netns 686a7daa-d336-4003-b8c2-848ec063760d
      41: 187e15e9860b329@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8901 qdisc noqueue master ovs-system state UP mode DEFAULT group default
      60: 5bd487dc7312dc3@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8901 qdisc noqueue master ovs-system state UP mode DEFAULT group default
      65: 60e+02: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      [root@ip-10-0-154-20 /]#
      ~~~
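
As a rough aid for picking test names, the following Go sketch checks which of the interface names used in this report Go's `strconv.ParseFloat` accepts. This is only a coarse pre-check under the assumption that the consuming parser resolves scalars in a similar way: it also flags 9999999999 and 1111111111111.1, which the nns output above shows are quoted and harmless, so it cannot tell which names the serializer leaves unquoted. `parsesAsFloat` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsesAsFloat reports whether strconv.ParseFloat accepts name as a float64.
// Hypothetical helper for illustration only.
func parsesAsFloat(name string) bool {
	_, err := strconv.ParseFloat(name, 64)
	return err == nil
}

func main() {
	names := []string{
		"ens3",
		"187e15e9860b329", // contains 'b', so it is not a number
		"9999999999",
		"1111111111111.1",
		"60e+02",
	}
	for _, name := range names {
		fmt.Printf("%-18s parses as float64: %v\n", name, parsesAsFloat(name))
	}
}
```

Names that fail this check, such as ens3 or the hex-like veth names, cannot be mistaken for numbers by a parser that behaves this way; names that pass it are only candidates and still need to be tested against the actual handler.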

      ~~~
      [akaris@linux cnv]$ oc get pods -o wide | grep ip-10-0-154-20 | grep nmstate
      nmstate-handler-xs692 1/1 Running 1 2m16s 10.0.154.20 ip-10-0-154-20.eu-west-1.compute.internal <none> <none>
      [akaris@linux cnv]$ oc delete pod nmstate-handler-xs692
      pod "nmstate-handler-xs692" deleted

      [akaris@linux cnv]$
      [akaris@linux cnv]$ oc get pods -o wide | grep ip-10-0-154-20 | grep nmstate
      nmstate-handler-gm6bw 1/1 Running 0 3s 10.0.154.20 ip-10-0-154-20.eu-west-1.compute.internal <none> <none>
      [akaris@linux cnv]$ oc get pods -o wide | grep ip-10-0-154-20 | grep nmstate
      nmstate-handler-gm6bw 0/1 Error 2 31s 10.0.154.20 ip-10-0-154-20.eu-west-1.compute.internal <none> <none>
      [akaris@linux cnv]$ oc logs nmstate-handler-gm6bw

      {"level":"info","ts":1612896873.3406088,"logger":"cmd","msg":"Operator Version: 0.21.0"}
      {"level":"info","ts":1612896873.345091,"logger":"cmd","msg":"Go Version: go1.13.15"}
      {"level":"info","ts":1612896873.3479779,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
      {"level":"info","ts":1612896873.3480678,"logger":"cmd","msg":"Version of operator-sdk: v0.15.1"}
      {"level":"info","ts":1612896873.3481107,"logger":"cmd","msg":"Try to take exclusive lock on file: /var/k8s_nmstate/handler_lock"}
      {"level":"info","ts":1612896873.3485055,"logger":"cmd","msg":"Successfully took nmstate exclusive lock"}
      {"level":"info","ts":1612896876.0043983,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
      {"level":"info","ts":1612896876.004686,"logger":"cmd","msg":"Registering Components."}
      {"level":"info","ts":1612896876.0049677,"logger":"cmd","msg":"Starting the Cmd."}
      {"level":"info","ts":1612896876.0053968,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nodenetworkconfigurationpolicy-controller","source":"kind source: /, Kind="}
      {"level":"info","ts":1612896876.0057232,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"nodenetworkconfigurationpolicy-controller"}
      {"level":"info","ts":1612896876.0070198,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"node-controller","source":"kind source: /, Kind="}
      {"level":"info","ts":1612896876.007319,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"node-controller"}
      {"level":"info","ts":1612896876.0058627,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
      {"level":"info","ts":1612896876.1059654,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"nodenetworkconfigurationpolicy-controller","worker count":1}
      {"level":"info","ts":1612896876.1091166,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"node-controller","worker count":1}

      E0209 18:54:36.725557 1 runtime.go:78] Observed a panic: &runtime.TypeAssertionError{_interface:(*runtime._type)(0x14ca120), concrete:(*runtime._type)(0x1460040), asserted:(*runtime._type)(0x1483a00), missingMethod:""} (interface conversion: interface {} is float64, not string)
      goroutine 331 [running]:
      k8s.io/apimachinery/pkg/util/runtime.logPanic(0x15050c0, 0xc0006e8300)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa3
      k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x82
      panic(0x15050c0, 0xc0006e8300)
      /usr/lib/golang/src/runtime/panic.go:679 +0x1b2
      github.com/nmstate/kubernetes-nmstate/pkg/helper.filterOut(0xc0007a0800, 0x154b, 0x1800, 0x7f753b2d21d0, 0xc000445900, 0x1800, 0x0, 0x18ee540, 0xc00003e0a8, 0x0)
      /go/src/github.com/nmstate/kubernetes-nmstate/pkg/helper/client.go:211 +0x51a
      github.com/nmstate/kubernetes-nmstate/pkg/helper.UpdateCurrentState(0x1907400, 0xc000295f20, 0xc0004211e0, 0x0, 0x0)
      /go/src/github.com/nmstate/kubernetes-nmstate/pkg/helper/client.go:113 +0xc1
      github.com/nmstate/kubernetes-nmstate/pkg/helper.CreateOrUpdateNodeNetworkState(0x1907400, 0xc000295f20, 0xc0002bf800, 0x0, 0x0, 0xc000490420, 0x29, 0x18c2540, 0xc0002bf800)
      /go/src/github.com/nmstate/kubernetes-nmstate/pkg/helper/client.go:103 +0x1d1
      github.com/nmstate/kubernetes-nmstate/pkg/controller/node.(*ReconcileNode).Reconcile(0xc0002b51c0, 0x0, 0x0, 0xc000490420, 0x29, 0xc00074fcd8, 0xc000440240, 0xc0004401b8, 0x18c6560)
      /go/src/github.com/nmstate/kubernetes-nmstate/pkg/controller/node/node_controller.go:110 +0x322
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00076ed80, 0x154b560, 0xc0006c0700, 0x0)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256 +0x162
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00076ed80, 0x1169000)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232 +0xcb
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc00076ed80)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211 +0x2b
      k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0006b6620)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x5e
      k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0006b6620, 0x3b9aca00, 0x0, 0xc0001af701, 0xc000604660)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
      k8s.io/apimachinery/pkg/util/wait.Until(0xc0006b6620, 0x3b9aca00, 0xc000604660)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
      created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:193 +0x328
      panic: interface conversion: interface {} is float64, not string [recovered]
      panic: interface conversion: interface {} is float64, not string

      goroutine 331 [running]:
      k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x105
      panic(0x15050c0, 0xc0006e8300)
      /usr/lib/golang/src/runtime/panic.go:679 +0x1b2
      github.com/nmstate/kubernetes-nmstate/pkg/helper.filterOut(0xc0007a0800, 0x154b, 0x1800, 0x7f753b2d21d0, 0xc000445900, 0x1800, 0x0, 0x18ee540, 0xc00003e0a8, 0x0)
      /go/src/github.com/nmstate/kubernetes-nmstate/pkg/helper/client.go:211 +0x51a
      github.com/nmstate/kubernetes-nmstate/pkg/helper.UpdateCurrentState(0x1907400, 0xc000295f20, 0xc0004211e0, 0x0, 0x0)
      /go/src/github.com/nmstate/kubernetes-nmstate/pkg/helper/client.go:113 +0xc1
      github.com/nmstate/kubernetes-nmstate/pkg/helper.CreateOrUpdateNodeNetworkState(0x1907400, 0xc000295f20, 0xc0002bf800, 0x0, 0x0, 0xc000490420, 0x29, 0x18c2540, 0xc0002bf800)
      /go/src/github.com/nmstate/kubernetes-nmstate/pkg/helper/client.go:103 +0x1d1
      github.com/nmstate/kubernetes-nmstate/pkg/controller/node.(*ReconcileNode).Reconcile(0xc0002b51c0, 0x0, 0x0, 0xc000490420, 0x29, 0xc00074fcd8, 0xc000440240, 0xc0004401b8, 0x18c6560)
      /go/src/github.com/nmstate/kubernetes-nmstate/pkg/controller/node/node_controller.go:110 +0x322
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00076ed80, 0x154b560, 0xc0006c0700, 0x0)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256 +0x162
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00076ed80, 0x1169000)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232 +0xcb
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc00076ed80)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211 +0x2b
      k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0006b6620)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x5e
      k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0006b6620, 0x3b9aca00, 0x0, 0xc0001af701, 0xc000604660)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
      k8s.io/apimachinery/pkg/util/wait.Until(0xc0006b6620, 0x3b9aca00, 0xc000604660)
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
      created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
      /go/src/github.com/nmstate/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:193 +0x328
      [akaris@linux cnv]$
      ~~~
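
The trace points at an unchecked type assertion in `filterOut`. The following Go sketch shows one defensive pattern that would avoid the crash; it is an illustration under assumptions, not the actual kubernetes-nmstate code or its fix. `interfaceName` and the shape of the entries are hypothetical.

```go
package main

import "fmt"

// interfaceName returns the "name" field of a decoded interface entry, or an
// error when the field is missing or is not a string.
// Hypothetical helper, not the real filterOut implementation.
func interfaceName(entry map[string]interface{}) (string, error) {
	raw, present := entry["name"]
	if !present {
		return "", fmt.Errorf("interface entry has no name field")
	}
	// Comma-ok assertion: a float64 yields an error instead of a panic.
	name, ok := raw.(string)
	if !ok {
		return "", fmt.Errorf("interface name has type %T (value %v), expected string", raw, raw)
	}
	return name, nil
}

func main() {
	entries := []map[string]interface{}{
		{"name": "ens3"},
		{"name": float64(6000)}, // what an unquoted 60e+02 decodes to
	}
	for _, entry := range entries {
		name, err := interfaceName(entry)
		if err != nil {
			fmt.Println("skipping entry:", err)
			continue
		}
		fmt.Println("interface:", name)
	}
}
```

A check like this only keeps the handler from crashing. Once the name has been decoded as a number, the original spelling is gone (60e+02 becomes 6000), so the name would still have to be quoted or decoded as a string at the source for the interface to be reported correctly.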

              phoracek@redhat.com Petr Horacek
              akaris@redhat.com Andreas Karis
              RH Bugzilla Integration RH Bugzilla Integration