OpenShift Bugs / OCPBUGS-44313

SriovNetworkNodeState operand does not report NIC info on "CentOS Stream CoreOS" systems


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • None
    • 4.16.z
    • Networking / SR-IOV

      Description of problem:

      When deploying the SR-IOV Network Operator on an OpenShift- or OKD-based cluster that uses "CentOS Stream CoreOS" as the underlying OS, the sriov-cni init container of the sriov-network-config-daemon pod fails to start, and hence the SriovNetworkNodeState operand does not report NIC info.

      # openshift version
      -> oc get no -owide
      NAME                          STATUS   ROLES                         AGE   VERSION   INTERNAL-IP         EXTERNAL-IP   OS-IMAGE                                    KERNEL-VERSION          CONTAINER-RUNTIME
      zt-sno3.inbound.vz.bos2.lab   Ready    control-plane,master,worker   22h   v1.30.5   2600:52:7:59::300   <none>        CentOS Stream CoreOS 417.9.202410282133-0   5.14.0-522.el9.x86_64   cri-o://1.30.6

      SriovNetworkNodeState operand spec

      -> oc get SriovNetworkNodeState -n openshift-sriov-network-operator zt-sno3.inbound.vz.bos2.lab -oyaml
      apiVersion: sriovnetwork.openshift.io/v1
      kind: SriovNetworkNodeState
      metadata:
        creationTimestamp: "2024-11-06T11:36:54Z"
        generation: 1
        name: zt-sno3.inbound.vz.bos2.lab
        namespace: openshift-sriov-network-operator
        ownerReferences:
        - apiVersion: sriovnetwork.openshift.io/v1
          blockOwnerDeletion: true
          controller: true
          kind: SriovOperatorConfig
          name: default
          uid: a059ca39-9fb2-4e19-9e89-c212578c3d41
        resourceVersion: "21225"
        uid: a995325a-fb70-48de-943f-6762d3f60561
      spec: {} 

      Operator deployment

      -> oc get pod
      NAME                                      READY   STATUS                  RESTARTS         AGE
      network-resources-injector-f5gsp          1/1     Running                 1                22h
      operator-webhook-sm44k                    1/1     Running                 1                22h
      sriov-network-config-daemon-6v57x         0/1     Init:CrashLoopBackOff   14 (4m43s ago)   51m
      sriov-network-operator-7c5b8d967f-p7vbp   1/1     Running                 0                68m 

      Root cause of the problem

      -> oc logs sriov-network-config-daemon-6v57x --previous -c sriov-cni
      + CNI_BIN_DIR=/host/opt/cni/bin
      ++ get_source_folder_for_rhel_version
      ++ '[' '!' -f /host/etc/os-release ']'
      ++ . /host/etc/os-release
      +++ NAME='CentOS Stream CoreOS'
      +++ ID=scos
      +++ ID_LIKE='rhel fedora'
      +++ VERSION=417.9.202410282133-0
      +++ VERSION_ID=4.17
      +++ VARIANT=CoreOS
      +++ VARIANT_ID=coreos
      +++ PLATFORM_ID=platform:el9
      +++ PRETTY_NAME='CentOS Stream CoreOS 417.9.202410282133-0'
      +++ ANSI_COLOR='0;31'
      +++ CPE_NAME=cpe:/o:centos:centos:9::coreos
      +++ HOME_URL=https://centos.org/
      +++ DOCUMENTATION_URL=https://docs.okd.io/latest/welcome/index.html
      +++ BUG_REPORT_URL=https://access.redhat.com/labs/rhir/
      +++ REDHAT_BUGZILLA_PRODUCT='OpenShift Container Platform'
      +++ REDHAT_BUGZILLA_PRODUCT_VERSION=4.17
      +++ REDHAT_SUPPORT_PRODUCT='OpenShift Container Platform'
      +++ REDHAT_SUPPORT_PRODUCT_VERSION=4.17
      +++ OPENSHIFT_VERSION=4.17
      +++ OSTREE_VERSION=417.9.202410282133-0
      ++ rhelmajor=
      ++ case "${ID}" in
      +++ echo ''
      +++ sed -E 's/([0-9]+)\.{1}[0-9]+(\.[0-9]+)?/\1/'
      ++ rhelmajor=
      ++ sourcedir=/usr/bin
      ++ case "${rhelmajor}" in
      ++ log 'ERROR: RHEL Major Version Unsupported, rhelmajor='
      +++ date --iso-8601=seconds
      ++ echo '2024-11-07T09:41:48+00:00 ERROR: RHEL Major Version Unsupported, rhelmajor='
      ++ echo /usr/bin
      + SRIOV_BIN_FILE='2024-11-07T09:41:48+00:00 ERROR: RHEL Major Version Unsupported, rhelmajor=
      /usr/bin/sriov'
      + NO_SLEEP=0
      + '[' --no-sleep '!=' '' ']'
      ++ echo --no-sleep
      ++ awk -F= '{print $1}'
      + PARAM=--no-sleep
      ++ echo --no-sleep
      ++ awk -F= '{print $2}'
      + VALUE=
      + case $PARAM in
      + NO_SLEEP=1
      + shift
      + '[' '' '!=' '' ']'
      + for i in $CNI_BIN_DIR $SRIOV_BIN_FILE
      + '[' '!' -e /host/opt/cni/bin ']'
      + for i in $CNI_BIN_DIR $SRIOV_BIN_FILE
      + '[' '!' -e 2024-11-07T09:41:48+00:00 ']'
      + /bin/echo 'Location 2024-11-07T09:41:48+00:00 does not exist'
      Location 2024-11-07T09:41:48+00:00 does not exist
      + exit 1 
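Reading the trace above: `ID=scos` does not appear to match any branch of the `case "${ID}"` statement, so `rhelmajor` stays empty. The resulting error message is written to stdout, where the command substitution captures it into `SRIOV_BIN_FILE` along with the path. The unquoted `for i in $CNI_BIN_DIR $SRIOV_BIN_FILE` loop then word-splits that value and treats the timestamp as a location. The following is a minimal sketch of one way to avoid both problems, assuming the EL major version can be taken from `PLATFORM_ID` (`platform:el9` in the trace). The function name `el_major_from_os_release` and the `log` helper are hypothetical and are not taken from the operator's actual script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the operator's actual entrypoint script.

# Log to stderr so that messages are never captured by $(...) callers.
log() {
  echo "$(date --iso-8601=seconds) $*" >&2
}

# Print the EL major version (for example "9") derived from PLATFORM_ID in
# the os-release file given as $1. Fail with status 1 if it cannot be derived.
# The body runs in a subshell so sourced variables do not leak to the caller.
el_major_from_os_release() (
  os_release="${1:-/host/etc/os-release}"
  if [ ! -f "${os_release}" ]; then
    log "ERROR: ${os_release} not found"
    exit 1
  fi
  PLATFORM_ID=""
  # shellcheck disable=SC1090
  . "${os_release}"
  case "${PLATFORM_ID}" in
    platform:el[0-9]*)
      major="${PLATFORM_ID#platform:el}"   # "platform:el9" -> "9"
      echo "${major%%[!0-9]*}"             # drop anything after the digits
      ;;
    *)
      log "ERROR: cannot derive EL major version from PLATFORM_ID='${PLATFORM_ID}'"
      exit 1
      ;;
  esac
)
```

A caller could then use `major="$(el_major_from_os_release /host/etc/os-release)" || exit 1` and quote each path it checks (`[ -e "${path}" ]`), so that a failure stops the script at the real error instead of surfacing later as "Location ... does not exist".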

      Version-Release number of selected component (if applicable):

      -> oc get csv sriov-network-operator.v4.16.0-202407100107 
      NAME                                          DISPLAY                   VERSION               REPLACES   PHASE
      sriov-network-operator.v4.16.0-202407100107   SR-IOV Network Operator   4.16.0-202407100107              Succeeded
      

      How reproducible:

      Always.

      Steps to Reproduce:

      Deploy OpenShift or OKD on "CentOS Stream CoreOS" nodes, then deploy the SR-IOV Network Operator.

      Actual results:

      The sriov-network-config-daemon pod fails to start and stays in Init:CrashLoopBackOff.

      -> oc get pod
      NAME                                      READY   STATUS                  RESTARTS         AGE
      network-resources-injector-f5gsp          1/1     Running                 1                22h
      operator-webhook-sm44k                    1/1     Running                 1                22h
      sriov-network-config-daemon-6v57x         0/1     Init:CrashLoopBackOff   14 (2m40s ago)   49m
      sriov-network-operator-7c5b8d967f-p7vbp   1/1     Running                 0                66m
      

      Expected results:

      The sriov-network-config-daemon pod is running and the SriovNetworkNodeState operand reports NIC info.

      Additional info:

      For more information, check out the Slack discussion: https://redhat-internal.slack.com/archives/CQEK2R890/p1730899883227239

              thaller@redhat.com Thomas Haller
              lochoa@redhat.com Leo Ochoa
              Zhanqi Zhao
              Sebastian Scheinkman