OpenShift Bugs / OCPBUGS-14394

Operator doesn't recreate "supported-nic-ids" when deleted, causing config daemon to fail


    • Bug
    • Resolution: Won't Do
    • 4.12.z
    • Networking / SR-IOV
    • Quality / Stability / Reliability
    • Important

      Description of problem:

      
      When the ConfigMap "supported-nic-ids" is deleted (for example, by accident), the operator does not recreate it, and the ConfigMap must be recreated manually.
      
      This disrupts the SR-IOV config daemon, which requires the ConfigMap and shuts down without it:
      
      ```
      2023-05-25T18:51:56.333250547Z E0525 18:51:56.333228 1616718 start.go:176] failed to run daemon: configmaps "supported-nic-ids" not found
      2023-05-25T18:51:56.333250547Z I0525 18:51:56.333243 1616718 start.go:178] Shutting down SriovNetworkConfigDaemon
      ```
      
      

      Version-Release number of selected component (if applicable):

      OpenShift 4.12.4
      
      

      How reproducible:

      - delete the "supported-nic-ids" ConfigMap and check whether the operator recreates it
      - delete the config daemon pods and check whether they start again
      

      

      Actual results:

      - the ConfigMap stays missing; the operator does not recreate it
      

      Expected results:

      - the operator should recreate the ConfigMap
      

      Additional info:

      - FYI: I tried to reproduce this in a 4.12.13 lab. There the config daemon starts normally and only logs that the ConfigMap is missing:
      ~~~
      E0531 20:44:30.340440 3316015 start.go:170] failed to run init NicIdMap: configmaps "supported-nic-ids" not found
      I0531 20:44:30.340450 3316015 writer.go:47] RunOnce()
      I0531 20:44:30.340453 3316015 writer.go:71] RunOnce(): first poll for nic status
      ~~~
      - after recreating the ConfigMap, the customer confirmed that the daemon starts normally
      

              Balazs Nemeth (bnemeth@redhat.com)
              Vladislav Walek (rhn-support-vwalek)
              Zhanqi Zhao