  1. Red Hat OpenStack Services on OpenShift
  2. OSPRH-21970

collectd container with the "ipmi" device attached hits an ipmi plugin failure after the major number of "/dev/ipmi0" changes on the host


    • Refinement
    • 1
    • Moderate

      To Reproduce

      Steps to reproduce the behavior:

      1. A compute node is running the "collectd" container with the "/dev/ipmi0" device attached via "--device" [1]
      2. The major number of "/dev/ipmi0" on the compute node host changes for any reason [2]
      3. The "collectd" container can no longer access "/dev/ipmi0", so the IPMI plugin stops working [3]

       

      [1]

      $ grep -A 1 -e "--device" sos_commands/podman/containers/podman_inspect_9f3f11fa346f
                          "--device",
                          "/dev/ipmi0",

       

      [2]

      $ grep ipmi sos_commands/block/ls_-lanR_.dev | tail -n1
      lrwxrwxrwx.  1 0 0    8 Sep  6 22:13 238:0 -> ../ipmi0

       

      [3]

      $ head -n1 sos_commands/podman/containers/podman_inspect_9f3f11fa346f 
      time="2025-09-29T12:38:46+05:30" level=warning msg="Could not locate device 236:0 on host" 
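The mismatch between [2] and [3] can be checked mechanically. The following is a minimal sketch, assuming the two sosreport excerpts above are available as strings; the function names `stale_device_from_warning` and `current_device_from_listing` are hypothetical helpers, not part of podman or sos.

```shell
# Extract "MAJOR:MINOR" from podman's "Could not locate device" warning text ($1).
stale_device_from_warning() {
    printf '%s\n' "$1" |
        sed -n 's/.*Could not locate device \([0-9][0-9]*:[0-9][0-9]*\) on host.*/\1/p'
}

# Extract "MAJOR:MINOR" from a symlink listing line such as
# "... 238:0 -> ../ipmi0". $1 = listing text, $2 = device name (e.g. ipmi0).
current_device_from_listing() {
    printf '%s\n' "$1" |
        awk -v dev="../$2" '$NF == dev && $(NF-1) == "->" { print $(NF-2) }' |
        tail -n1
}

# Sample inputs copied from evidence [3] and [2] in this report.
warning='time="2025-09-29T12:38:46+05:30" level=warning msg="Could not locate device 236:0 on host"'
listing='lrwxrwxrwx.  1 0 0    8 Sep  6 22:13 238:0 -> ../ipmi0'

old=$(stale_device_from_warning "$warning")
new=$(current_device_from_listing "$listing" ipmi0)

if [ "$old" != "$new" ]; then
    echo "device numbers drifted: container expects $old, host has $new"
fi
```

With the excerpts from this report, the sketch reports that the container still expects 236:0 while the host now exposes 238:0.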

       

      Error message:

      [2025-09-29 10:02:04] ipmi plugin: c_ipmi_read: I'm not active, returning false.
      [2025-09-29 10:02:04] read-function of plugin `ipmi/main' failed. Will suspend it for 86400.000 seconds.
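A suspension of 86400 seconds means the plugin stays silent for 24 hours, so it may be worth scanning the collectd log for this message. The following is a rough sketch, assuming the log is fed on standard input and uses the wording quoted in this report; `suspended_plugins` is a hypothetical helper name.

```shell
# Print "PLUGIN SECONDS" for every suspended read function found on stdin.
# The pattern assumes collectd's message wording as quoted in this report.
suspended_plugins() {
    sed -n "s/.*read-function of plugin \`\([^']*\)' failed\. Will suspend it for \([0-9.]*\) seconds.*/\1 \2/p"
}

# Example with the log line from this report:
printf '%s\n' "[2025-09-29 10:02:04] read-function of plugin \`ipmi/main' failed. Will suspend it for 86400.000 seconds." |
    suspended_plugins
```

For the line from this report, the helper prints `ipmi/main 86400.000`.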

       

      Expected behavior

      • The "collectd" container should keep access to the host's "/dev/ipmi0" device even after its major number changes

       

      Bug impact

      • The IPMI plugin within collectd containers stops working, impacting production

       

      Known workaround

      • Recreating the "collectd" container reattaches the "/dev/ipmi0" device

       

      Additional context

      • The major:minor number of "/dev/ipmi0" is assigned dynamically and can change
      • Any running container that had "/dev/ipmi0" bind-mounted keeps referencing the old inode, which no longer exists after the change. The container therefore does not see the new "/dev/ipmi0" until it is recreated (not just restarted)
      • Creating the "collectd" container with "/dev" as a volume mount plus the "--privileged" flag, instead of "--device /dev/ipmi0", prevents the issue
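The drift described above could be detected on a live compute node before the plugin is suspended. The following is a minimal sketch under stated assumptions: the IPMI character driver is listed as "ipmidev" in /proc/devices, GNU `stat` is available on the host and inside the container, and the container is named "collectd". All helper names are hypothetical.

```shell
# Convert GNU stat's hex "MAJOR:MINOR" output (stat -c '%t:%T') to decimal.
hex_pair_to_dec() {
    printf '%d:%d\n' "0x${1%%:*}" "0x${1##*:}"
}

# Read the major number currently registered for the IPMI character device.
# Expects /proc/devices-style text on stdin; the "ipmidev" name is an assumption.
ipmi_major() {
    awk '$2 == "ipmidev" { print $1; exit }'
}

# Succeeds only if host and container report the same decimal "MAJOR:MINOR".
device_in_sync() {
    [ "$1" = "$2" ]
}

# Example wiring on a live node (container name "collectd" is an assumption):
#   host=$(hex_pair_to_dec "$(stat -c '%t:%T' /dev/ipmi0)")
#   ctr=$(hex_pair_to_dec "$(podman exec collectd stat -c '%t:%T' /dev/ipmi0)")
#   device_in_sync "$host" "$ctr" ||
#       echo "collectd needs to be recreated: host=$host container=$ctr"
#   ipmi_major < /proc/devices    # major number the kernel currently assigns
```

The comparison is kept separate from the data collection so that the same check can be run against sosreport output as well as a live host.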

              Unassigned Unassigned
              rhn-support-apverma Apoorv Verma
              rhos-conplat-observability