Red Hat OpenStack Services on OpenShift / OSPRH-3505

Setting [compute]cpu_dedicated_set causes nova-compute to fail to start due to power management


    • Bug
    • Resolution: Done
    • rhos-18.0 Dev Preview 3
    • 2024Q1
    • Important
    • Compute

      2024-01-11 20:30:37.243 2 ERROR oslo_service.service [None req-2ee165a9-3ed1-474c-afe3-586a72fc78a5 - - - - - -] Error starting thread.: nova.exception.FileNotFound: File /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor could not be found.
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service Traceback (most recent call last):
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/filesystem.py", line 37, in read_sys
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service     with open(os.path.join(SYS, path), mode='r') as data:
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service FileNotFoundError: [Errno 2] No such file or directory: '/sys/devices/system/cpu/cpu8/cpufreq/scaling_governor'
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service 
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service The above exception was the direct cause of the following exception:
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service 
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service Traceback (most recent call last):
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/oslo_service/service.py", line 806, in run_service
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service     service.start()
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/service.py", line 162, in start
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service     self.manager.init_host(self.service_ref)
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 1608, in init_host
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service     self.driver.init_host(host=self.host)
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 825, in init_host
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service     libvirt_cpu.validate_all_dedicated_cpus()
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/cpu/api.py", line 141, in validate_all_dedicated_cpus
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service     governors.add(pcpu.governor)
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/cpu/api.py", line 63, in governor
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service     return core.get_governor(self.ident)
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/cpu/core.py", line 69, in get_governor
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service     return filesystem.read_sys(
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service   File "/usr/lib/python3.9/site-packages/nova/filesystem.py", line 40, in read_sys
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service     raise exception.FileNotFound(file_path=path) from exc
      2024-01-11 20:30:37.243 2 ERROR oslo_service.service nova.exception.FileNotFound: File /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor could not be found.
      

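      We broke CPU pinning in dev-preview3 by enabling nova's power management feature in https://github.com/openstack-k8s-operators/nova-operator/pull/597. As the traceback above shows, with power management enabled nova-compute validates the dedicated CPUs during init_host and reads their scaling_governor files, so startup fails when that file is missing (here for cpu8: /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor).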

      https://github.com/openstack-k8s-operators/nova-operator/pull/597 has since been reverted on main by https://github.com/openstack-k8s-operators/nova-operator/pull/650, so CPU pinning works on main again and power management is disabled there.

      To actually solve the power management issue, we need to land the upstream nova fixes on stable/2023.1 (proposed in https://review.opendev.org/q/topic:%22power-mgmt-fixups%22) and then re-enable power management in nova-operator.
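
      To check whether a compute host would hit the failure in the traceback above, something like the following sketch may help. It is not nova code: the helper names parse_cpu_set and cpus_missing_governor are hypothetical, the CPU list parser is simplified (it does not handle the "^" exclusion syntax), and the only detail taken from this report is the sysfs path that nova failed to open.

```python
import os


def parse_cpu_set(spec):
    """Parse a CPU list such as "4-7,10" into a sorted list of ints.

    Simplified assumption: only single IDs and inclusive ranges are
    supported; "^" exclusions are not handled.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpus_missing_governor(spec, sys_root="/sys"):
    """Return the CPUs in `spec` that have no cpufreq/scaling_governor file.

    `sys_root` is a parameter only so the check can be pointed at a
    fake tree for testing; on a real host the default is used.
    """
    missing = []
    for cpu in parse_cpu_set(spec):
        path = os.path.join(
            sys_root,
            "devices/system/cpu",
            f"cpu{cpu}",
            "cpufreq/scaling_governor",
        )
        if not os.path.isfile(path):
            missing.append(cpu)
    return missing
```

      Running cpus_missing_governor() on the compute host with the value configured for [compute]cpu_dedicated_set lists the CPUs for which the governor file is absent. A non-empty result suggests the host does not expose cpufreq for those CPUs, which is the condition under which nova-compute failed to start here.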

            rh-ee-bgibizer Balazs Gibizer
            rhos-dfg-compute