Red Hat OpenStack Services on OpenShift / OSPRH-20623

Errors appear in decision-engine logs when storage collector and notifications are enabled on cinder operations


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • Fix Version: rhos-18.0.10 FR 3
    • Component: openstack-watcher
    • Severity: Moderate

      In an OpenStack cluster where:

    • watcher and cinder are enabled
    • notifications are enabled
    • the storage collector plugin is enabled

      When there is no running audit that requires cinder information, the following errors show up when a volume is deleted (similar errors may happen on other cinder-related events):

      Event:

      2025-08-14 07:20:03.485 1 INFO watcher.decision_engine.model.notification.cinder [req-c8f6d188-6534-4070-884a-da20b3f87d9e b22ef503-0376-47b4-9f24-dc151ff82fbe 01457999b9b64fdea10e9bbf3f2d6df7 f94c9861051c48c8842da5be152b6bde - - - volume.delete.end] Event 'volume.delete.end' received from volume.cinder-volume-volume1-0@lvm with metadata {'message_id': 'b22ef503-0376-47b4-9f24-dc151ff82fbe', 'timestamp': '2025-08-14 07:20:03.479700'}
      2025-08-14 07:20:03.485 1 DEBUG watcher.decision_engine.model.notification.cinder [req-c8f6d188-6534-4070-884a-da20b3f87d9e b22ef503-0376-47b4-9f24-dc151ff82fbe 01457999b9b64fdea10e9bbf3f2d6df7 f94c9861051c48c8842da5be152b6bde - - - volume.delete.end] {'tenant_id': 'f94c9861051c48c8842da5be152b6bde', 'host': 'cinder-volume-volume1-0@lvm#cinder1', 'user_id': '01457999b9b64fdea10e9bbf3f2d6df7', 'availability_zone': 'nova', 'volume_id': 'bb31e21a-8b6d-466b-9f9c-b4e610fa12dd', 'volume_type': 'eb8e4429-7acd-49f2-96a7-008319ba84e5', 'display_name': 'test3', 'launched_at': '2025-08-14T07:13:16+00:00', 'created_at': '2025-08-14T07:13:15+00:00', 'status': 'deleted', 'snapshot_id': None, 'size': 1, 'replication_status': None, 'replication_extended_status': None, 'replication_driver_data': None, 'metadata': [], 'volume_attachment': []} info /usr/lib/python3.9/site-packages/watcher/decision_engine/model/notification/cinder.py:374

      Error:

      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher [req-c8f6d188-6534-4070-884a-da20b3f87d9e b22ef503-0376-47b4-9f24-dc151ff82fbe 01457999b9b64fdea10e9bbf3f2d6df7 f94c9861051c48c8842da5be152b6bde - - - volume.delete.end] Callback raised an exception.: AttributeError: 'NoneType' object has no attribute 'get_pool_by_pool_name'
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher Traceback (most recent call last):
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher File "/usr/lib/python3.9/site-packages/oslo_messaging/notify/dispatcher.py", line 78, in _exec_callback
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher return callback(message["ctxt"],
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher File "/usr/lib/python3.9/site-packages/watcher/decision_engine/model/notification/cinder.py", line 377, in info
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher volume = self.get_or_create_volume(volume_id, poolname)
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher File "/usr/lib/python3.9/site-packages/watcher/decision_engine/model/notification/cinder.py", line 139, in get_or_create_volume
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher self.get_or_create_pool(pool_name)
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher File "/usr/lib/python3.9/site-packages/watcher/decision_engine/model/notification/cinder.py", line 127, in get_or_create_pool
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher return self.cluster_data_model.get_pool_by_pool_name(name)
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher AttributeError: 'NoneType' object has no attribute 'get_pool_by_pool_name'
      2025-08-14 07:20:03.486 1 ERROR oslo_messaging.notify.dispatcher
      2025-08-14 07:20:03.486 1 DEBUG oslo_messaging._drivers.amqpdriver [-] AMQPListener connection timeout poll /usr/lib/python3.9/site-packages/oslo_messaging/_drivers/amqpdriver.py:431

      The root cause is that the storage cluster data model is empty (None), because the cinder collector deliberately skips building the model when there are no running audits that require it. See these messages:

      2025-08-14 07:20:03.485 1 DEBUG watcher.decision_engine.model.collector.cinder [req-c8f6d188-6534-4070-884a-da20b3f87d9e b22ef503-0376-47b4-9f24-dc151ff82fbe 01457999b9b64fdea10e9bbf3f2d6df7 f94c9861051c48c8842da5be152b6bde - - - volume.delete.end] Building latest Cinder cluster data model execute /usr/lib/python3.9/site-packages/watcher/decision_engine/model/collector/cinder.py:148
      2025-08-14 07:20:03.485 1 DEBUG watcher.decision_engine.model.collector.cinder [req-c8f6d188-6534-4070-884a-da20b3f87d9e b22ef503-0376-47b4-9f24-dc151ff82fbe 01457999b9b64fdea10e9bbf3f2d6df7 f94c9861051c48c8842da5be152b6bde - - - volume.delete.end] No audit, Don't Build storage data model execute /usr/lib/python3.9/site-packages/watcher/decision_engine/model/collector/cinder.py:151
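      To make the failure path easier to follow, here is a heavily simplified sketch of the interaction (class and method names are illustrative, not the actual watcher code): the collector declines to build the storage model when no running audit needs it, and the notification endpoint then dereferences the missing model.

      # Simplified illustration of the failure path described above; names
      # are illustrative, not the real watcher classes.

      class FakeStorageCollector:
          def __init__(self):
              self.cluster_data_model = None

          def execute(self, audits_running=False):
              if not audits_running:
                  # mirrors "No audit, Don't Build storage data model"
                  return None
              # ... build and cache the real storage model here ...

      class FakeStorageNotificationEndpoint:
          def __init__(self, collector):
              self.collector = collector

          @property
          def cluster_data_model(self):
              # still None because the collector never built the model
              return self.collector.cluster_data_model

          def get_or_create_pool(self, name):
              # this is the dereference that fails in the traceback above
              return self.cluster_data_model.get_pool_by_pool_name(name)

      endpoint = FakeStorageNotificationEndpoint(FakeStorageCollector())
      try:
          endpoint.get_or_create_pool("cinder-volume-volume1-0@lvm#cinder1")
      except AttributeError as exc:
          print(exc)  # 'NoneType' object has no attribute 'get_pool_by_pool_name'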

      I'd suggest taking advantage of this behavior and skipping the processing of these notifications in that situation, which would avoid both the errors and unneeded processing.
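      A minimal sketch of one possible guard along those lines, assuming the notification endpoints expose the collector's model as self.cluster_data_model (as the traceback from watcher/decision_engine/model/notification/cinder.py suggests); this is only an illustration, not the actual fix:

      import logging

      LOG = logging.getLogger(__name__)

      class VolumeNotificationEndpoint:
          """Illustrative stand-in for the cinder notification endpoints."""

          def __init__(self, collector):
              self.collector = collector

          @property
          def cluster_data_model(self):
              return self.collector.cluster_data_model

          def info(self, ctxt, publisher_id, event_type, payload, metadata):
              if self.cluster_data_model is None:
                  # The collector skipped building the storage model because
                  # no running audit requires it; skip the notification as
                  # well instead of raising AttributeError later on.
                  LOG.debug("Storage data model is not built, ignoring "
                            "event %s", event_type)
                  return
              # ... the existing handling (get_or_create_volume, etc.) would
              # run here once the model exists ...

      If this approach is taken, the guard would probably belong in the shared base class of the cinder notification endpoints, so every cinder event handler benefits from it.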

      Note that, when a strategy that requires the storage model is actually executed, the model is initialized with the current state. From that moment on, the model is no longer empty and the messages are processed correctly (the collector also keeps updating the model periodically, based on its period parameter).

              Assignee: Alfredo Moralejo Alonso (amoralej1@redhat.com)
              Reporter: Alfredo Moralejo Alonso (amoralej1@redhat.com)
              Team: rhos-workloads-evolution