Red Hat OpenStack Services on OpenShift
OSPRH-15365

Aodh is unable to query Prometheus properly when a metric with the same name exists in the service project


    • Type: Bug
    • Resolution: Done
    • Priority: Normal
    • rhos-18.0.9
    • rhos-18.0.6
    • openstack-aodh
    • openstack-aodh-evaluator-container-18.0.9-1
    • rhos-conplat-observability
    • Moderate

      To Reproduce

      Steps to reproduce the behavior:

      This reproduces in a similar way with any metric. The following is an example using autoscaling and the ceilometer_cpu metric.

      1. Create an autoscaling stack that scales based on the ceilometer_cpu metric, for example by following the autoscaling documentation: https://docs.redhat.com/en/documentation/red_hat_openstack_services_on_openshift/18.0/html-single/autoscaling_for_instances/index#proc_providing-feedback-on-red-hat-documentation
      2. Create a server inside the "service" project
      3. From this point on, the autoscaling alarms report "insufficient data"

      Expected behavior

      • Autoscaling works regardless of which metrics exist inside the service project

      Bug impact

      • "service" is a special project that shouldn't be used by regular users, so the impact is low

      Known workaround

      • Edit your controlplane CR and add the following under .spec.telemetry.template.autoscaling.aodh.customServiceConfig:
      [DEFAULT]
      prometheus_disable_rbac = True
      • Example of the related part of the controlplane CR
      ...
            template:
              autoscaling:
                aodh:
                  apiTimeout: 60
                  customServiceConfig: |
                    [DEFAULT]
                    prometheus_disable_rbac = True
                  databaseAccount: aodh 
      ...
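The edit above can also be applied non-interactively as a JSON merge patch. A minimal sketch, assuming the control plane CR is named `openstack` (check with `oc get openstackcontrolplane` and adjust); the snippet only builds the patch document, which would then be applied with `oc patch openstackcontrolplane openstack --type merge -p "$PATCH"`:

```python
import json

# The config fragment from the workaround above.
custom_config = "[DEFAULT]\nprometheus_disable_rbac = True\n"

# JSON merge patch mirroring the CR fragment shown above.
patch = {
    "spec": {
        "telemetry": {
            "template": {
                "autoscaling": {
                    "aodh": {"customServiceConfig": custom_config}
                }
            }
        }
    }
}

# Print the patch for use with `oc patch ... --type merge -p`.
print(json.dumps(patch))
```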

      Additional context

      Some commands

      # Create a stack
      $ export OS_PROJECT_NAME=second
      $ openstack stack create -t /tmp/templates/autoscaling.yaml -e /tmp/templates/resources.yaml stack1
      
      # wait a little and then view the alarms - All looks good
      $ openstack alarm list
      +--------------------------------------+------------+------------------------------------+-------+----------+---------+
      | alarm_id                             | type       | name                               | state | severity | enabled |
      +--------------------------------------+------------+------------------------------------+-------+----------+---------+
      | b1a4f78a-4395-4b40-825c-1b4c800dac49 | prometheus | stack1-cpu_alarm_high-ydpclfw4xmwl | ok    | low      | True    |
      | 235d7eaf-82af-4c49-ad7e-a11f513b8460 | prometheus | stack1-cpu_alarm_low-sts42nx34gcc  | alarm | low      | True    |
      +--------------------------------------+------------+------------------------------------+-------+----------+---------+
      
      # go into the "service" project and create a server
      $ export OS_PROJECT_NAME=service
      $ openstack --os-compute-api-version 2.37 server create --flavor m1.small --image cirros --nic none --wait test-server-service
      
      # wait a little, go back into the "second" project and view the alarms - they now show insufficient data
      $ export OS_PROJECT_NAME=second
      $ openstack alarm list
      +--------------------------------------+------------+------------------------------------+-------------------+----------+---------+
      | alarm_id                             | type       | name                               | state             | severity | enabled |
      +--------------------------------------+------------+------------------------------------+-------------------+----------+---------+
      | b1a4f78a-4395-4b40-825c-1b4c800dac49 | prometheus | stack1-cpu_alarm_high-ydpclfw4xmwl | insufficient data | low      | True    |
      | 235d7eaf-82af-4c49-ad7e-a11f513b8460 | prometheus | stack1-cpu_alarm_low-sts42nx34gcc  | insufficient data | low      | True    |
      +--------------------------------------+------------+------------------------------------+-------------------+----------+---------+
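To confirm that a same-named series from the "service" project is indeed present, one can ask Prometheus directly which label sets exist for the metric. The `/api/v1/series` endpoint is standard Prometheus; the hostname below is a placeholder assumption for your deployment's route. A small sketch that just builds the request URL for use with curl:

```python
from urllib.parse import urlencode

# Placeholder endpoint; substitute your deployment's Prometheus route.
PROM = "http://prometheus.example:9090"

def series_url(metric: str) -> str:
    """URL listing every label set (including project_id) for a metric."""
    return f"{PROM}/api/v1/series?" + urlencode({"match[]": metric})

# Fetch with e.g.: curl "<url>" - one entry per series; after step 2 of
# the reproducer, a series from the service project should appear here.
print(series_url("ceilometer_cpu"))
```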

              Jaromir Wysoglad
              Leonid Natapov