- Bug
- Resolution: Unresolved
- Normal
- rhos-18.0 Feature Release 1 (Nov 2024)
- None
Autoscaling doesn't get properly configured to reach Prometheus with TLS when the autoscaling resource is created before the metricstorage resource.
Reproduction steps:
- Disable autoscaling and metric storage
- Enable autoscaling
- Wait for autoscaling to fully start
- Enable metric storage
- Try creating an alarm
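The steps above can be sketched as a small shell helper. This is only a sketch under stated assumptions: the control plane CR name (`openstack-control-plane`), the `openstack` namespace, the `spec.telemetry.template.<service>.enabled` field paths, and the `Ready` condition on the autoscaling resource are not taken from this report and may differ in your deployment. The helper names (`telemetry_patch`, `set_telemetry`, `reproduce`) are hypothetical.

```shell
# Sketch only. The CR name, namespace and field paths below are assumptions.
OC="${OC:-oc}"
NAMESPACE="${NAMESPACE:-openstack}"                      # assumed namespace
CONTROLPLANE="${CONTROLPLANE:-openstack-control-plane}"  # hypothetical CR name

# Print a JSON merge patch that sets the "enabled" flag of one telemetry
# sub-service. $1 = autoscaling | metricStorage, $2 = true | false
telemetry_patch() {
  printf '{"spec":{"telemetry":{"template":{"%s":{"enabled":%s}}}}}' "$1" "$2"
}

# Apply the patch to the control plane CR.
set_telemetry() {
  "$OC" patch openstackcontrolplane "$CONTROLPLANE" -n "$NAMESPACE" \
    --type=merge -p "$(telemetry_patch "$1" "$2")"
}

# Run the steps in the order that triggers the bug.
reproduce() {
  set_telemetry autoscaling false
  set_telemetry metricStorage false
  # In practice, wait here until both resources are gone before continuing.
  set_telemetry autoscaling true
  # Wait for autoscaling to fully start (assumes the CR exposes a Ready condition).
  "$OC" wait autoscaling/autoscaling -n "$NAMESPACE" \
    --for=condition=Ready --timeout=300s
  set_telemetry metricStorage true
  # Then create an alarm and watch the aodh-evaluator logs.
}
```

Calling `reproduce` runs the sequence; the block itself only defines the functions, so sourcing it does not touch the cluster.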
There should be an error in the aodh-evaluator logs similar to the following:
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator [-] Failed to evaluate alarm 89b8de0b-8ce3-40e4-8cb6-d676209ba5e2: observabilityclient.prometheus_client.PrometheusAPIClientError: [400] Client sent an HTTP request to an HTTPS server.
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator Traceback (most recent call last):
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator File "/usr/lib/python3.9/site-packages/aodh/evaluator/__init__.py", line 297, in _evaluate_alarm
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator self.evaluators[alarm.type].obj.evaluate(alarm)
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator File "/usr/lib/python3.9/site-packages/aodh/evaluator/threshold.py", line 174, in evaluate
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator evaluation = self.evaluate_rule(alarm.rule)
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator File "/usr/lib/python3.9/site-packages/aodh/evaluator/prometheus.py", line 64, in evaluate_rule
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator metrics = self._get_metric_data(alarm_rule['query'])
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator File "/usr/lib/python3.9/site-packages/aodh/evaluator/prometheus.py", line 47, in _get_metric_data
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator return self._prom.query.query(query, disable_rbac=self._no_rbac)
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator File "/usr/lib/python3.9/site-packages/observabilityclient/v1/python_api.py", line 66, in query
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator query = self.client.rbac.enrich_query(query, disable_rbac=disable_rbac)
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator File "/usr/lib/python3.9/site-packages/observabilityclient/v1/rbac.py", line 103, in enrich_query
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator metric_names = self.client.query.list(disable_rbac=False)
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator File "/usr/lib/python3.9/site-packages/observabilityclient/v1/python_api.py", line 31, in list
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator metrics = self.prom.series(match)
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator File "/usr/lib/python3.9/site-packages/observabilityclient/prometheus_client.py", line 118, in series
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator decoded = self._get("series", {"match[]": matches})
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator File "/usr/lib/python3.9/site-packages/observabilityclient/prometheus_client.py", line 77, in _get
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator raise PrometheusAPIClientError(resp)
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator observabilityclient.prometheus_client.PrometheusAPIClientError: [400] Client sent an HTTP request to an HTTPS server.
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator
2024-10-18 19:58:56.947 16 ERROR aodh.evaluator
2024-10-18 19:58:56.951 16 DEBUG aodh.evaluator [-] Evaluating alarm 6ca7258f-b9c3-4351-9c71-f1690f11b549 _evaluate_alarm /usr/lib/python3.9/site-packages/aodh/evaluator/__init__.py:295
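To check whether a deployment is affected, the evaluator log can be scanned for the error string shown in the log above. A minimal sketch follows; the function name `has_tls_mismatch` is hypothetical, and the pod and container names in the commented usage line are assumptions that depend on your environment.

```shell
# Return 0 if the log text on stdin contains the HTTP-to-HTTPS mismatch error
# reported by the Prometheus client, non-zero otherwise.
has_tls_mismatch() {
  grep -q 'Client sent an HTTP request to an HTTPS server'
}

# Example usage (pod and container names are assumptions):
#   oc logs aodh-0 -c aodh-evaluator | has_tls_mismatch && echo "affected"
```

Matching on the message text rather than the alarm ID or timestamp keeps the check independent of which alarm happened to be evaluated.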
Workaround:
Delete the autoscaling resource with `oc delete autoscaling autoscaling`. It'll get automatically recreated with the correct config.
Proposed fix:
The telemetry-operator's autoscaling controller should watch the metricstorage resource.