Type: Bug
Resolution: Not a Bug
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.9
Component/s: Pod Autoscaler
Labels:
- migrated_from_bz

Activity Type:
Quality / Stability / Reliability
Blocked:
None
Blocked Reason:
None
Story Points:
3
Severity:
Low
Regression:
None
Architecture:
Unspecified
Target Backport Versions:
None
Target Version:
4.13
Release Blocker:
None
Sprint:
OCPNODE Sprint 234 (Green)
sprint_count:
1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
None
Release Note Type:
If docs needed, set a value
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

After OCS installation, there are multiple Events of Warning type from
horizontalpodautoscaler/noobaa-endpoint complaining that openshift is
"unable to get metrics for resource cpu". The stream of such events stops about
15 minutes after OCS installation.

The NooBaa endpoint deployment is controlled by a horizontal pod autoscaler (HPA), which is the originator of these events.

This is the cause of https://bugzilla.redhat.com/show_bug.cgi?id=1885524

Version-Release number of selected component (if applicable):

OCP 4.9.0-0.nightly-2021-11-24-090558
OCS 4.9.0-249.ci

How reproducible:

Steps to Reproduce
==================

1. Install an OCP/OCS cluster
2. Log in to the OCP Console and open the Cluster Overview dashboard
(Home -> Overview -> Cluster)
3. See the "Recent events" list

Alternatively, open the Events page or list the events with the command-line client:
`oc get events -n openshift-storage`.

Actual results
==============

After OCS installation, I see warnings related to HPA noobaa-endpoint such as:

```
15m Warning FailedGetResourceMetric horizontalpodautoscaler/noobaa-endpoint unable to get metrics for resource cpu: no metrics returned from resource metrics API
15m Warning FailedComputeMetricsReplicas horizontalpodautoscaler/noobaa-endpoint invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
12m Warning FailedGetResourceMetric horizontalpodautoscaler/noobaa-endpoint did not receive metrics for any ready pods
12m Warning FailedComputeMetricsReplicas horizontalpodautoscaler/noobaa-endpoint invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
```
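To quantify these warnings rather than eyeball them, the event listing can be tallied by reason. The following is a minimal sketch, not part of the original report: it assumes the default column order of `oc get events` (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE), and the function name `count_hpa_warnings` is hypothetical.

```python
from collections import Counter

# Object name as it appears in the OBJECT column of the events quoted above.
HPA_OBJECT = "horizontalpodautoscaler/noobaa-endpoint"


def count_hpa_warnings(events_text: str) -> Counter:
    """Count Warning events for the noobaa-endpoint HPA, keyed by reason.

    Assumes whitespace-separated columns in the default order:
    LAST SEEN, TYPE, REASON, OBJECT, MESSAGE.
    """
    counts = Counter()
    for line in events_text.splitlines():
        # Split off the first four columns; the message may contain spaces.
        fields = line.split(None, 4)
        if len(fields) < 4:
            continue
        _age, event_type, reason, obj = fields[:4]
        if event_type == "Warning" and obj == HPA_OBJECT:
            counts[reason] += 1
    return counts
```

Applied to the four events quoted above, this should report two `FailedGetResourceMetric` and two `FailedComputeMetricsReplicas` warnings.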

Expected results
================

An admin should not have to wait 15 minutes after OCS Storage Cluster
installation for these events to stop.

There should be no such Warning events from
horizontalpodautoscaler/noobaa-endpoint right after OCS installation.

Additional info
===============

About 15 minutes after OCS installation, the horizontalpodautoscaler
noobaa-endpoint seems to work fine (I don't claim it works as expected, only
that it's not in an error state):

```
$ ./oc describe horizontalpodautoscaler/noobaa-endpoint -n openshift-storage
Name: noobaa-endpoint
Namespace: openshift-storage
Labels: app=noobaa
Annotations: <none>
CreationTimestamp: Mon, 05 Oct 2020 19:58:22 +0200
Reference: Deployment/noobaa-endpoint
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): 0% (2m) / 80%
Min replicas: 1
Max replicas: 2
Deployment pods: 1 current / 1 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ScaleDownStabilized recent recommendations were higher than current one, applying the highest recent recommendation
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 19m (x2 over 20m) horizontal-pod-autoscaler unable to get metrics for resource cpu: no metrics returned from resource metrics API
Warning FailedComputeMetricsReplicas 19m (x2 over 20m) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Warning FailedComputeMetricsReplicas 17m (x10 over 19m) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
Warning FailedGetResourceMetric 17m (x11 over 19m) horizontal-pod-autoscaler did not receive metrics for any ready pods
```
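The claim that the warnings have stopped can be checked from the Age column of the Events table above: if the newest warning is already several minutes old, no new ones are arriving. Below is a minimal sketch under my own assumptions: ages are composed of `s`, `m`, `h`, `d` and `y` units as in the output above, and the helper names `age_to_seconds` and `newest_warning_age` are hypothetical.

```python
import re
from typing import Optional

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 365 * 86400}

# Matches an Events row of `oc describe`, for example:
#   Warning  FailedGetResourceMetric  19m (x2 over 20m)  horizontal-pod-autoscaler  <message>
_EVENT_RE = re.compile(
    r"^\s*(?P<type>Warning|Normal)\s+(?P<reason>\S+)\s+(?P<age>\d\S*)"
    r"(?:\s+\(x(?P<count>\d+) over (?P<span>\S+)\))?"
    r"\s+(?P<source>\S+)\s+(?P<message>.*)$"
)


def age_to_seconds(age: str) -> int:
    """Convert an age such as '19m' or '2m30s' to seconds."""
    parts = re.findall(r"(\d+)([smhdy])", age)
    # Reject anything that is not fully made of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognized age: {age!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def newest_warning_age(describe_text: str) -> Optional[int]:
    """Return the age in seconds of the most recent Warning event, or None."""
    ages = []
    for line in describe_text.splitlines():
        match = _EVENT_RE.match(line)
        if match and match.group("type") == "Warning":
            ages.append(age_to_seconds(match.group("age")))
    return min(ages) if ages else None
```

For the output above, the newest warning is 17 minutes old, which is consistent with the event stream having stopped.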


Attachments:
- endpoints (2 kB), 2023/04/06 9:20 PM, Jose Valdes
- events-storage.txt (174 kB), 2023/04/06 9:20 PM, Jose Valdes
- Screenshot from 2023-04-05 15-48-18.png (173 kB), 2023/04/06 2:40 PM, Jose Valdes
- Screenshot from 2023-04-06 13-41-30.png (241 kB), 2023/04/06 9:24 PM, Jose Valdes

Assignee: Jose Valdes

Reporter: Danny Zaken

QA Contact: Weinan Liu

Contributing Groups: Red Hat Employee

Need Info From: Joel Smith

Votes: 0

Watchers: 6

Created: 2021/12/05 11:14 AM

Updated: 2025/07/27 5:46 PM

Resolved: 2023/04/12 6:29 PM
