-
Bug
-
Resolution: Duplicate
-
Normal
-
None
-
4.16.z
-
Quality / Stability / Reliability
-
False
-
-
None
-
Moderate
-
None
-
None
-
None
-
Rejected
-
Quagsire Sprint 277
-
1
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Description of problem:
Hi Team,
The customer is experiencing sudden spikes in CPU usage on the Master1 node. The ARO SRE team observed that the increased API server load may come from 4,791,737 failing requests/day (~55 per second) from cronjobs/job-controller attempting to create pods in the "et-conference-dev" namespace. These requests fail because the namespace quota is exceeded:
pods "et-abs-cron-job-29243370-z9275" is forbidden: exceeded quota: et-conference-dev-rq, requested: limits.cpu=750m, used: limits.cpu=3850m, limited: limits.cpu=4
Load was also observed coming from Datadog running on the master nodes. The ARO SRE team observed that the registry-server in redhat-operators is highly active, and its logs do not provide useful insights. Additionally, the apiserver and openshift-marketplace are consuming significant CPU resources. These CPU spikes seem to be caused by redhat-operators, similar to the known issues https://issues.redhat.com/browse/OCPBUGS-48696 and https://issues.redhat.com/browse/OCPBUGS-58070
Thanks.
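For reference, the figures quoted in the description can be sanity-checked with a short sketch. It only redoes the arithmetic from the error message and the request count; the helper names `parse_cpu` and `exceeds_quota` are hypothetical and not part of any Kubernetes or OpenShift API, and the parser assumes only plain-core and millicore ("m") CPU quantities.

```python
from decimal import Decimal


def parse_cpu(quantity: str) -> Decimal:
    """Convert a CPU quantity such as '750m' or '4' to cores (assumes only these two forms)."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def exceeds_quota(requested: str, used: str, limit: str) -> bool:
    """True when admitting the request would push usage above the quota limit."""
    return parse_cpu(used) + parse_cpu(requested) > parse_cpu(limit)


# Values copied from the "exceeded quota" message in the description.
requested, used, limit = "750m", "3850m", "4"
headroom = parse_cpu(limit) - parse_cpu(used)

print(exceeds_quota(requested, used, limit))  # True: 3.85 + 0.75 = 4.6 cores > 4
print(headroom)                               # 0.15 cores left under the quota

# Failing requests per day, as reported by the ARO SRE team.
failures_per_day = 4_791_737
print(round(failures_per_day / 86_400, 1))    # 55.5 per second
```

Only 150m of limits.cpu remains under the quota, so every pod requesting 750m is rejected, and each retry by the controller adds another failing request to the API server.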
Version-Release number of selected component (if applicable):
ARO: 4.16.38
How reproducible:
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info:
- duplicates
OCPBUGS-48697 OLMv0: excessive catalog source snapshots cause severe performance regression [openshift-4.15.z] (Closed)
OCPBUGS-58070 High latency etcd disk writes due to openshift-marketplace pods/OLM (Closed)
OCPBUGS-61920 Operators are unable to Install/Update due to RPC DeadlineExceeded while listing bundles error. (Closed)