Type: Bug
Resolution: Done-Errata
Priority: Critical
Fix Version/s: None
Affects Version/s: 4.13.z
Component/s: Pod Autoscaler
Labels:
- triaged

Severity: Critical
Regression: No
Story Points: 3
Sprint: PODAUTO - Sprint 244
sprint_count: 1
Release Blocker: Proposed
Blocked: False
Blocked Reason: None
Release Note Text:

Cause: What actions or circumstances cause this bug to present.
Custom Metrics Autoscaler version 2.11.2-311 was released without a required volumeMount in the operator deployment, which caused the Custom Metrics Autoscaler operator pod to restart every 15 minutes. This version adds the required volumeMount to the operator deployment, and the operator no longer restarts every 15 minutes.

Release Note Type: Bug Fix
Release Note Status: Proposed

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:
PX Priority Data:

Description of problem:

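The custom-metrics-autoscaler-operator pod restarts every 15 minutes. Before each restart, the operator log shows the cert rotator timing out while checking that its certs are mounted: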
2023-10-24T09:41:27Z    ERROR   cert-rotation   max retries for checking certs existence        {"error": "timed out waiting for the condition"}
github.com/open-policy-agent/cert-controller/pkg/rotator.(*CertRotator).ensureCertsMounted
        /remote-source/keda-operator/app/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:853
2023-10-24T09:41:27Z    INFO    cert-rotation   stopping cert rotator controller
2023-10-24T09:41:27Z    INFO    Stopping and waiting for non leader election runnables
2023-10-24T09:41:27Z    INFO    Stopping and waiting for leader election runnables
2023-10-24T09:41:27Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "kedacontroller", "controllerGroup": "keda.sh", "controllerKind": "KedaController"}
2023-10-24T09:41:27Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "secret", "controllerGroup": "", "controllerKind": "Secret"}
2023-10-24T09:41:27Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "configmap", "controllerGroup": "", "controllerKind": "ConfigMap"}
2023-10-24T09:41:27Z    INFO    All workers finished    {"controller": "secret", "controllerGroup": "", "controllerKind": "Secret"}
2023-10-24T09:41:27Z    INFO    All workers finished    {"controller": "configmap", "controllerGroup": "", "controllerKind": "ConfigMap"}
2023-10-24T09:41:27Z    INFO    All workers finished    {"controller": "kedacontroller", "controllerGroup": "keda.sh", "controllerKind": "KedaController"}
2023-10-24T09:41:27Z    INFO    Shutdown signal received, waiting for all workers to finish     {"controller": "cert-rotator"}
2023-10-24T09:41:27Z    INFO    All workers finished    {"controller": "cert-rotator"}
2023-10-24T09:41:27Z    INFO    Stopping and waiting for caches
2023-10-24T09:41:27Z    INFO    Stopping and waiting for webhooks
2023-10-24T09:41:27Z    INFO    Stopping and waiting for HTTP servers
2023-10-24T09:41:27Z    INFO    controller-runtime.metrics      Shutting down metrics server with timeout of 1 minute
2023-10-24T09:41:27Z    INFO    shutting down server    {"kind": "health probe", "addr": "[::]:8081"}
2023-10-24T09:41:27Z    INFO    Wait completed, proceeding to shutdown the manager
2023-10-24T09:41:27Z    ERROR   setup   problem running manager {"error": "could not mount certs", "errorVerbose": "could not mount certs\ngithub.com/open-policy-agent/cert-controller/pkg/rotator.(*CertRotator).Start\n\t/remote-source/keda-operator/app/vendor/github.com/open-policy-agent/cert-controller/pkg/rotator/rotator.go:286\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/remote-source/keda-operator/app/vendor/sigs.k8s.io/controller-runtime/pkg/manager/runnable_group.go:223\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1598"}
main.main
        /remote-source/keda-operator/app/main.go:144
runtime.main
        /usr/lib/golang/src/runtime/proc.go:250
2023-10-24T09:41:27Z    ERROR   error received after stop sequence was engaged  {"error": "leader election lost"}
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1
        /remote-source/keda-operator/app/vendor/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:490
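
When triaging a saved copy of the operator log, the two cert-related ERROR lines above are the useful signature. The following is a minimal sketch, not part of the operator or of any Red Hat tooling, that scans log text for them. The function name find_cert_mount_failures and the sample string are assumptions made for illustration; the sample is abbreviated from the log above.

```python
import re

# Timestamp, level, and remainder of a log line, in the layout shown in the log above.
LINE_RE = re.compile(r"^(\S+Z)\s+(INFO|ERROR)\s+(.*)$")

# Message fragments taken verbatim from the ERROR lines in this report.
SIGNATURES = (
    "max retries for checking certs existence",
    "could not mount certs",
)


def find_cert_mount_failures(log_text):
    """Return the timestamps of ERROR lines that match a cert mount signature."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # stack-trace continuation lines carry no timestamp
        timestamp, level, rest = match.groups()
        if level == "ERROR" and any(sig in rest for sig in SIGNATURES):
            hits.append(timestamp)
    return hits


if __name__ == "__main__":
    # Abbreviated sample built from the log in this report.
    sample = (
        '2023-10-24T09:41:27Z    ERROR   cert-rotation   max retries for checking certs existence        {"error": "timed out waiting for the condition"}\n'
        "github.com/open-policy-agent/cert-controller/pkg/rotator.(*CertRotator).ensureCertsMounted\n"
        "2023-10-24T09:41:27Z    INFO    cert-rotation   stopping cert rotator controller\n"
        '2023-10-24T09:41:27Z    ERROR   setup   problem running manager {"error": "could not mount certs"}\n'
    )
    print(find_cert_mount_failures(sample))
```

The "leader election lost" ERROR line is deliberately not matched, since it is a consequence of the shutdown rather than the cause.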

Version-Release number of selected component (if applicable):

2.11.2-311

How reproducible:

100%

Steps to Reproduce:

1. Install Custom Metrics Autoscaler
2. Create a KedaController object
3. Wait 30 minutes; the CMA operator pod should show 2 restarts

Actual results:

3 or 4 restarts within an hour

Expected results:

0 or 1 restart within an hour
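
The actual and expected results above amount to a simple threshold on restarts per hour. A minimal sketch of that check follows, assuming the restart times have already been collected by some other means. The function names and the example timestamps are hypothetical; only 09:41:27 comes from the log in this report, and the other times are placed 15 minutes apart to mirror the reported pattern.

```python
from datetime import datetime, timedelta


def restarts_in_window(restart_times, window_end, window=timedelta(hours=1)):
    """Count restarts that fall in the interval (window_end - window, window_end]."""
    window_start = window_end - window
    return sum(1 for t in restart_times if window_start < t <= window_end)


def exceeds_expected(restart_times, window_end, max_expected=1):
    """True if the last hour saw more restarts than the expected 0 or 1."""
    return restarts_in_window(restart_times, window_end) > max_expected


if __name__ == "__main__":
    # Hypothetical restart times, 15 minutes apart.
    restarts = [
        datetime(2023, 10, 24, 9, 41, 27),
        datetime(2023, 10, 24, 9, 56, 27),
        datetime(2023, 10, 24, 10, 11, 27),
        datetime(2023, 10, 24, 10, 26, 27),
    ]
    now = datetime(2023, 10, 24, 10, 30, 0)
    print(restarts_in_window(restarts, now), exceeds_expected(restarts, now))
```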

Additional info:
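
The release note attributes the restarts to a volumeMount missing from the operator deployment. As a minimal sketch only, the check below walks a Deployment manifest that has already been loaded into a Python dict and lists the containers that do not mount a given volume. The function name, the container name, and the volume name "certificates" are assumptions for illustration; this report does not state the actual volume name.

```python
def containers_missing_mount(deployment, volume_name):
    """Return names of containers in a Deployment manifest that do not mount volume_name."""
    pod_spec = deployment["spec"]["template"]["spec"]
    missing = []
    for container in pod_spec.get("containers", []):
        mounts = container.get("volumeMounts") or []
        if not any(mount.get("name") == volume_name for mount in mounts):
            missing.append(container["name"])
    return missing


if __name__ == "__main__":
    # Hypothetical manifest fragment: the volume is declared but never mounted.
    deployment = {
        "spec": {
            "template": {
                "spec": {
                    "volumes": [{"name": "certificates"}],
                    "containers": [{"name": "operator"}],
                }
            }
        }
    }
    print(containers_missing_mount(deployment, "certificates"))
```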

links to

RHSA-2023:122898 Custom Metrics Autoscaler Operator for Red Hat 2.11.2-313 OpenShift Bug Fixes

Assignee: Joel Smith

Reporter: Raul Fernandez

QA Contact: Weinan Liu

Contributors: Raul Fernandez

Votes: 0

Watchers: 6

Created: 2023/10/24 11:29 PM

Updated: 2023/10/30 1:05 AM

Resolved: 2023/10/30 1:05 AM
