Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.17.z
Component/s: Cluster Autoscaler
Labels:
None

Activity Type:
None
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

The pod 'cluster-autoscaler-default' in an OCP Konflux cluster is in CrashLoopBackOff.

Logs show:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x1dbd7d2]

goroutine 217 [running]:
k8s.io/autoscaler/cluster-autoscaler/processors/nodegroupconfig.(*DelegatingNodeGroupConfigProcessor).GetIgnoreDaemonSetsUtilization(0xc000be2360, {0x0?, 0x0?})
    /go/src/k8s.io/autoscaler/cluster-autoscaler/processors/nodegroupconfig/node_group_config_processor.go:113 +0x52
k8s.io/autoscaler/cluster-autoscaler/core/scaledown/actuation.(*Actuator).scaleDownNodeToReport(0xc000634ea0, 0xc00b73e008, 0x0)
    /go/src/k8s.io/autoscaler/cluster-autoscaler/core/scaledown/actuation/actuator.go:313 +0xd8
k8s.io/autoscaler/cluster-autoscaler/core/scaledown/actuation.(*Actuator).deleteAsyncEmpty(0xc000634ea0, {0xc012d4cce0, 0x3, 0xc01d2d8100?})
    /go/src/k8s.io/autoscaler/cluster-autoscaler/core/scaledown/actuation/actuator.go:150 +0x456
k8s.io/autoscaler/cluster-autoscaler/core/scaledown/actuation.(*Actuator).StartDeletion(0xc000634ea0, {0xc01d2d8100, 0xd, 0x10}, {0x3e7b360, 0x0, 0x0})
    /go/src/k8s.io/autoscaler/cluster-autoscaler/core/scaledown/actuation/actuator.go:123 +0x234
k8s.io/autoscaler/cluster-autoscaler/core.(*StaticAutoscaler).RunOnce(0xc0003a80f0, {0x4?, 0xc00055b788?, 0x3e18040?})
    /go/src/k8s.io/autoscaler/cluster-autoscaler/core/static_autoscaler.go:658 +0x32fb
k8s.io/autoscaler/cluster-autoscaler/loop.RunAutoscalerOnce({0x7f06543eb7a0, 0xc0003a80f0}, 0xc000b0f0e0, {0xc257da789066f768?, 0xe109dd7a2c7c?, 0x3e18040?})
    /go/src/k8s.io/autoscaler/cluster-autoscaler/loop/run.go:36 +0x8a
main.run(0xc000b0f0e0, {0x28ecb10, 0xc000521950})
    /go/src/k8s.io/autoscaler/cluster-autoscaler/main.go:620 +0x4c6
main.main.func2({0x0?, 0x0?})
    /go/src/k8s.io/autoscaler/cluster-autoscaler/main.go:712 +0x1f
created by k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run in goroutine 1
    /go/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:213 +0xe6

The logs also contain multiple occurrences of:
status error: GetPodsToMove for "ip-xxx.ec2.internal" returned error: ocp-art-tenant/ose-5-0-ose-insights-runtime-extractor-ppt2d-build-images-2-pod is not replicated

And:
1 clusterstate.go:492] Failed to find readiness information for MachineSet/openshift-machine-api/kflux-ocp-p01-bk796-tenants-us-east-1b

Version-Release number of selected component (if applicable):

1.30.1

How reproducible:

Observed seven times in the same cluster.

Steps to Reproduce:

The bug was detected by Konflux Perf & Scale's oomkill-and-crashloopbackoff-detector tool. For details on the tool, see https://github.com/redhat-appstudio/perfscale/tree/main/tools/oomkill-and-crashloopbackoff-detector.

For details on this occurrence, see the linked Konflux bug.

Additional info:

The tool detected seven occurrences of this bug but was able to collect logs for only two of them. I've attached the tgz from the original bug, which contains the log files.

Assignee: Michael McCune

Reporter: Meirav Rath

QA Contact: Paul Rozehnal

Need Info From: None

Votes: 0

Watchers: 3

Created: 2026/02/25 9:14 PM

Updated: 2026/02/27 1:51 PM
