OCPBUGS-66405 (OpenShift Bugs)

VPA recommender crashlooping on "fatal error: concurrent map writes"

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • 4.20
    • Pod Autoscaler
    • AUTOSCALE - Sprint 281, AUTOSCALE - Sprint 282

      Description of problem:

      Deployed a brand new OCP cluster and installed VPA. The vpa-recommender pod is crashlooping, and its logs show a fatal error with many stack traces.
      
      NAME                                                READY   STATUS    RESTARTS          AGE
      vertical-pod-autoscaler-operator-76768f8dbf-lwswt   1/1     Running   0                 14h
      vpa-admission-plugin-default-b84b8f5cc-f5b75        1/1     Running   0                 14h
      vpa-recommender-default-d8f9954d8-ljzxr             1/1     Running   119 (4m38s ago)   14h
      vpa-updater-default-64c4cc9876-lmc85                1/1     Running   0                 14h
      
      

      Version-Release number of selected component (if applicable):

      OCP - 4.20.6
      VPA - 4.20.0-202511250912 

      How reproducible:

      Consistently reproducing with this environment

      Steps to Reproduce:

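          1. Deploy a new OCP 4.20.6 cluster.
          2. Install VPA 4.20.0-202511250912.
          3. Check the VPA pods; vpa-recommender-default restarts repeatedly.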
          

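      Actual results:

          The vpa-recommender pod crashloops (119 restarts in 14h) with "fatal error: concurrent map writes".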

      Expected results:

          The vpa-recommender pod runs without restarting.

      Additional info:

      I1204 13:31:34.581364       1 cluster.go:412] "Removing empty and not contributive AggregateCollectionState" key={}
      I1204 13:31:34.581378       1 cluster.go:412] "Removing empty and not contributive AggregateCollectionState" key={}
      I1204 13:31:34.581388       1 cluster.go:412] "Removing empty and not contributive AggregateCollectionState" key={}
      I1204 13:31:34.581405       1 cluster.go:412] "Removing empty and not contributive AggregateCollectionState" key={}
      fatal error: concurrent map writes
      
      goroutine 719 [running]:
      internal/runtime/maps.fatal({0x217e142?, 0xc00195ada0?})
      	/usr/lib/golang/src/runtime/panic.go:1058 +0x18
      k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/metrics/recommender.(*ObjectCounter).Add(0xc0005882b8, 0xc001f00e40)
      	/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/metrics/recommender/recommender.go:192 +0x2b2
      k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/routines.(*recommender).UpdateVPAs.func1()
      	/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/routines/recommender.go:144 +0xa8
      created by k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/routines.(*recommender).UpdateVPAs in goroutine 1
      	/go/src/k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/routines/recommender.go:131 +0x14b

              John Kyros (jkyros@redhat.com)
              Alex Krzos (akrzos@redhat.com)
              Paul Rozehnal