OpenShift Logging / LOG-7191

CLO panic: runtime error on cluster-logging-operator v5.9.3 causes the operator to enter CrashLoopBackOff


    • Incidents & Support
    • False
    • False
    • NEW
    • NEW
    • Before this update, ClusterLogForwarder (CLF) spec validation could fail with a nil pointer runtime error. With this update, additional nil checks resolve the issue.
    • Bug Fix
    • Log Collection - Sprint 271, Log Collection - Sprint 273, Logging - Sprint 274, Logging - Sprint 275, Logging - Sprint 276
    • Moderate

      ==== ISSUE Found

      1) CLO panic: a runtime error on cluster-logging-operator v5.9.3 puts the cluster-logging operator into CrashLoopBackOff

      $ oc get csv -n openshift-logging
      NAME                                               DISPLAY                                          VERSION                 REPLACES                                   PHASE
      cluster-logging.v5.9.3                             Red Hat OpenShift Logging                        5.9.3                   cluster-logging.v5.8.8                     Installing
      
      $ omc get pods
      NAME                                        READY   STATUS             RESTARTS   AGE
      cluster-logging-operator-xxxxxxxxxxxxxxxx   0/1     CrashLoopBackOff   47         3h
      
      
      
      $ omc logs pods/cluster-logging-operator-xxxxxxxxxxxxxxxx
       
       2025-05-21T07:10:53.729192378Z {"_ts":"2025-05-21T07:10:53.729143905Z","_level":"0","_component":"cluster-logging-operator","_message":"Starting workers","controller":"clusterlogforwarder","controllerGroup":"logging.openshift.io","controllerKind":"ClusterLogForwarder","worker count":1}
      
      2025-05-21T07:10:53.730721603Z {"_ts":"2025-05-21T07:10:53.730672426Z","_level":"0","_component":"cluster-logging-operator","_message":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","ClusterLogging":{"name":"guest-logforwarder","namespace":"guest-logging"},"controller":"clusterlogging","controllerGroup":"logging.openshift.io","controllerKind":"ClusterLogging","name":"guest-logforwarder","namespace":"guest-logging","reconcileID":"c60bcd54-90d8-4bbf-89dd-e209e109381a"}
      
      2025-05-21T07:10:53.733919919Z panic: runtime error: invalid memory address or nil pointer dereference [recovered]
      
      2025-05-21T07:10:53.733919919Z     panic: runtime error: invalid memory address or nil pointer dereference
      
      2025-05-21T07:10:53.733919919Z [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1698b4f]
      2025-05-21T07:10:53.733919919Z 
      2025-05-21T07:10:53.733919919Z goroutine 428 [running]:
      2025-05-21T07:10:53.733919919Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
      2025-05-21T07:10:53.733919919Z     /remote-source/cluster-logging-operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/controller/controller.go:116 +0x1e5
      2025-05-21T07:10:53.733948653Z panic({0x188a380?, 0x2aca780?})
      2025-05-21T07:10:53.733960402Z     /usr/lib/golang/src/runtime/panic.go:914 +0x21f
      2025-05-21T07:10:53.733972593Z github.com/openshift/cluster-logging-operator/internal/validations/clusterlogforwarder.verifyHostNameNotFilteredForGCL({0xc02033c500?, 0x1, 0xc0006971b0?}, {0xc02033c520, 0x1, 0x0?}, 0x0?, 0x0?)
      2025-05-21T07:10:53.734000299Z     /remote-source/cluster-logging-operator/app/internal/validations/clusterlogforwarder/validate_clusterlogforwarderspec.go:471 +0x10f
      2025-05-21T07:10:53.734010100Z github.com/openshift/cluster-logging-operator/internal/validations/clusterlogforwarder.verifyPipelines({0xc000821008, 0x12}, 0xc02042a7a0, 0xc008bce100)
      2025-05-21T07:10:53.734019851Z     /remote-source/cluster-logging-operator/app/internal/validations/clusterlogforwarder/validate_clusterlogforwarderspec.go:159 +0x1513
      2025-05-21T07:10:53.734029165Z github.com/openshift/cluster-logging-operator/internal/validations/clusterlogforwarder.ValidateInputsOutputsPipelines({{{0x1739ffb, 0x13}, {0xc01f9fe1f8, 0x17}}, {{0xc000821008, 0x12}, {0x0, 0x0}, {0xc0006971a0, 0xd}, ...}, ...}, ...)
      2025-05-21T07:10:53.734073360Z     /remote-source/cluster-logging-operator/app/internal/validations/clusterlogforwarder/validate_clusterlogforwarderspec.go:48 +0x38f
      2025-05-21T07:10:53.734073360Z github.com/openshift/cluster-logging-operator/internal/validations/clusterlogforwarder.Validate({{{0x1739ffb, 0x13}, {0xc01f9fe1f8, 0x17}}, {{0xc000821008, 0x12}, {0x0, 0x0}, {0xc0006971a0, 0xd}, ...}, ...}, ...)
      2025-05-21T07:10:53.734118586Z     /remote-source/cluster-logging-operator/app/internal/validations/clusterlogforwarder/validations.go:11 +0xee
      2025-05-21T07:10:53.734135521Z github.com/openshift/cluster-logging-operator/internal/k8s/loader.FetchClusterLogForwarder({_, _}, {_, _}, {_, _}, _, _)
      2025-05-21T07:10:53.734144687Z     /remote-source/cluster-logging-operator/app/internal/k8s/loader/load.go:73 +0x8d0
      2025-05-21T07:10:53.734153636Z github.com/openshift/cluster-logging-operator/internal/controller/clusterlogging.(*ReconcileClusterLogging).Reconcile(0xc0006205a0, {0x1dc5998?, 0xc02025e9c0}, {{{0xc0012e9d30?, 0xd?}, {0xc0012c58f0?, 0x12?}}})
      2025-05-21T07:10:53.734179858Z     /remote-source/cluster-logging-operator/app/internal/controller/clusterlogging/clusterlogging_controller.go:107 +0x965
      2025-05-21T07:10:53.734188794Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1dc8d88?, {0x1dc5998?, 0xc02025e9c0?}, {{{0xc0012e9d30?, 0xb?}, {0xc0012c58f0?, 0x0?}}})
      2025-05-21T07:10:53.734214425Z     /remote-source/cluster-logging-operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/controller/controller.go:119 +0xb7
      2025-05-21T07:10:53.734223088Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000322140, {0x1dc59d0, 0xc0000f5270}, {0x1916200?, 0xc00007eae0?})
      2025-05-21T07:10:53.734232496Z     /remote-source/cluster-logging-operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/controller/controller.go:316 +0x3cc
      2025-05-21T07:10:53.734241791Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000322140, {0x1dc59d0, 0xc0000f5270})
      2025-05-21T07:10:53.734260030Z     /remote-source/cluster-logging-operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/controller/controller.go:266 +0x1af
      2025-05-21T07:10:53.734268991Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
      2025-05-21T07:10:53.734268991Z     /remote-source/cluster-logging-operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/controller/controller.go:227 +0x79
      2025-05-21T07:10:53.734278512Z created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 108
      2025-05-21T07:10:53.734278512Z     /remote-source/cluster-logging-operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/controller/controller.go:223 +0x565
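
      The top application frame in the trace is verifyHostNameNotFilteredForGCL, and the fault address (addr=0x18) is consistent with reading a field through a nil pointer. The Go sketch below illustrates that class of failure and the kind of nil guard the release note describes. It is not the operator's actual code: the Filter and PruneSpec types, the verifyHostnameNotPruned function, and the ".hostname" path are simplified, hypothetical stand-ins chosen for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// PruneSpec and Filter are hypothetical, simplified stand-ins for the
// operator's filter types. They are NOT the real CLO API.
type PruneSpec struct {
	In []string // field paths the filter removes
}

type Filter struct {
	Type  string
	Prune *PruneSpec // may be nil, e.g. when the spec is incomplete
}

var errHostnamePruned = errors.New("prune filter removes .hostname")

// verifyHostnameNotPruned reports an error if any referenced filter prunes
// ".hostname". Looking up an unknown name in a map of pointers yields nil,
// so every pointer is checked before its fields are read.
func verifyHostnameNotPruned(filterRefs []string, filters map[string]*Filter) error {
	for _, name := range filterRefs {
		f := filters[name] // nil when the name is not defined
		if f == nil || f.Prune == nil {
			// Without this guard, f.Prune.In would panic with
			// "invalid memory address or nil pointer dereference".
			continue
		}
		for _, path := range f.Prune.In {
			if path == ".hostname" {
				return fmt.Errorf("filter %q: %w", name, errHostnamePruned)
			}
		}
	}
	return nil
}

func main() {
	filters := map[string]*Filter{
		"drop-host": {Type: "prune", Prune: &PruneSpec{In: []string{".hostname"}}},
		"no-spec":   {Type: "prune"}, // Prune left nil on purpose
	}
	// Undefined and incomplete filters are skipped instead of crashing.
	fmt.Println(verifyHostnameNotPruned([]string{"missing", "no-spec"}, filters))
	// A filter that prunes the hostname is reported as a validation error.
	fmt.Println(verifyHostnameNotPruned([]string{"drop-host"}, filters))
}
```

      Returning a validation error, rather than dereferencing unchecked pointers, keeps a malformed spec from taking down the reconciler and sending the pod into CrashLoopBackOff.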
      
      

      2) OLM tried to reinstall the same v5.9.3, but the install failed

      2025-05-20T21:15:56.349536268Z time="2025-05-20T21:15:56Z" level=warning msg="install timed out" csv=cluster-logging.v5.9.3 id=10KfP namespace=openshift-logging phase=Installing
      2025-05-20T21:15:56.349688944Z I0520 21:15:56.349635 1 event.go:364] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"openshift-logging", Name:"cluster-logging.v5.9.3", UID:"84d139de-f045-4b57-a9b7-413a072d516f", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"3342785258", FieldPath:""}): type: 'Warning' reason: 'InstallCheckFailed' install timeout
      2025-05-20T21:15:56.989629579Z time="2025-05-20T21:15:56Z" level=warning msg="needs reinstall: waiting for deployment cluster-logging-operator to become ready: deployment \"cluster-logging-operator\" not available: Deployment does not have minimum availability." csv=cluster-logging.v5.9.3 id=4tg1a namespace=openshift-logging phase=Failed strategy=deployment
      2025-05-20T21:15:56.989720369Z I0520 21:15:56.989658 1 event.go:364] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"openshift-logging", Name:"cluster-logging.v5.9.3", UID:"84d139de-f045-4b57-a9b7-413a072d516f", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"3342794926", FieldPath:""}): type: 'Normal' reason: 'NeedsReinstall' installing: waiting for deployment cluster-logging-operator to become ready: deployment "cluster-logging-operator" not available: Deployment does not have minimum availability.
      2025-05-20T21:15:57.735636808Z time="2025-05-20T21:15:57Z" level=info msg="scheduling ClusterServiceVersion for install" csv=cluster-logging.v5.9.3 id=FYuL1 namespace=openshift-logging phase=Pending
      2025-05-20T21:15:57.735744720Z I0520 21:15:57.735717 1 event.go:364] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"openshift-logging", Name:"cluster-logging.v5.9.3", UID:"84d139de-f045-4b57-a9b7-413a072d516f", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"3342794942", FieldPath:""}): type: 'Normal' reason: 'AllRequirementsMet' all requirements found, attempting install
      2025-05-20T21:15:58.366345906Z I0520 21:15:58.366286 1 event.go:364] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"openshift-logging", Name:"cluster-logging.v5.9.3", UID:"84d139de-f045-4b57-a9b7-413a072d516f", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"3342794977", FieldPath:""}): type: 'Normal' reason: 'InstallSucceeded' waiting for install components to report healthy
      2025-05-20T21:15:58.949735302Z time="2025-05-20T21:15:58Z" level=info msg="install strategy successful" csv=cluster-logging.v5.9.3 id=DgZGm namespace=openshift-logging phase=Installing strategy=deployment
      2025-05-20T21:15:58.949881279Z I0520 21:15:58.949853 1 event.go:364] Event(v1.ObjectReference{Kind:"ClusterServiceVersion", Namespace:"openshift-logging", Name:"cluster-logging.v5.9.3", UID:"84d139de-f045-4b57-a9b7-413a072d516f", APIVersion:"operators.coreos.com/v1alpha1", ResourceVersion:"3342795004", FieldPath:""}): type: 'Normal' reason: 'InstallWaiting' installing: waiting for deployment cluster-logging-operator to become ready: deployment "cluster-logging-operator" not available: Deployment does not have minimum availability.

       

      === Action Taken

       

      Restarting the cluster-logging-operator pods did not resolve the problem, so the customer reinstalled the operator, which resolved the issue.

      We also found another KCS article:

      https://access.redhat.com/solutions/6685111

      It describes a similar CLO panic, but the detailed stack trace is different.

       

      Please investigate and tell us the root cause of this panic. If you need any more information, please let us know.

       

      Thank you!

              vparfono Vitalii Parfonov
              rhn-support-jayu Jacob Yu