OCPBUGS-6619

TALM pod crashes when a managed policy in a CGU has no annotations

    • Bug
    • Resolution: Done
    • 4.10.z
    • TALM Operator
    • Moderate
    • False

      Description of problem:

      The TALM pod crashes when a managed policy in the CGU has no annotations.
      
      openshift-operators                                cluster-group-upgrades-controller-manager-5db65bc474-jlbvf                               1/2     CrashLoopBackOff   6 (3m ago)    11m
      
      
      panic: assignment to entry in nil map
      
      goroutine 581 [running]:
      github.com/openshift-kni/cluster-group-upgrades-operator/controllers.(*ClusterGroupUpgradeReconciler).copyManagedInformPolicy(0xc000a31480, {0x18cbaf8, 0xc000768c90}, 0xc000614700, 0xc000370128)
          /workspace/controllers/clustergroupupgrade_controller.go:907 +0x4c8
      github.com/openshift-kni/cluster-group-upgrades-operator/controllers.(*ClusterGroupUpgradeReconciler).reconcileResources(0xc000a31480, {0x18cbaf8, 0xc000768c90}, 0xc000614700, {0xc0001803c8, 0x1, 0xc0001803c8})
          /workspace/controllers/clustergroupupgrade_controller.go:1432 +0x9a
      github.com/openshift-kni/cluster-group-upgrades-operator/controllers.(*ClusterGroupUpgradeReconciler).Reconcile(0xc000a31480, {0x18cbaf8, 0xc000768c90}, {{{0xc000911470, 0x9}, {0xc00066d0e0, 0x1d}}})
          /workspace/controllers/clustergroupupgrade_controller.go:212 +0x19db
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0004eb180, {0x18cba50, 0xc0006a6140}, {0x154bd20, 0xc000e8c000})
          /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298 +0x303
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0004eb180, {0x18cba50, 0xc0006a6140})
          /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 +0x205
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
          /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 +0x85
      created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
          /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210 +0x354
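
      The panic message above is what Go raises when code writes to a map that was never allocated. The following is a minimal sketch of that failure mode and of the usual guard, assuming the write at clustergroupupgrade_controller.go:907 targets the policy's annotations map. The type objectMeta, both helper functions, and the annotation key are hypothetical and are not taken from the operator's source.

```go
package main

import "fmt"

// objectMeta is a hypothetical stand-in for the metadata of a Policy object.
// When a policy is created without annotations, the map is left nil.
type objectMeta struct {
	Annotations map[string]string
}

// unsafeSetAnnotation writes to the map without checking it first.
// On a nil map this panics; the panic is recovered here only so that
// the example can keep running and report it as an error.
func unsafeSetAnnotation(meta *objectMeta, key, value string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	meta.Annotations[key] = value
	return nil
}

// setAnnotation allocates the map when it is nil, then writes the entry.
func setAnnotation(meta *objectMeta, key, value string) {
	if meta.Annotations == nil {
		meta.Annotations = make(map[string]string)
	}
	meta.Annotations[key] = value
}

func main() {
	// A policy with no annotations, as in the reproduction steps.
	policy := &objectMeta{}

	// The unguarded write fails on the nil map.
	fmt.Println(unsafeSetAnnotation(policy, "example.io/copied", "true"))

	// The guarded write succeeds.
	setAnnotation(policy, "example.io/copied", "true")
	fmt.Println(policy.Annotations["example.io/copied"])
}
```

      Reading from a nil map is safe in Go and returns the zero value; only assignment panics, which is why the problem would show up only for policies that have no annotations at all.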
      

      Version-Release number of selected component (if applicable):

      4.10.0-202212061900

      How reproducible:

      100%

      Steps to Reproduce:

      1. Create a policy without any annotations.
      2. Create a CGU that uses the above policy as a managed policy.
      
      

      Actual results:

      The TALM pod crashes.

      Expected results:

      The CGU is enabled successfully.

      Additional info:

      talm pod logs: 
      [kni@registry.kni-qe-27 ~]$ oc logs -n openshift-operators cluster-group-upgrades-controller-manager-5db65bc474-jlbvf manager 
      I0124 20:00:57.542085       1 request.go:668] Waited for 1.03274998s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/performance.openshift.io/v1alpha1?timeout=32s
      2023-01-24T20:01:01.151Z    INFO    controller-runtime.metrics    metrics server is starting to listen    {"addr": "127.0.0.1:8080"}
      2023-01-24T20:01:01.163Z    INFO    setup    starting manager
      I0124 20:01:01.163786       1 leaderelection.go:243] attempting to acquire leader lease openshift-operators/9a2365a3.openshift.io...
      2023-01-24T20:01:01.163Z    INFO    controller-runtime.manager    starting metrics server    {"path": "/metrics"}
      I0124 20:01:17.292760       1 leaderelection.go:253] successfully acquired lease openshift-operators/9a2365a3.openshift.io
      2023-01-24T20:01:17.292Z    DEBUG    controller-runtime.manager.events    Normal    {"object": {"kind":"ConfigMap","namespace":"openshift-operators","name":"9a2365a3.openshift.io","uid":"6511ff2a-ca48-4863-b5a2-39a5a7d609cf","apiVersion":"v1","resourceVersion":"26287890"}, "reason": "LeaderElection", "message": "cluster-group-upgrades-controller-manager-5db65bc474-jlbvf_0a0fe4fb-f23a-4481-89cf-c0193a8ff191 became leader"}
      2023-01-24T20:01:17.293Z    INFO    controller-runtime.manager.controller.clustergroupupgrade    Starting EventSource    {"reconciler group": "ran.openshift.io", "reconciler kind": "ClusterGroupUpgrade", "source": "kind source: /, Kind="}
      2023-01-24T20:01:17.293Z    INFO    controller-runtime.manager.controller.clustergroupupgrade    Starting EventSource    {"reconciler group": "ran.openshift.io", "reconciler kind": "ClusterGroupUpgrade", "source": "kind source: policy.open-cluster-management.io/v1, Kind=Policy"}
      2023-01-24T20:01:17.293Z    INFO    controller-runtime.manager.controller.clustergroupupgrade    Starting Controller    {"reconciler group": "ran.openshift.io", "reconciler kind": "ClusterGroupUpgrade"}
      2023-01-24T20:01:17.293Z    INFO    controller-runtime.manager.controller.managedclusterForCGU    Starting EventSource    {"reconciler group": "cluster.open-cluster-management.io", "reconciler kind": "ManagedCluster", "source": "kind source: /, Kind="}
      2023-01-24T20:01:17.292Z    DEBUG    controller-runtime.manager.events    Normal    {"object": {"kind":"Lease","namespace":"openshift-operators","name":"9a2365a3.openshift.io","uid":"c5de1fa9-3847-4b44-8482-9372195e9db6","apiVersion":"coordination.k8s.io/v1","resourceVersion":"26287891"}, "reason": "LeaderElection", "message": "cluster-group-upgrades-controller-manager-5db65bc474-jlbvf_0a0fe4fb-f23a-4481-89cf-c0193a8ff191 became leader"}
      2023-01-24T20:01:17.293Z    INFO    controller-runtime.manager.controller.managedclusterForCGU    Starting EventSource    {"reconciler group": "cluster.open-cluster-management.io", "reconciler kind": "ManagedCluster", "source": "kind source: /, Kind="}
      2023-01-24T20:01:17.293Z    INFO    controller-runtime.manager.controller.managedclusterForCGU    Starting Controller    {"reconciler group": "cluster.open-cluster-management.io", "reconciler kind": "ManagedCluster"}
      2023-01-24T20:01:17.394Z    INFO    controller-runtime.manager.controller.clustergroupupgrade    Starting workers    {"reconciler group": "ran.openshift.io", "reconciler kind": "ClusterGroupUpgrade", "worker count": 1}
      2023-01-24T20:01:17.394Z    INFO    controllers.ClusterGroupUpgrade    Start reconciling CGU    {"name": "talm-test/generated-cgu-blocking-b-fail"}
      2023-01-24T20:01:17.395Z    INFO    controller-runtime.manager.controller.managedclusterForCGU    Starting workers    {"reconciler group": "cluster.open-cluster-management.io", "reconciler kind": "ManagedCluster", "worker count": 1}
      2023-01-24T20:01:17.395Z    INFO    controllers.ManagedClusterForCGU    Reconciling managedCluster to create clusterGroupUpgrade    {"Request.Name": "local-cluster"}
      2023-01-24T20:01:17.395Z    INFO    controllers.ManagedClusterForCGU    cluster is ready    {"Name": "local-cluster"}
      2023-01-24T20:01:17.494Z    INFO    controllers.ClusterGroupUpgrade    Loaded CGU    {"name": "talm-test/generated-cgu-blocking-b-fail", "version": "26276151"}
      2023-01-24T20:01:17.495Z    INFO    controllers.ClusterGroupUpgrade    [doManagedPoliciesExist]    {"policyMap": {"cnfdf34-config-policy":"ztp-site","common-config-policy":"ztp-common","common-subscriptions-policy":"ztp-common","generated-policy-blocking-a-fail":"talm-test","generated-policy-blocking-b-fail":"talm-test","group-du-sno-config-policy":"ztp-group"}}
      2023-01-24T20:01:17.495Z    INFO    controllers.ManagedClusterForCGU    WARN: No child policies found for cluster    {"Name": "local-cluster"}
      2023-01-24T20:01:17.496Z    INFO    controllers.ManagedClusterForCGU    Reconciling managedCluster to create clusterGroupUpgrade    {"Request.Name": "cnfdf34"}
      2023-01-24T20:01:17.496Z    INFO    controllers.ManagedClusterForCGU    ZTP for the cluster has completed. ztp-done label found.    {"Name": "cnfdf34"}
      2023-01-24T20:01:17.499Z    INFO    controllers.ClusterGroupUpgrade    [getClustersNonCompliantWithPolicy]    {"policy: ": "generated-policy-blocking-b-fail", "clusters: ": ["cnfdf34"]}
      2023-01-24T20:01:17.499Z    INFO    controllers.ClusterGroupUpgrade    Remediation plan    {"remediatePlan": [["cnfdf34"]]}
      2023-01-24T20:01:17.499Z    INFO    controllers.ClusterGroupUpgrade    Finish reconciling CGU    {"name": "talm-test/generated-cgu-blocking-b-fail", "requeueRightAway": false}
      panic: assignment to entry in nil map
      
      goroutine 617 [running]:
      github.com/openshift-kni/cluster-group-upgrades-operator/controllers.(*ClusterGroupUpgradeReconciler).copyManagedInformPolicy(0xc000524040, {0x18cbaf8, 0xc000839410}, 0xc0009aa700, 0xc0002e8210)
          /workspace/controllers/clustergroupupgrade_controller.go:907 +0x4c8
      github.com/openshift-kni/cluster-group-upgrades-operator/controllers.(*ClusterGroupUpgradeReconciler).reconcileResources(0xc000524040, {0x18cbaf8, 0xc000839410}, 0xc0009aa700, {0xc0002e83b0, 0x1, 0xc0002e83b0})
          /workspace/controllers/clustergroupupgrade_controller.go:1432 +0x9a
      github.com/openshift-kni/cluster-group-upgrades-operator/controllers.(*ClusterGroupUpgradeReconciler).Reconcile(0xc000524040, {0x18cbaf8, 0xc000839410}, {{{0xc000c97280, 0x9}, {0xc0004b1b00, 0x1d}}})
          /workspace/controllers/clustergroupupgrade_controller.go:212 +0x19db
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0002c1e00, {0x18cba50, 0xc00016e080}, {0x154bd20, 0xc00032ad00})
          /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298 +0x303
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0002c1e00, {0x18cba50, 0xc00016e080})
          /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 +0x205
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
          /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 +0x85
      created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
          /workspace/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210 +0x354
      

            jche@redhat.com Jun Chen
            rhn-support-yliu1 Yang Liu
            Yang Liu Yang Liu