- Bug
- Resolution: Done
- Major
- ACM 2.8.4
- 2
- False
- None
- False
- GRC Sprint 2023-21
- Important
- No
Description of problem:
Deploying an SNO spoke with OCP 4.14.3 and the Telco DU profile fails when the hub cluster runs OCP 4.14.z and ACM 2.8.4.
The config-policy-controller pod on the spoke is in CrashLoopBackOff status.
This has been seen in multiple test environments.
The pod issue does not occur if ACM 2.9 is on the hub instead.
Version-Release number of selected component (if applicable):
OCP (hub and spoke) 4.14.3
ACM 2.8.4-xx
How reproducible:
Always
Steps to Reproduce:
- Install hub with OCP 4.14.3 and ACM 2.8.4
- Deploy spoke with OCP 4.14.3
- Check deployment status and check config-policy-controller pod and logs on spoke.
Actual results:
The spoke does not finish deployment (policy violations), and the config-policy-controller pod on the spoke is in CrashLoopBackOff status.
Expected results:
The spoke deploys successfully, and the config-policy-controller pod on the spoke runs properly.
Additional info:
[kni@registry.ran-vcl01 dgonyier]$ oc get pods -A | grep -vE 'NAME|Comp|Runn'
open-cluster-management-agent-addon config-policy-controller-86dcccd679-z2xh5 1/2 CrashLoopBackOff 41 (2m43s ago) 3h13m

[kni@registry.ran-vcl01 dgonyier]$ oc logs -n open-cluster-management-agent-addon config-policy-controller-86dcccd679-z2xh5
2023-11-21T21:52:24.349Z info setup app/main.go:61 Using {"OperatorVersion": "0.0.1", "GoVersion": "go1.20.10 X:strictfipsruntime", "GOOS": "linux", "GOARCH": "amd64"}
2023-11-21T21:52:26.892Z info controller-runtime.metrics metrics/listener.go:44 Metrics server is starting to listen {"addr": "localhost:8383"}
2023-11-21T21:52:26.894Z info setup app/main.go:337 The managed cluster supports dry run API requests
2023-11-21T21:52:26.895Z info setup app/main.go:379 Periodically processing Configuration Policies {"frequency": 10}
2023-11-21T21:52:26.895Z info setup app/main.go:400 Starting lease controller to report status
2023-11-21T21:52:26.895Z info configuration-policy-controller controllers/configurationpolicy_controller.go:171 Waiting for leader election before periodically evaluating configuration policies
2023-11-21T21:52:26.897Z info setup app/main.go:420 Starting managers
2023-11-21T21:52:26.897Z info manager/internal.go:369 Starting server {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8383"}
2023-11-21T21:52:26.897Z info manager/internal.go:369 Starting server {"kind": "health probe", "addr": "[::]:8081"}
2023-11-21T21:52:26.897Z info controller/controller.go:186 Starting EventSource {"controller": "NamespaceSelector", "controllerGroup": "", "controllerKind": "Namespace", "source": "kind source: *v1.Namespace"}
2023-11-21T21:52:26.897Z info controller/controller.go:186 Starting EventSource {"controller": "NamespaceSelector", "controllerGroup": "", "controllerKind": "Namespace", "source": "kind source: *v1.Namespace"}
2023-11-21T21:52:26.897Z info controller/controller.go:194 Starting Controller {"controller": "NamespaceSelector", "controllerGroup": "", "controllerKind": "Namespace"}
2023-11-21T21:52:26.897Z info controller/controller.go:186 Starting EventSource {"controller": "configuration-policy-controller", "controllerGroup": "policy.open-cluster-management.io", "controllerKind": "ConfigurationPolicy", "source": "kind source: *v1.ConfigurationPolicy"}
2023-11-21T21:52:26.897Z info controller/controller.go:194 Starting Controller {"controller": "configuration-policy-controller", "controllerGroup": "policy.open-cluster-management.io", "controllerKind": "ConfigurationPolicy"}
2023-11-21T21:52:26.999Z info controller/controller.go:228 Starting workers {"controller": "configuration-policy-controller", "controllerGroup": "policy.open-cluster-management.io", "controllerKind": "ConfigurationPolicy", "worker count": 1}
2023-11-21T21:52:26.999Z info controller/controller.go:228 Starting workers {"controller": "NamespaceSelector", "controllerGroup": "", "controllerKind": "Namespace", "worker count": 1}
2023-11-21T21:52:31.220Z info configuration-policy-controller controllers/configurationpolicy_controller.go:2618 Ignoring an update to the object status {"policy": "cnfde4-group-du-sno-validator-du-policy-config-g6954", "name": "master", "namespace": "", "resource": "machineconfigpools", "key": "status"}
2023-11-21T21:52:31.230Z info configuration-policy-controller controllers/configurationpolicy_controller.go:1523 The object template does not specify a name. Setting the remediation action to inform. {"policy": "cnfde4-group-du-sno-validator-du-policy-config-g6954", "index": 1, "objectNamespace": "openshift-sriov-network-operator", "oldRemediationAction": "enforce"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2380072]

goroutine 974 [running]:
k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.(*Unstructured).GetUID(...)
    /remote-source/deps/gomod/pkg/mod/k8s.io/apimachinery@v0.27.1/pkg/apis/meta/v1/unstructured/unstructured.go:270
open-cluster-management.io/config-policy-controller/controllers.(*ConfigurationPolicyReconciler).alreadyEvaluated(0x2b0c4e8?, 0xc0003f7880?, 0x0)
    /remote-source/app/controllers/configurationpolicy_controller.go:2781 +0xb2
open-cluster-management.io/config-policy-controller/controllers.(*ConfigurationPolicyReconciler).handleSingleObj(0xdb26b0?, {0xc00017fe00, {{0xc000927dc0, 0x19}, {0xc000907b52, 0x2}, {0xc000cd0f00, 0x16}}, 0x0, {0xc0006922a0, ...}, ...}, ...)
    /remote-source/app/controllers/configurationpolicy_controller.go:1748 +0x765
open-cluster-management.io/config-policy-controller/controllers.(*ConfigurationPolicyReconciler).handleObjects(0xc00019fb00, 0xc0002d9ea0, {0xc00012d280, 0x20}, {{0xc000d38378, 0x15}, {0x0, 0x0}, {0xc00012d280, 0x20}, ...}, ...)
    /remote-source/app/controllers/configurationpolicy_controller.go:1555 +0xf25
open-cluster-management.io/config-policy-controller/controllers.(*ConfigurationPolicyReconciler).handleObjectTemplates(_, {{{0x23d53a2, 0x13}, {0xc000d07bc0, 0x24}}, {{0xc0000566c0, 0x34}, {0x0, 0x0}, {0xc00092cd56, ...}, ...}, ...})
    /remote-source/app/controllers/configurationpolicy_controller.go:1159 +0x2c11
open-cluster-management.io/config-policy-controller/controllers.(*ConfigurationPolicyReconciler).handlePolicyWorker(0x1d12960?, 0xc0007695a0?, 0xc000c66fb8?)
    /remote-source/app/controllers/configurationpolicy_controller.go:301 +0x1e5
created by open-cluster-management.io/config-policy-controller/controllers.(*ConfigurationPolicyReconciler).PeriodicallyExecConfigPolicies
    /remote-source/app/controllers/configurationpolicy_controller.go:243 +0x76c

[kni@registry.ran-vcl01 dgonyier]$ source acm-operator-bundle "registry-proxy.engineering.redhat.com/rh-osbs/rhacm2-acm-operator-bundle:v2.8.4-8"
[kni@registry.ran-vcl01 dgonyier]$ source talm-bundle-current https://access.redhat.com/containers/#/registry.access.redhat.com/openshift4/topology-aware-lifecycle-manager-operator-bundle-container-rhel8/images/v4.14.1-6
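In the trace, the panic is raised in GetUID, reached from alreadyEvaluated, whose last argument is printed as 0x0. That pattern is consistent with a method being called on a nil object pointer. The Go sketch below reproduces only that general failure mode. It makes several assumptions: the Unstructured type is a simplified stand-in, not the real k8s.io/apimachinery type, and safeUID and uidPanics are hypothetical helpers. It is not the controller's actual code, nor the fix that was shipped.

```go
package main

import "fmt"

// Unstructured is a simplified, hypothetical stand-in for the
// k8s.io/apimachinery unstructured.Unstructured type; it is not the real one.
type Unstructured struct {
	Object map[string]interface{}
}

// GetUID reads metadata.uid. Because it accesses a field through the
// receiver, calling it on a nil *Unstructured triggers a nil pointer
// dereference panic, as in the reported trace.
func (u *Unstructured) GetUID() string {
	meta, _ := u.Object["metadata"].(map[string]interface{})
	uid, _ := meta["uid"].(string)
	return uid
}

// safeUID is a hypothetical guard: it returns "" when no object is
// available instead of dereferencing a nil pointer.
func safeUID(obj *Unstructured) string {
	if obj == nil {
		return ""
	}
	return obj.GetUID()
}

// uidPanics reports whether calling GetUID directly on obj panics.
func uidPanics(obj *Unstructured) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_ = obj.GetUID()
	return false
}

func main() {
	var missing *Unstructured // models an object that was not found
	existing := &Unstructured{Object: map[string]interface{}{
		"metadata": map[string]interface{}{"uid": "abc-123"},
	}}

	fmt.Println("unguarded call on nil panics:", uidPanics(missing))
	fmt.Println("guarded call on nil returns empty:", safeUID(missing) == "")
	fmt.Println("guarded call on existing object:", safeUID(existing))
}
```

A guard of this kind only avoids the crash. Whether an empty UID should be treated as "not yet evaluated" is a separate decision that depends on the controller's logic.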
- is cloned by
- ACM-8732 [2.9] SIGSEGV error code in config-policy-controller pod on spoke deployed from hub with OCP 4.14.3, ACM 2.8.4
- Closed