Bug
Resolution: Done
Major
ACM 2.10.0, ACM 2.9.0, ACM 2.8.0, ACM 2.7.0
False
False
Submariner Sprint 2024-22, Submariner Sprint 2024-23, Submariner Sprint 2024-24, Submariner Sprint 2024-25, Submariner Sprint 2024-26, Submariner Sprint 2024-27
Moderate
No
Description of problem:
While deploying 3000+ SNOs with ACM and ZTP, the submariner-addon pod was found crashlooping at the conclusion of the test. Reviewing the timeline, the crashlooping appears to have started when the first cluster finished provisioning and became managed. (Perhaps something is wrong with what that pod examines: to reach more than 3000 managed SNOs, the SNOs are provisioned in 500-cluster "steps".)
Version-Release number of selected component (if applicable):
2.7.0-DOWNSTREAM-2023-01-03-01-19-39
OCP 4.11.19 Hub and SNOs
How reproducible:
Steps to Reproduce:
- ...
Actual results:
Expected results:
Additional info:
# oc get po -n open-cluster-management -l app=submariner-addon
NAME                                READY   STATUS             RESTARTS         AGE
submariner-addon-7cdfb67b4c-c9hsw   0/1     CrashLoopBackOff   168 (4m5s ago)   22h
# oc describe po -n open-cluster-management -l app=submariner-addon
Name:         submariner-addon-7cdfb67b4c-c9hsw
Namespace:    open-cluster-management
Priority:     0
Node:         e27-h03-000-r650/fc00:1002::6
Start Time:   Tue, 03 Jan 2023 21:31:51 +0000
Labels:       app=submariner-addon
              pod-template-hash=7cdfb67b4c
Annotations:  alm-examples:
                [{"apiVersion": "operator.open-cluster-management.io/v1", "kind": "MultiClusterHub", "metadata": {"name": "multiclusterhub", "namespace": ...
              capabilities: Seamless Upgrades
              categories: Integration & Delivery
              certified: true
              createdAt: 2023-01-03T15:47:42Z
              description: Advanced provisioning and management of OpenShift and Kubernetes clusters
              k8s.ovn.org/pod-networks:
                {"default":{"ip_addresses":["fd01:0:0:2::47/64"],"mac_address":"0a:58:4b:8a:e9:86","gateway_ips":["fd01:0:0:2::1"],"ip_address":"fd01:0:0:...
              k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "ovn-kubernetes",
                    "interface": "eth0",
                    "ips": [
                        "fd01:0:0:2::47"
                    ],
                    "mac": "0a:58:4b:8a:e9:86",
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "ovn-kubernetes",
                    "interface": "eth0",
                    "ips": [
                        "fd01:0:0:2::47"
                    ],
                    "mac": "0a:58:4b:8a:e9:86",
                    "default": true,
                    "dns": {}
                }]
              olm.operatorGroup: default
              olm.operatorNamespace: open-cluster-management
              olm.skipRange: >=2.6.0 <2.7.0
              olm.targetNamespaces: open-cluster-management
              openshift.io/scc: restricted-v2
              operatorframework.io/initialization-resource:
                {"apiVersion":"operator.open-cluster-management.io/v1", "kind":"MultiClusterHub","metadata":{"name":"multiclusterhub","namespace":"open-cl...
              operatorframework.io/properties:
                {"properties":[{"type":"olm.gvk","value":{"group":"submarineraddon.open-cluster-management.io","kind":"SubmarinerConfig","version":"v1alph...
              operatorframework.io/suggested-namespace: open-cluster-management
              operators.openshift.io/infrastructure-features: ["disconnected", "proxy-aware", "fips"]
              operators.openshift.io/valid-subscription: ["OpenShift Platform Plus", "Red Hat Advanced Cluster Management for Kubernetes"]
              operators.operatorframework.io/internal-objects:
                ["observatoria.core.observatorium.io", "observabilityaddons.observability.open-cluster-management.io"]
              seccomp.security.alpha.kubernetes.io/pod: runtime/default
              support: Red Hat
Status:       Running
IP:           fd01:0:0:2::47
IPs:
  IP:           fd01:0:0:2::47
Controlled By:  ReplicaSet/submariner-addon-7cdfb67b4c
Containers:
  submariner-addon:
    Container ID:   cri-o://dee8846c89430625616784b9912504c61b77ba11da49a4578ab181eb9dcf70d7
    Image:          registry.redhat.io/rhacm2/submariner-addon-rhel8@sha256:79826d86770432e3e548f6400ba2251b5787258028fda68d4c308c0d27ae9a44
    Image ID:       registry.redhat.io/rhacm2/submariner-addon-rhel8@sha256:79826d86770432e3e548f6400ba2251b5787258028fda68d4c308c0d27ae9a44
    Port:           <none>
    Host Port:      <none>
    Args:
      /submariner
      controller
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Wed, 04 Jan 2023 20:11:48 +0000
      Finished:     Wed, 04 Jan 2023 20:14:25 +0000
    Ready:          False
    Restart Count:  168
    Limits:
      memory:  270Mi
    Requests:
      cpu:      100m
      memory:   128Mi
    Liveness:   http-get https://:8443/healthz delay=2s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get https://:8443/healthz delay=2s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:                 submariner-addon-7cdfb67b4c-c9hsw (v1:metadata.name)
      OPERATOR_CONDITION_NAME:  advanced-cluster-management.v2.7.0
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rf5b8 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-rf5b8:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                     From     Message
  ----     ------     ----                    ----     -------
  Warning  Unhealthy  167m                    kubelet  Liveness probe failed: Get "https://[fd01:0:0:2::47]:8443/healthz": EOF
  Warning  Unhealthy  105m (x8 over 15h)      kubelet  Liveness probe failed: Get "https://[fd01:0:0:2::47]:8443/healthz": dial tcp [fd01:0:0:2::47]:8443: connect: connection refused
  Warning  BackOff    2m43s (x4110 over 21h)  kubelet  Back-off restarting failed container
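
The describe output above reports the container's last termination as OOMKilled (exit code 137) against a 270Mi memory limit. As a rough sketch that is not part of the original report, the same check could be scripted against pod JSON such as `oc get po -n open-cluster-management -l app=submariner-addon -o json` emits per item. The function name `find_oom_killed` and the trimmed `SAMPLE_POD` are illustrative assumptions; the sample values are copied from the output above.

```python
import json
from typing import Dict, List


def find_oom_killed(pod: Dict) -> List[Dict]:
    """Return one record per container whose last termination was an OOM kill."""
    # Map container name -> configured memory limit (None if no limit is set).
    limits = {
        c["name"]: c.get("resources", {}).get("limits", {}).get("memory")
        for c in pod.get("spec", {}).get("containers", [])
    }
    hits = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        terminated = status.get("lastState", {}).get("terminated")
        if terminated and terminated.get("reason") == "OOMKilled":
            hits.append({
                "container": status["name"],
                "restarts": status.get("restartCount", 0),
                "exit_code": terminated.get("exitCode"),
                "memory_limit": limits.get(status["name"]),
            })
    return hits


# Hypothetical sample, trimmed to the fields read above.
SAMPLE_POD = json.loads("""
{
  "spec": {"containers": [{
    "name": "submariner-addon",
    "resources": {"limits": {"memory": "270Mi"},
                  "requests": {"cpu": "100m", "memory": "128Mi"}}
  }]},
  "status": {"containerStatuses": [{
    "name": "submariner-addon",
    "restartCount": 168,
    "state": {"waiting": {"reason": "CrashLoopBackOff"}},
    "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
  }]}
}
""")

if __name__ == "__main__":
    for hit in find_oom_killed(SAMPLE_POD):
        print(hit)
```

Run against a live hub, a check like this would separate an OOM-driven crash loop from one caused by a failing probe or an application error, which is useful here because the Events list only shows liveness failures and back-off messages.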
- is related to ACM-2850 Upgrading from ACM 2.6 to 2.7 failed (Closed)