-
Bug
-
Resolution: Duplicate
-
Critical
-
None
-
4.14
-
None
-
Critical
-
No
-
Approved
-
False
-
Description of problem:
While debugging OCPBUGS-18389, we found many errors related to files in mounted volumes, such as:
1. 2023-09-12T06:19:29.972173907Z 2023-09-12T06:19:29Z [error] Multus: [node-density/node-density-2787/ee2d38a7-d1bb-439f-84da-8827f95a6ce6]: have you checked that your default network is ready? still waiting for readinessindicatorfile @ /host/run/multus/cni/net.d/80-openshift-network.conf. pollimmediate error: timed out waiting for the condition
2. 2023-09-12T06:19:31.447112932Z E0912 06:19:31.447093 1994 token_source.go:180] Unable to rotate token: failed to read token file "/var/run/secrets/kubernetes.io/serviceaccount/token": open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
These errors cause CNI_ADD to fail for pods, which increases pod creation latency. The symptom only occurs on clusters that use OpenShiftSDN as the CNI plugin. The must-gather is linked below.
OCP Version | Flexy Id | Scale Ci Job | Grafana URL | Cloud | Arch Type | Network Type | Worker Count | PODS_PER_NODE | Avg Pod Ready (ms) | P99 Pod Ready (ms) | Must-gather |
4.14.0-0.nightly-2023-09-02-132842 | 231558 | 291 | 62404e34-672e-4168-b4cc-0bd575768aad | aws | amd64 | SDN | 24 | 245 | 58725 | 294279 | https://drive.google.com/file/d/1BbVeNrWzVdogFhYihNfv-99_q8oj6eCN/view?usp=drive_link |
Version-Release number of selected component (if applicable):
4.14.0-0.nightly-2023-09-02-132842
How reproducible:
This issue occurred while running the pod density test. The symptom also appears when deploying many (>50) pods to a single node.
Steps to Reproduce:
1. Create an OCP cluster.
2. Leave one worker node schedulable and cordon the other worker nodes.
3. kubectl create deployment my-dep --image=quay.io/jitesoft/nginx --replicas=50
4. Check the log of the multus pod on that node.
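The steps above could be scripted roughly as follows. This is only a sketch: the function name `repro_commands` and the node names in the usage note are hypothetical, and the `openshift-multus` namespace used to locate the multus pod is an assumption that this report does not state. The function only builds the kubectl argument lists; it does not run them.

```python
def repro_commands(worker_nodes, keep_node, replicas=50):
    """Build kubectl commands for the reproduction steps as argument lists.

    worker_nodes: names of all worker nodes in the cluster.
    keep_node:    the one worker that stays schedulable.
    """
    if keep_node not in worker_nodes:
        raise ValueError(f"{keep_node!r} is not in worker_nodes")

    # Step 2: cordon every worker except the one under test.
    commands = [["kubectl", "cordon", node]
                for node in worker_nodes if node != keep_node]

    # Step 3: create the deployment exactly as given in the report.
    commands.append([
        "kubectl", "create", "deployment", "my-dep",
        "--image=quay.io/jitesoft/nginx", f"--replicas={replicas}",
    ])

    # Step 4: find the multus pod on the remaining node so its log can be
    # inspected. The namespace is an assumption, not taken from the report.
    commands.append([
        "kubectl", "get", "pods", "-n", "openshift-multus", "-o", "wide",
        "--field-selector", f"spec.nodeName={keep_node}",
    ])
    return commands
```

For example, `repro_commands(["worker-a", "worker-b"], "worker-a")` returns a cordon command for `worker-b`, the deployment command, and the pod lookup. Each list can be passed to `subprocess.run(cmd, check=True)` against a test cluster.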
Actual results:
The log contains many errors like:
1. 2023-09-12T06:19:29.972173907Z 2023-09-12T06:19:29Z [error] Multus: [node-density/node-density-2787/ee2d38a7-d1bb-439f-84da-8827f95a6ce6]: have you checked that your default network is ready? still waiting for readinessindicatorfile @ /host/run/multus/cni/net.d/80-openshift-network.conf. pollimmediate error: timed out waiting for the condition
2. 2023-09-12T06:19:31.447112932Z E0912 06:19:31.447093 1994 token_source.go:180] Unable to rotate token: failed to read token file "/var/run/secrets/kubernetes.io/serviceaccount/token": open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
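To gauge how often each error occurs in a saved multus log, a small counter along these lines may help. It is a minimal sketch: the regular expressions are derived only from the two messages quoted above, and the names `count_errors`, `readiness_indicator_timeout`, and `token_file_missing` are hypothetical labels chosen for illustration.

```python
import re
from collections import Counter

# Patterns derived from the two error messages quoted in this report.
READINESS_RE = re.compile(
    r"still waiting for readinessindicatorfile @ (\S+)\. pollimmediate error"
)
TOKEN_RE = re.compile(
    r'Unable to rotate token: failed to read token file "([^"]+)"'
)


def count_errors(lines):
    """Count occurrences of the two error types in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        if READINESS_RE.search(line):
            counts["readiness_indicator_timeout"] += 1
        elif TOKEN_RE.search(line):
            counts["token_file_missing"] += 1
    return counts
```

It accepts any iterable of strings, so an open log file can be passed directly, for example `count_errors(open("multus.log"))`, where `multus.log` is a hypothetical file holding the saved pod log.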
Expected results:
Additional info:
- blocks
-
OCPBUGS-18995 SDN: 4.14 after ec4 has a higher pod ready latency compared to 4.13.10
- Closed
-
OCPBUGS-19642 SDN: 4.14 after ec4 has a higher pod ready latency compared to 4.13.10
- Closed
- links to