- Bug
- Resolution: Done-Errata
- Major
- 4.13.z, 4.12, 4.14.0
- Quality / Stability / Reliability
- Moderate
- Approved
- SDN Sprint 241, SDN Sprint 242
Description of problem:
A pod sometimes does not work as expected on an OVN network cluster when it has the same name as a previous pod.
Version-Release number of selected component (if applicable):
4.12.0-0.nightly-2023-09-05-064152
How reproducible:
Always, although it may take several attempts.
Steps to Reproduce:
1. Create a machineset
liuhuali@Lius-MacBook-Pro huali-test % oc create -f ms1.yaml 
machineset.machine.openshift.io/huliu-nu96a-zn7mc-workera created
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                              PHASE     TYPE   REGION    ZONE              AGE
huliu-nu96a-zn7mc-master-0        Running   AHV    Unnamed   Development-LTS   6h14m
huliu-nu96a-zn7mc-master-1        Running   AHV    Unnamed   Development-LTS   6h14m
huliu-nu96a-zn7mc-master-2        Running   AHV    Unnamed   Development-LTS   6h14m
huliu-nu96a-zn7mc-worker-5j47v    Running   AHV    Unnamed   Development-LTS   6h9m
huliu-nu96a-zn7mc-worker-thprs    Running   AHV    Unnamed   Development-LTS   6h9m
huliu-nu96a-zn7mc-workera-x54mr   Running   AHV    Unnamed   Development-LTS   6m50s
liuhuali@Lius-MacBook-Pro huali-test % oc get node                                          
NAME                              STATUS   ROLES                  AGE     VERSION
huliu-nu96a-zn7mc-master-0        Ready    control-plane,master   6h12m   v1.25.12+26bab08
huliu-nu96a-zn7mc-master-1        Ready    control-plane,master   6h12m   v1.25.12+26bab08
huliu-nu96a-zn7mc-master-2        Ready    control-plane,master   6h12m   v1.25.12+26bab08
huliu-nu96a-zn7mc-worker-5j47v    Ready    worker                 6h      v1.25.12+26bab08
huliu-nu96a-zn7mc-worker-thprs    Ready    worker                 6h      v1.25.12+26bab08
huliu-nu96a-zn7mc-workera-x54mr   Ready    worker                 3m7s    v1.25.12+26bab08 
2. Create a pod on the new node
liuhuali@Lius-MacBook-Pro huali-test % oc create -f kubelet-killer2.yaml
pod/kubelet-killer created
liuhuali@Lius-MacBook-Pro huali-test % cat kubelet-killer2.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    kubelet-killer: ""
  name: kubelet-killer
  namespace: openshift-machine-api
spec:
  containers:
  - command:
    - pkill
    - -STOP
    - kubelet
    image: quay.io/openshifttest/base-alpine@sha256:3126e4eed4a3ebd8bf972b2453fa838200988ee07c01b2251e3ea47e4b1f245c
    imagePullPolicy: Always
    name: kubelet-killer
    securityContext:
      privileged: true
  enableServiceLinks: true
  hostPID: true
  nodeName: huliu-nu96a-zn7mc-workera-x54mr
  restartPolicy: Never
liuhuali@Lius-MacBook-Pro huali-test % 
3. The pod worked as expected (the new node became NotReady)
liuhuali@Lius-MacBook-Pro huali-test % oc get node   
NAME                              STATUS     ROLES                  AGE     VERSION
huliu-nu96a-zn7mc-master-0        Ready      control-plane,master   6h13m   v1.25.12+26bab08
huliu-nu96a-zn7mc-master-1        Ready      control-plane,master   6h14m   v1.25.12+26bab08
huliu-nu96a-zn7mc-master-2        Ready      control-plane,master   6h13m   v1.25.12+26bab08
huliu-nu96a-zn7mc-worker-5j47v    Ready      worker                 6h2m    v1.25.12+26bab08
huliu-nu96a-zn7mc-worker-thprs    Ready      worker                 6h2m    v1.25.12+26bab08
huliu-nu96a-zn7mc-workera-x54mr   NotReady   worker                 4m43s   v1.25.12+26bab08
liuhuali@Lius-MacBook-Pro huali-test % oc describe pod kubelet-killer  
Name:         kubelet-killer
Namespace:    openshift-machine-api
Priority:     0
Node:         huliu-nu96a-zn7mc-workera-x54mr/10.0.132.101
Start Time:   Wed, 06 Sep 2023 15:33:43 +0800
Labels:       kubelet-killer=
Annotations:  k8s.ovn.org/pod-networks:
                {"default":{"ip_addresses":["10.130.8.7/23"],"mac_address":"0a:58:0a:82:08:07","gateway_ips":["10.130.8.1"],"ip_address":"10.130.8.7/23","...
              k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "ovn-kubernetes",
                    "interface": "eth0",
                    "ips": [
                        "10.130.8.7"
                    ],
                    "mac": "0a:58:0a:82:08:07",
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "ovn-kubernetes",
                    "interface": "eth0",
                    "ips": [
                        "10.130.8.7"
                    ],
                    "mac": "0a:58:0a:82:08:07",
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: privileged
Status:       Pending
IP:           
IPs:          <none>
Containers:
  kubelet-killer:
    Container ID:  
    Image:         quay.io/openshifttest/base-alpine@sha256:3126e4eed4a3ebd8bf972b2453fa838200988ee07c01b2251e3ea47e4b1f245c
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      pkill
      -STOP
      kubelet
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nm9vd (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-nm9vd:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age   From     Message
  ----    ------          ----  ----     -------
  Normal  AddedInterface  90s   multus   Add eth0 [10.130.8.7/23] from ovn-kubernetes
  Normal  Pulling         90s   kubelet  Pulling image "quay.io/openshifttest/base-alpine@sha256:3126e4eed4a3ebd8bf972b2453fa838200988ee07c01b2251e3ea47e4b1f245c"
  Normal  Pulled          87s   kubelet  Successfully pulled image "quay.io/openshifttest/base-alpine@sha256:3126e4eed4a3ebd8bf972b2453fa838200988ee07c01b2251e3ea47e4b1f245c" in 2.310348601s (2.310355399s including waiting)
  Normal  Created         87s   kubelet  Created container kubelet-killer
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                              PHASE     TYPE   REGION    ZONE              AGE
huliu-nu96a-zn7mc-master-0        Running   AHV    Unnamed   Development-LTS   6h17m
huliu-nu96a-zn7mc-master-1        Running   AHV    Unnamed   Development-LTS   6h17m
huliu-nu96a-zn7mc-master-2        Running   AHV    Unnamed   Development-LTS   6h17m
huliu-nu96a-zn7mc-worker-5j47v    Running   AHV    Unnamed   Development-LTS   6h11m
huliu-nu96a-zn7mc-worker-thprs    Running   AHV    Unnamed   Development-LTS   6h11m
huliu-nu96a-zn7mc-workera-x54mr   Running   AHV    Unnamed   Development-LTS   9m5s
liuhuali@Lius-MacBook-Pro huali-test % oc get pod
NAME                                                  READY   STATUS              RESTARTS   AGE
cluster-autoscaler-operator-854c6755f5-r9c2k          2/2     Running             0          5h41m
cluster-baremetal-operator-976487bc9-7czpk            2/2     Running             0          5h41m
control-plane-machine-set-operator-69684bcccd-c6jnf   1/1     Running             0          5h41m
kubelet-killer                                        0/1     ContainerCreating   0          98s
machine-api-controllers-7f574b69b5-w5swt              7/7     Running             0          155m
machine-api-operator-7f46db4fcc-v6w9p                 2/2     Running             0          5h41m
4. Repeat the test: delete the old machine and let the machineset create a new one
liuhuali@Lius-MacBook-Pro huali-test % oc delete machine huliu-nu96a-zn7mc-workera-x54mr
machine.machine.openshift.io "huliu-nu96a-zn7mc-workera-x54mr" deleted
liuhuali@Lius-MacBook-Pro huali-test % oc get pod
NAME                                                  READY   STATUS        RESTARTS   AGE
cluster-autoscaler-operator-854c6755f5-r9c2k          2/2     Running       0          5h42m
cluster-baremetal-operator-976487bc9-7czpk            2/2     Running       0          5h42m
control-plane-machine-set-operator-69684bcccd-c6jnf   1/1     Running       0          5h42m
kubelet-killer                                        0/1     Terminating   0          2m28s
machine-api-controllers-7f574b69b5-w5swt              7/7     Running       0          156m
machine-api-operator-7f46db4fcc-v6w9p                 2/2     Running       0          5h42m
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                              PHASE          TYPE   REGION    ZONE              AGE
huliu-nu96a-zn7mc-master-0        Running        AHV    Unnamed   Development-LTS   6h18m
huliu-nu96a-zn7mc-master-1        Running        AHV    Unnamed   Development-LTS   6h18m
huliu-nu96a-zn7mc-master-2        Running        AHV    Unnamed   Development-LTS   6h18m
huliu-nu96a-zn7mc-worker-5j47v    Running        AHV    Unnamed   Development-LTS   6h12m
huliu-nu96a-zn7mc-worker-thprs    Running        AHV    Unnamed   Development-LTS   6h12m
huliu-nu96a-zn7mc-workera-t8dj2   Provisioning                                      27s
liuhuali@Lius-MacBook-Pro huali-test % oc get pod                                       
NAME                                                  READY   STATUS    RESTARTS   AGE
cluster-autoscaler-operator-854c6755f5-r9c2k          2/2     Running   0          5h44m
cluster-baremetal-operator-976487bc9-7czpk            2/2     Running   0          5h44m
control-plane-machine-set-operator-69684bcccd-c6jnf   1/1     Running   0          5h44m
machine-api-controllers-7f574b69b5-w5swt              7/7     Running   0          158m
machine-api-operator-7f46db4fcc-v6w9p                 2/2     Running   0          5h44m
liuhuali@Lius-MacBook-Pro huali-test % oc get machine                                        
NAME                              PHASE     TYPE   REGION    ZONE              AGE
huliu-nu96a-zn7mc-master-0        Running   AHV    Unnamed   Development-LTS   6h27m
huliu-nu96a-zn7mc-master-1        Running   AHV    Unnamed   Development-LTS   6h27m
huliu-nu96a-zn7mc-master-2        Running   AHV    Unnamed   Development-LTS   6h27m
huliu-nu96a-zn7mc-worker-5j47v    Running   AHV    Unnamed   Development-LTS   6h21m
huliu-nu96a-zn7mc-worker-thprs    Running   AHV    Unnamed   Development-LTS   6h21m
huliu-nu96a-zn7mc-workera-t8dj2   Running   AHV    Unnamed   Development-LTS   9m46s
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME                              STATUS   ROLES                  AGE     VERSION
huliu-nu96a-zn7mc-master-0        Ready    control-plane,master   6h24m   v1.25.12+26bab08
huliu-nu96a-zn7mc-master-1        Ready    control-plane,master   6h25m   v1.25.12+26bab08
huliu-nu96a-zn7mc-master-2        Ready    control-plane,master   6h24m   v1.25.12+26bab08
huliu-nu96a-zn7mc-worker-5j47v    Ready    worker                 6h13m   v1.25.12+26bab08
huliu-nu96a-zn7mc-worker-thprs    Ready    worker                 6h13m   v1.25.12+26bab08
huliu-nu96a-zn7mc-workera-t8dj2   Ready    worker                 6m      v1.25.12+26bab08
5. Create a pod with the same name as the previous one (here, kubelet-killer) on the new node
liuhuali@Lius-MacBook-Pro huali-test % oc create -f kubelet-killer2.yaml
pod/kubelet-killer created
liuhuali@Lius-MacBook-Pro huali-test % cat kubelet-killer2.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    kubelet-killer: ""
  name: kubelet-killer
  namespace: openshift-machine-api
spec:
  containers:
  - command:
    - pkill
    - -STOP
    - kubelet
    image: quay.io/openshifttest/base-alpine@sha256:3126e4eed4a3ebd8bf972b2453fa838200988ee07c01b2251e3ea47e4b1f245c
    imagePullPolicy: Always
    name: kubelet-killer
    securityContext:
      privileged: true
  enableServiceLinks: true
  hostPID: true
  nodeName: huliu-nu96a-zn7mc-workera-t8dj2
  restartPolicy: Never
6. Check the pod: it does not work as expected.
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                              PHASE     TYPE   REGION    ZONE              AGE
huliu-nu96a-zn7mc-master-0        Running   AHV    Unnamed   Development-LTS   6h35m
huliu-nu96a-zn7mc-master-1        Running   AHV    Unnamed   Development-LTS   6h35m
huliu-nu96a-zn7mc-master-2        Running   AHV    Unnamed   Development-LTS   6h35m
huliu-nu96a-zn7mc-worker-5j47v    Running   AHV    Unnamed   Development-LTS   6h29m
huliu-nu96a-zn7mc-worker-thprs    Running   AHV    Unnamed   Development-LTS   6h29m
huliu-nu96a-zn7mc-workera-t8dj2   Running   AHV    Unnamed   Development-LTS   17m
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME                              STATUS   ROLES                  AGE     VERSION
huliu-nu96a-zn7mc-master-0        Ready    control-plane,master   6h32m   v1.25.12+26bab08
huliu-nu96a-zn7mc-master-1        Ready    control-plane,master   6h33m   v1.25.12+26bab08
huliu-nu96a-zn7mc-master-2        Ready    control-plane,master   6h32m   v1.25.12+26bab08
huliu-nu96a-zn7mc-worker-5j47v    Ready    worker                 6h21m   v1.25.12+26bab08
huliu-nu96a-zn7mc-worker-thprs    Ready    worker                 6h21m   v1.25.12+26bab08
huliu-nu96a-zn7mc-workera-t8dj2   Ready    worker                 14m     v1.25.12+26bab08
liuhuali@Lius-MacBook-Pro huali-test % oc get pod
NAME                                                  READY   STATUS              RESTARTS   AGE
cluster-autoscaler-operator-854c6755f5-r9c2k          2/2     Running             0          6h
cluster-baremetal-operator-976487bc9-7czpk            2/2     Running             0          6h
control-plane-machine-set-operator-69684bcccd-c6jnf   1/1     Running             0          6h
kubelet-killer                                        0/1     ContainerCreating   0          7m18s
machine-api-controllers-7f574b69b5-w5swt              7/7     Running             0          174m
machine-api-operator-7f46db4fcc-v6w9p                 2/2     Running             0          6h
liuhuali@Lius-MacBook-Pro huali-test % oc describe pod kubelet-killer  
Name:         kubelet-killer
Namespace:    openshift-machine-api
Priority:     0
Node:         huliu-nu96a-zn7mc-workera-t8dj2/10.0.132.67
Start Time:   Wed, 06 Sep 2023 15:46:29 +0800
Labels:       kubelet-killer=
Annotations:  openshift.io/scc: node-exporter
Status:       Pending
IP:           
IPs:          <none>
Containers:
  kubelet-killer:
    Container ID:  
    Image:         quay.io/openshifttest/base-alpine@sha256:3126e4eed4a3ebd8bf972b2453fa838200988ee07c01b2251e3ea47e4b1f245c
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      pkill
      -STOP
      kubelet
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dcq5h (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-dcq5h:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age    From          Message
  ----     ------                  ----   ----          -------
  Warning  ErrorAddingLogicalPort  7m30s  controlplane  deleteLogicalPort failed for pod openshift-machine-api_kubelet-killer: cannot delete GR SNAT for pod openshift-machine-api/kubelet-killer: failed create operation for deleting SNAT rule for pod on gateway router GR_huliu-nu96a-zn7mc-workera-x54mr: unable to get NAT entries for router &{UUID: Copp:<nil> Enabled:<nil> ExternalIDs:map[] LoadBalancer:[] LoadBalancerGroup:[] Name:GR_huliu-nu96a-zn7mc-workera-x54mr Nat:[] Options:map[] Policies:[] Ports:[] StaticRoutes:[]}: failed to get router: GR_huliu-nu96a-zn7mc-workera-x54mr, error: object not found
  Warning  FailedCreatePodSandBox  5m29s  kubelet       Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_kubelet-killer_openshift-machine-api_84edbe26-680b-4c50-a8a4-71ffb82b8d9c_0(c1671822d85747016e7a619891ff5981b470c268f478a761de485f3ae3a0f2ef): error adding pod openshift-machine-api_kubelet-killer to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [openshift-machine-api/kubelet-killer/84edbe26-680b-4c50-a8a4-71ffb82b8d9c:ovn-kubernetes]: error adding container to network "ovn-kubernetes": CNI request failed with status 400: '[openshift-machine-api/kubelet-killer c1671822d85747016e7a619891ff5981b470c268f478a761de485f3ae3a0f2ef] [openshift-machine-api/kubelet-killer c1671822d85747016e7a619891ff5981b470c268f478a761de485f3ae3a0f2ef] failed to get pod annotation: timed out waiting for annotations: context deadline exceeded
'
  Warning  FailedCreatePodSandBox  3m17s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_kubelet-killer_openshift-machine-api_84edbe26-680b-4c50-a8a4-71ffb82b8d9c_0(dced805c3e86acbf5a10a8b4efbc02c64ad3c9360e23885c4fe593ca198f43b0): error adding pod openshift-machine-api_kubelet-killer to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [openshift-machine-api/kubelet-killer/84edbe26-680b-4c50-a8a4-71ffb82b8d9c:ovn-kubernetes]: error adding container to network "ovn-kubernetes": CNI request failed with status 400: '[openshift-machine-api/kubelet-killer dced805c3e86acbf5a10a8b4efbc02c64ad3c9360e23885c4fe593ca198f43b0] [openshift-machine-api/kubelet-killer dced805c3e86acbf5a10a8b4efbc02c64ad3c9360e23885c4fe593ca198f43b0] failed to get pod annotation: timed out waiting for annotations: context deadline exceeded
'
  Warning  FailedCreatePodSandBox  65s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_kubelet-killer_openshift-machine-api_84edbe26-680b-4c50-a8a4-71ffb82b8d9c_0(4bbf45588909933b9c4086274a08b7cddc2e09fe47e740ee14c74523f4f21ef2): error adding pod openshift-machine-api_kubelet-killer to CNI network "multus-cni-network": plugin type="multus" name="multus-cni-network" failed (add): [openshift-machine-api/kubelet-killer/84edbe26-680b-4c50-a8a4-71ffb82b8d9c:ovn-kubernetes]: error adding container to network "ovn-kubernetes": CNI request failed with status 400: '[openshift-machine-api/kubelet-killer 4bbf45588909933b9c4086274a08b7cddc2e09fe47e740ee14c74523f4f21ef2] [openshift-machine-api/kubelet-killer 4bbf45588909933b9c4086274a08b7cddc2e09fe47e740ee14c74523f4f21ef2] failed to get pod annotation: timed out waiting for annotations: context deadline exceeded
'
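For comparison, the describe output in Step 3 lists a k8s.ovn.org/pod-networks annotation, while the output above shows only openshift.io/scc. That seems consistent with the "timed out waiting for annotations" CNI error. A minimal shell sketch of that check follows; the function names has_ovn_pod_networks and report are hypothetical, and the two strings are excerpts copied from the Annotations fields of Step 3 and Step 6.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when `oc describe pod` output read from stdin
# contains the k8s.ovn.org/pod-networks annotation.
has_ovn_pod_networks() {
  grep -q 'k8s\.ovn\.org/pod-networks:'
}

# Hypothetical helper: print whether the given describe excerpt ($2)
# carries the OVN annotation, prefixed with a label ($1).
report() {
  if printf '%s\n' "$2" | has_ovn_pod_networks; then
    echo "$1: OVN annotation present"
  else
    echo "$1: OVN annotation missing"
  fi
}

# First line of the Annotations field from Step 3 (working) and Step 6 (stuck).
step3='Annotations:  k8s.ovn.org/pod-networks:'
step6='Annotations:  openshift.io/scc: node-exporter'

report "Step 3" "$step3"
report "Step 6" "$step6"
```

On a live cluster the same helper could presumably be fed directly, for example `oc describe pod kubelet-killer -n openshift-machine-api | has_ovn_pod_networks`.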
The Warning events reference “GR_huliu-nu96a-zn7mc-workera-x54mr”, but huliu-nu96a-zn7mc-workera-x54mr is the previous node; in Step 5 I created the pod on huliu-nu96a-zn7mc-workera-t8dj2.
If the new pod is created with a different name, the issue does not occur.
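The mismatch can be checked mechanically. The following is only a sketch: the helper names gr_node_from_event and event_references_stale_node are hypothetical, and it assumes the gateway router is always named GR_<node name>, as it appears in the event. The sample message is the tail of the ErrorAddingLogicalPort event from Step 6.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the node name embedded in the first
# "GR_<node>" gateway-router reference found in an event message.
gr_node_from_event() {
  printf '%s\n' "$1" | grep -o 'GR_[A-Za-z0-9.-]*' | head -n 1 | sed 's/^GR_//'
}

# Hypothetical helper: succeed (exit 0) when the event references a gateway
# router that belongs to a node other than the one the pod is scheduled on.
event_references_stale_node() {
  local message="$1" pod_node="$2" gr_node
  gr_node="$(gr_node_from_event "$message")"
  [ -n "$gr_node" ] && [ "$gr_node" != "$pod_node" ]
}

# Tail of the ErrorAddingLogicalPort message from Step 6.
msg='failed to get router: GR_huliu-nu96a-zn7mc-workera-x54mr, error: object not found'

# The pod was created on the new node in Step 5.
if event_references_stale_node "$msg" "huliu-nu96a-zn7mc-workera-t8dj2"; then
  echo "stale gateway router reference: $(gr_node_from_event "$msg")"
fi
```

The comparison is purely textual; it does not query the OVN northbound database, so it only confirms what the event message says.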
Actual results:
The pod does not work as expected when it has the same name as a previous pod.
Expected results:
The pod should work as expected even if it has the same name as a previous pod.
Additional info:
The same case worked as expected on an SDN network cluster. Discussion in Slack: https://redhat-internal.slack.com/archives/CH76YSYSC/p1693983428736929
- depends on
 - OCPBUGS-18895 Pod sometimes doesn’t work as expected when it has the same name with previous pods on OVN network cluster (Closed)
- is cloned by
 - OCPBUGS-18672 Pod sometimes doesn’t work as expected when it has the same name with previous pods on OVN network cluster (Closed)
 - OCPBUGS-18895 Pod sometimes doesn’t work as expected when it has the same name with previous pods on OVN network cluster (Closed)
- is depended on by
 - OCPBUGS-18672 Pod sometimes doesn’t work as expected when it has the same name with previous pods on OVN network cluster (Closed)
- links to
 - RHSA-2023:5006 OpenShift Container Platform 4.14.z security update