-
Bug
-
Resolution: Won't Do
-
Normal
-
None
-
4.15.z
-
No
-
SDN Sprint 259, SDN Sprint 260, SDN Sprint 261, SDN Sprint 262, SDN Sprint 263, SDN Sprint 264, SDN Sprint 265
-
7
-
False
-
Description of problem:
In a UPI-installed OCP cluster, after recovering from an etcd backup, the ovnkube-node pods located on the lost control plane hosts are in CrashLoopBackOff state:
oc get po -o wide -n openshift-ovn-kubernetes
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ovnkube-control-plane-5cdd496c9b-dbxq2 2/2 Running 0 40m 192.168.1.104 master-2.ocp4.example.com <none> <none>
ovnkube-control-plane-5cdd496c9b-wb5bn 2/2 Running 0 40m 192.168.1.103 master-1.ocp4.example.com <none> <none>
ovnkube-node-4cltd 8/8 Running 1 (32m ago) 32m 192.168.1.106 worker-1.ocp4.example.com <none> <none>
ovnkube-node-6dvlh 8/8 Running 1 (33m ago) 33m 192.168.1.105 worker-0.ocp4.example.com <none> <none>
ovnkube-node-6jvsj 8/8 Running 1 (38m ago) 38m 192.168.1.102 master-0.ocp4.example.com <none> <none>
ovnkube-node-8mvjw 7/8 CrashLoopBackOff 8 (20s ago) 35m 192.168.1.104 master-2.ocp4.example.com <none> <none>
ovnkube-node-z2kv8 7/8 CrashLoopBackOff 12 (2m1s ago) 36m 192.168.1.103 master-1.ocp4.example.com <none> <none>
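For reference, the crash-looping container's previous logs can be pulled with oc logs. The container name below (ovnkube-controller) is an assumption based on the 4.15 ovnkube-node pod layout:

# fetch the previous (crashed) run's logs from the assumed ovnkube-controller container
oc logs -n openshift-ovn-kubernetes ovnkube-node-8mvjw -c ovnkube-controller --previous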
Logs from the ovnkube-node-8mvjw pod:
2024-06-13T03:25:14.487Z|00218|ovsdb_idl|WARN|transaction error:
{"details":"Transaction causes multiple rows in \"Encap\" table to have identical values (geneve and \"192.168.1.104\") for index on columns \"type\" and \"ip\". First row, with UUID 738f583e-5dde-4ed8-a447-cd9af95a0d53, existed in the database before this transaction and was not modified by the transaction. Second row, with UUID 4e73bcf3-49b8-44ab-be82-d8c3a78da4bb, was inserted by this transaction.","error":"constraint violation"}Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
2. Restore 2 master nodes according to the doc below (as sketched after the link):
https://docs.openshift.com/container-platform/4.15/backup_and_restore/control_plane_backup_and_restore/disaster_recovery/scenario-2-restoring-cluster-state.html
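For context, step 2 boils down to running the restore script shipped on the recovery control plane host, roughly as in the linked doc (the backup directory path is an example and may differ):

# on the recovery host, restore cluster state from a previously saved etcd backup
sudo -E /usr/local/bin/cluster-restore.sh /home/core/assets/backup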
Actual results:
The ovnkube-node pods located on the lost control plane hosts are in CrashLoopBackOff state.
Expected results:
The cluster can be restored successfully.
Additional info:
The issue doesn't happen when the new master node's IP changes. Tested in an AWS IPI environment.
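A possible way to inspect (and, speculatively, clean up) the duplicate Encap rows reported above is sketched below. The sbdb container name is an assumption for the 4.15 per-node OVN databases, and removing the stale row is an untested workaround idea, not a verified fix:

# list geneve Encap rows for the node IP to confirm the duplicate
oc exec -n openshift-ovn-kubernetes ovnkube-node-8mvjw -c sbdb -- \
  ovn-sbctl --columns=_uuid,chassis_name,ip find Encap type=geneve ip=192.168.1.104

# speculatively remove the pre-existing (stale) row by UUID so the new insert can succeed
oc exec -n openshift-ovn-kubernetes ovnkube-node-8mvjw -c sbdb -- \
  ovn-sbctl destroy Encap 738f583e-5dde-4ed8-a447-cd9af95a0d53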
- is depended on by
-
OCPSTRAT-989 Backup/restore for Hosted Clusters for Self-Managed HCP Part I
- Closed