Bug
Resolution: Not a Bug
4.14.z
Incidents & Support
Critical
Description of problem:
I am filing this bug in Networking/OVN, but it may need to move to the MCO team.

During an SDN to OVN migration on 4.14.31, we followed the documented steps to move the cluster into the migration state and applied the patch that defines the options for OVNkube-network. New rendered-worker and rendered-master machine configs were generated. The new rendered config triggered a rollout on all host nodes. Every node restarted and then immediately reverted to the SDN configuration (the previous rendered machine configs):

I1125 18:04:24.834012 1945 update.go:1987] Disk currentConfig "rendered-worker-60f99140fb406cb9f28b9c4f5da34a2d" overrides node's currentConfig annotation "rendered-worker-8e711d6ed7a3e84d1b309ef5f9f4476b"
I1125 18:04:24.835987 1945 daemon.go:1841] Validating against current config rendered-worker-60f99140fb406cb9f28b9c4f5da34a2d

As a result, the migration has stalled and we are unable to proceed. We attempted, unsuccessfully, to force a rollover to the latest rendered config using the steps below:

node_name=<name>
new_value=rendered-worker-8e711d6ed7a3e84d1b309ef5f9f4476b
oc patch node $node_name --type merge --patch "{\"metadata\": {\"annotations\": {\"machineconfiguration.openshift.io/desiredConfig\": \"${new_value}\"}}}"
oc patch node $node_name --type merge --patch '{"metadata": {"annotations": {"machineconfiguration.openshift.io/reason": ""}}}'
oc patch node $node_name --type merge --patch '{"metadata": {"annotations": {"machineconfiguration.openshift.io/state": "Done"}}}'
# Tested adding this step: no change. The file is rebuilt and then supersedes the config selected by the annotation.
oc debug node/$node_name -- chroot /host sh -c "mv /etc/machine-config-daemon/currentconfig /etc/machine-config-daemon/oldconfig"
oc debug node/$node_name -- chroot /host sh -c "touch /run/machine-config-daemon-force"
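When triaging several nodes, it can help to pull the two config names out of the "Disk currentConfig ... overrides ..." log line mechanically. The following is a minimal sketch, not part of the documented procedure: the function name `parse_config_override` is hypothetical, and it assumes the log line has exactly the format quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and, for each line of the form
#   Disk currentConfig "A" overrides node's currentConfig annotation "B"
# prints:  disk=A annotation=B
# Lines that do not match are ignored. The apostrophe in "node's" is matched
# with "." to keep the shell quoting simple.
parse_config_override() {
  sed -n 's/.*Disk currentConfig "\([^"]*\)" overrides node.s currentConfig annotation "\([^"]*\)".*/disk=\1 annotation=\2/p'
}

# Example using the two log lines quoted in this report.
parse_config_override <<'EOF'
I1125 18:04:24.834012 1945 update.go:1987] Disk currentConfig "rendered-worker-60f99140fb406cb9f28b9c4f5da34a2d" overrides node's currentConfig annotation "rendered-worker-8e711d6ed7a3e84d1b309ef5f9f4476b"
I1125 18:04:24.835987 1945 daemon.go:1841] Validating against current config rendered-worker-60f99140fb406cb9f28b9c4f5da34a2d
EOF
```

For the excerpt above this prints one line showing that the on-disk config (`rendered-worker-60f9...`) differs from the annotated config (`rendered-worker-8e71...`), which is the mismatch described in this report.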
Version-Release number of selected component (if applicable):
4.14.31
How reproducible:
Every time on this customer cluster.
Steps to Reproduce:
1. Start with a cluster running 4.14.31 on vSphere.
2. Attempt the SDN to OVN migration using the steps below.
3. Observe that CNO reverts the machine-config template.
Actual results:
The SDN migration fails.
Expected results:
The MCP rollout should raise alerts/warnings when a rendered config is rejected, and should prefer the newest rendered config over previous builds.
More data in the Jira comments below.
Affected Platforms: vSphere