Type: Bug
Resolution: Unresolved
Priority: Normal
Affects Version: 4.14.z
Description of problem:
During the cluster upgrade from 4.13 to 4.14, the DNS name of the node changes from hostname.customdomain.net to hostname.ec2.internal.
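A quick way to observe the symptom (a minimal sketch, assuming oc access to the cluster; the node name below is the example from this report):
    oc get nodes -o wide
    # Compare the node object name with the hostname the node itself reports:
    oc debug node/ip-10-131-136-36.ec2.internal -- chroot /host hostname -f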
Version-Release number of selected component (if applicable):
4.14.36
How reproducible:
I don't have the exact steps to reproduce, but we can try to upgrade an OCP cluster from version 4.13 to 4.14.36.
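For reference, a minimal upgrade attempt could look like this (a sketch, assuming the cluster is subscribed to a channel that offers 4.14.36):
    oc adm upgrade channel stable-4.14
    oc adm upgrade --to=4.14.36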
Steps to Reproduce:
1. 2. 3.
Actual results:
The node DNS name changed, and the node is not added back to the cluster after it is rebooted.
Expected results:
The node DNS name should not change, and the upgrade should complete successfully.
Additional info:
Followed the workaround mentioned in bug https://issues.redhat.com/browse/OCPBUGS-29432?focusedId=24165685&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-24165685 (a consolidated command sketch follows the list):
1. Allow the node to reboot into the new machine-config iteration.
2. Update the /etc/kubernetes/node.env file to reflect the CORRECT hostname (example below):
   cat /etc/kubernetes/node.env
   KUBELET_NODE_NAME=ip-10-131-136-36.ec2.internal
3. Restart kubelet on the node (do not reboot) and proceed to the next node, then upgrade the cluster to 4.14.11 for the fix, as it is related to this bug: https://issues.redhat.com/browse/OCPBUGS-27261. Proceed to step 4 if you encounter issues with kubelet CSR approvals.
4. If kubelet reports that it is forbidden to contact the API with error messaging similar to the line below, and the node stays NotReady, move to step 5:
   Feb 14 16:26:45 ip-10-131-136-36 kubenswrapper[8121]: I0214 16:26:45.848529 8121 csi_plugin.go:913] Failed to contact API server when waiting for CSINode publishing: csinodes.storage.k8s.io "ip-10-131-136-36.ec2.internal" is forbidden: User "system:node:ip-10-131-136-36.<custom>.local" cannot get resource "csinodes" in API group "storage.k8s.io" at the cluster scope: can only access CSINode with the same name as the requesting node
5. Make a folder at /var/lib/kubelet/pki/backup and copy all /var/lib/kubelet/pki/*.pem files into that folder.
6. Restart kubelet again and then check for CSRs: look for a bootstrapper CSR and approve it, then a subsequent set of CSRs for the node with its proper name (2x), both of which must be approved. After that the node will return to Ready status.
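The per-node commands above can be run roughly as follows; this is a minimal sketch, assuming root shell access on the affected node (for example via oc debug node/<node> or SSH) and oc access with certificate-approval rights from a workstation. The KUBELET_NODE_NAME value is the example from this report, and <csr-name> is a placeholder:
    # Step 2: set the correct node name (assumes node.env contains only this line).
    echo 'KUBELET_NODE_NAME=ip-10-131-136-36.ec2.internal' > /etc/kubernetes/node.env
    # Step 3: restart kubelet without rebooting the node.
    systemctl restart kubelet
    # Step 5: back up the kubelet certificates before the second restart.
    mkdir -p /var/lib/kubelet/pki/backup
    cp /var/lib/kubelet/pki/*.pem /var/lib/kubelet/pki/backup/
    # Step 6: restart kubelet, then from a workstation list and approve the
    # pending CSRs (the bootstrapper CSR first, then the two node CSRs).
    systemctl restart kubelet
    oc get csr
    oc adm certificate approve <csr-name>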