-
Spike
-
Resolution: Done
-
Critical
Impact of OCPBUGS-27261:
Which 4.y.z to 4.y'.z' updates increase vulnerability?
- Customers on AWS upgrading from 4.13.z to 4.14.x. No current fix.
- Tracking now in OCPBUGS-29290
Which types of clusters?
- AWS clusters whose VPC DHCP option set uses a custom domain name (I don't believe there's a way to identify this from within a cluster)
- To tell whether a cluster is affected, determine whether the DHCP option set uses the default or a custom domain name (BYOVPC only; IPI doesn't support this)
- Get DHCP options ID:
aws ec2 describe-vpcs --vpc-ids <vpc-id> --region <region> | jq -r '.Vpcs[].DhcpOptionsId'
- Get domain name values:
aws ec2 describe-dhcp-options --dhcp-options-ids <dhcp-opt-id> --region <region> | jq -r '.DhcpOptions[].DhcpConfigurations[] | select(.Key == "domain-name") | .Values[].Value'
- If there are multiple values, or the single value is not ec2.internal (us-east-1) or <region>.compute.internal (other regions), then the cluster is affected
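The check above can be sketched as a small script. This is only an illustration, not a supported tool: the function names `classify_domain_names` and `check_vpc` are hypothetical, the accepted defaults are the ones AWS documents for DHCP option sets (`ec2.internal` in us-east-1, `<region>.compute.internal` elsewhere), and the jq path assumes each entry in `Values` is an object with a `Value` field.

```shell
#!/usr/bin/env bash
# Sketch only: classify a VPC's DHCP "domain-name" values for OCPBUGS-27261.

# classify_domain_names REGION [VALUE...]
# Prints "affected", "not affected", or "unknown" (no domain-name value found).
classify_domain_names() {
  local region="$1"
  shift
  if [ "$#" -eq 0 ]; then
    echo "unknown"
    return
  fi
  if [ "$#" -gt 1 ]; then
    echo "affected"   # more than one domain name is configured
    return
  fi
  case "$1" in
    ec2.internal | "${region}.compute.internal")
      echo "not affected" ;;   # assumed AWS default domain names
    *)
      echo "affected" ;;       # a single custom domain name
  esac
}

# check_vpc VPC_ID REGION
# Runs the two lookups described above, then classifies the result.
check_vpc() {
  local vpc_id="$1" region="$2" dhcp_id values
  dhcp_id=$(aws ec2 describe-vpcs --vpc-ids "$vpc_id" --region "$region" \
    | jq -r '.Vpcs[].DhcpOptionsId')
  values=$(aws ec2 describe-dhcp-options --dhcp-options-ids "$dhcp_id" \
    --region "$region" \
    | jq -r '.DhcpOptions[].DhcpConfigurations[]
             | select(.Key == "domain-name") | .Values[].Value')
  # Intentionally unquoted: split on whitespace so each name is one argument.
  # shellcheck disable=SC2086
  classify_domain_names "$region" $values
}
```

After sourcing the file, `check_vpc <vpc-id> <region>` prints the classification. An "unknown" result means no domain-name value was returned and the option set should be inspected by hand.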
What is the impact? Is it serious enough to warrant removing update recommendations?
- Node names change after a reboot (which happens during upgrade). As a result, new credentials and a new, redundant Node object are created for the same, already existing node, and the process fails at the CSR approval stage.
- Nodes stay in an unready state until each node is manually fixed, and a fixed node will break again if it is restarted before the cluster is upgraded to a fixed version.
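The symptoms described above (unready nodes and CSRs that are never approved) can be spotted from the CLI. The following is a minimal sketch: the helper names `not_ready_nodes` and `pending_csrs` are hypothetical, and both assume the default column layout of `oc get ... --no-headers`, with the node status in the second column and the CSR condition in the last column.

```shell
#!/usr/bin/env bash
# Sketch only: filter `oc get` output for the symptoms of this bug.

# Reads `oc get nodes --no-headers` output on stdin and prints the names of
# nodes whose status starts with NotReady (e.g. NotReady,SchedulingDisabled).
not_ready_nodes() {
  awk '$2 ~ /^NotReady/ { print $1 }'
}

# Reads `oc get csr --no-headers` output on stdin and prints the names of
# CSRs whose condition (assumed to be the last column) is Pending.
pending_csrs() {
  awk '$NF == "Pending" { print $1 }'
}
```

Typical use would be `oc get nodes --no-headers | not_ready_nodes` and `oc get csr --no-headers | pending_csrs`. Pending CSRs alone do not prove a cluster is hit by this bug, so treat the output as a hint and confirm with the DHCP check.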
How involved is remediation?
- Workaround explained in https://issues.redhat.com/browse/OCPBUGS-29432?focusedId=24165685&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-24165685
- SSH into each node, update a file with the correct node name, and restart kubelet. Do not reboot the node until the cluster is upgraded to 4.14.11 or later.
- KB Article: OpenShift Container Platform 4.14 upgrade is stuck when dhcp option domain-name is used in AWS
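The file edit in the workaround might look like the sketch below. It rests on assumptions that the linked comment and KB article should confirm before anything is run on a node: that the file is /etc/kubernetes/node.env (the file named in OCPBUGS-27261), that the node name is stored as a `KUBELET_NODE_NAME=<name>` line, and that the helper name `set_node_name` is purely illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: rewrite the node-name line in an environment file.
# ASSUMPTION: the variable is KUBELET_NODE_NAME; verify against the KB article.

# set_node_name ENV_FILE NODE_NAME
# Replaces any existing KUBELET_NODE_NAME line with the given name and
# leaves every other line in the file untouched.
set_node_name() {
  local env_file="$1" node_name="$2" tmp
  tmp=$(mktemp)
  # Copy every line except an existing assignment; grep exits non-zero
  # when nothing is left to print, which is not an error here.
  grep -v '^KUBELET_NODE_NAME=' "$env_file" > "$tmp" || true
  echo "KUBELET_NODE_NAME=${node_name}" >> "$tmp"
  # Overwrite the contents in place so the file keeps its owner and mode.
  cat "$tmp" > "$env_file"
  rm -f "$tmp"
}
```

On an affected node, and run as root, this would be called as `set_node_name /etc/kubernetes/node.env <original-node-name>`, followed by `systemctl restart kubelet`. The node must then stay up without a reboot until the upgrade to a fixed version.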
Is this a regression?
- Yes
- blocks OCPBUGS-27261 Environment file /etc/kubernetes/node.env is overwritten after a node restart (Closed)
- links to