-
Bug
-
Resolution: Not a Bug
-
Normal
-
None
-
4.14.z, 4.18
-
Moderate
-
None
-
CLOUD Sprint 261
-
1
-
False
-
Description of problem:
[AWS] Cluster upgrade fails when the VPC uses a DHCP option set whose domain-name contains upper-case characters
Version-Release number of selected component (if applicable):
I tested the following upgrade paths:
4.13.0-0.nightly-2024-10-10-221519 -> 4.14.0-0.nightly-2024-10-11-181710
4.13.48-x86_64 -> 4.14.36-x86_64
4.13.0-0.nightly-2024-10-10-221519 -> 4.14.0-0.ci.test-2024-10-14-013447-ci-ln-75t1mmb-latest
How reproducible:
Always
Steps to Reproduce:
1. Create a DHCP option set with an upper-case domain-name.

    liuhuali@Lius-MacBook-Pro huali-test % aws ec2 create-dhcp-options --dhcp-configurations '[{"Key":"domain-name-servers","Values":["AmazonProvidedDNS"]},{"Key":"domain-name","Values":["HUALI-Qe.exampleA.com"]}]'
    DHCPOPTIONS dopt-085f8c5f0eb6dae21 301721915996
    DHCPCONFIGURATIONS domain-name
    VALUES HUALI-Qe.exampleA.com
    DHCPCONFIGURATIONS domain-name-servers
    VALUES AmazonProvidedDNS

2. Install an AWS cluster. I used the automated template ipi-on-aws/versioned-installer-ci.

    liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
    NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.13.0-0.nightly-2024-10-10-221519   True        False         26m     Cluster version is 4.13.0-0.nightly-2024-10-10-221519

3. In the AWS console, switch the VPC to the DHCP option set created in step 1.

4. Create a new machineset or scale an existing machineset.

    liuhuali@Lius-MacBook-Pro huali-test % oc create -f ms1.yaml
    machineset.machine.openshift.io/huliu-aws1012d-snmnz-worker-us-east-2aa created

    liuhuali@Lius-MacBook-Pro huali-test % oc get machine
    NAME                                            PHASE     TYPE         REGION      ZONE         AGE
    huliu-aws1012d-snmnz-master-0                   Running   m6i.xlarge   us-east-2   us-east-2a   56m
    huliu-aws1012d-snmnz-master-1                   Running   m6i.xlarge   us-east-2   us-east-2b   56m
    huliu-aws1012d-snmnz-master-2                   Running   m6i.xlarge   us-east-2   us-east-2c   56m
    huliu-aws1012d-snmnz-worker-us-east-2a-ksnlj    Running   m6i.xlarge   us-east-2   us-east-2a   52m
    huliu-aws1012d-snmnz-worker-us-east-2aa-qxsgk   Running   m6i.xlarge   us-east-2   us-east-2a   10m
    huliu-aws1012d-snmnz-worker-us-east-2b-88pjf    Running   m6i.xlarge   us-east-2   us-east-2b   52m
    huliu-aws1012d-snmnz-worker-us-east-2c-prp5h    Running   m6i.xlarge   us-east-2   us-east-2c   52m

    liuhuali@Lius-MacBook-Pro huali-test % oc get node
    NAME                                         STATUS   ROLES                  AGE     VERSION
    ip-10-0-135-249.us-east-2.compute.internal   Ready    control-plane,master   55m     v1.26.15+53fd427
    ip-10-0-143-246.us-east-2.compute.internal   Ready    worker                 7m55s   v1.26.15+53fd427
    ip-10-0-146-224.us-east-2.compute.internal   Ready    worker                 49m     v1.26.15+53fd427
    ip-10-0-168-215.us-east-2.compute.internal   Ready    worker                 49m     v1.26.15+53fd427
    ip-10-0-189-48.us-east-2.compute.internal    Ready    control-plane,master   55m     v1.26.15+53fd427
    ip-10-0-197-11.us-east-2.compute.internal    Ready    control-plane,master   55m     v1.26.15+53fd427
    ip-10-0-203-123.us-east-2.compute.internal   Ready    worker                 49m     v1.26.15+53fd427

    liuhuali@Lius-MacBook-Pro huali-test % oc get machine huliu-aws1012d-snmnz-worker-us-east-2aa-qxsgk -oyaml
    ...
    status:
      addresses:
      - address: 10.0.143.246
        type: InternalIP
      - address: ip-10-0-143-246.us-east-2.compute.internal
        type: InternalDNS
      - address: ip-10-0-143-246.us-east-2.compute.internal
        type: Hostname
      - address: ip-10-0-143-246.HUALI-Qe.exampleA.com
        type: InternalDNS
    ...

5. Upgrade the cluster. The upgrade gets stuck on the cloud-credential operator.

    liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
    NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.13.0-0.nightly-2024-10-10-221519   True        True          78m     Unable to apply 4.14.0-0.nightly-2024-10-11-181710: wait has exceeded 40 minutes for these operators: cloud-credential

    liuhuali@Lius-MacBook-Pro huali-test % oc get co
    NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
    authentication                             4.14.0-0.nightly-2024-10-11-181710   True        False         False      128m
    baremetal                                  4.14.0-0.nightly-2024-10-11-181710   True        False         False      141m
    cloud-controller-manager                   4.14.0-0.nightly-2024-10-11-181710   True        False         False      144m
    cloud-credential                           4.14.0-0.nightly-2024-10-11-181710   True        True          True       144m    6 of 6 credentials requests are failing to sync.
    cluster-autoscaler                         4.14.0-0.nightly-2024-10-11-181710   True        False         False      141m
    config-operator                            4.14.0-0.nightly-2024-10-11-181710   True        False         False      142m
    console                                    4.14.0-0.nightly-2024-10-11-181710   True        False         False      130m
    control-plane-machine-set                  4.14.0-0.nightly-2024-10-11-181710   True        False         False      136m
    csi-snapshot-controller                    4.14.0-0.nightly-2024-10-11-181710   True        False         False      142m
    dns                                        4.13.0-0.nightly-2024-10-10-221519   True        False         False      141m
    etcd                                       4.14.0-0.nightly-2024-10-11-181710   True        False         False      141m
    image-registry                             4.14.0-0.nightly-2024-10-11-181710   True        False         False      135m
    ingress                                    4.14.0-0.nightly-2024-10-11-181710   True        False         False      136m
    insights                                   4.14.0-0.nightly-2024-10-11-181710   True        False         False      136m
    kube-apiserver                             4.14.0-0.nightly-2024-10-11-181710   True        False         False      131m
    kube-controller-manager                    4.14.0-0.nightly-2024-10-11-181710   True        False         False      139m
    kube-scheduler                             4.14.0-0.nightly-2024-10-11-181710   True        False         False      139m
    kube-storage-version-migrator              4.14.0-0.nightly-2024-10-11-181710   True        False         False      142m
    machine-api                                4.14.0-0.nightly-2024-10-11-181710   True        False         False      138m
    machine-approver                           4.14.0-0.nightly-2024-10-11-181710   True        False         False      142m
    machine-config                             4.13.0-0.nightly-2024-10-10-221519   True        False         False      141m
    marketplace                                4.14.0-0.nightly-2024-10-11-181710   True        False         False      141m
    monitoring                                 4.14.0-0.nightly-2024-10-11-181710   True        False         False      135m
    network                                    4.13.0-0.nightly-2024-10-10-221519   True        False         False      143m
    node-tuning                                4.14.0-0.nightly-2024-10-11-181710   True        False         False      51m
    openshift-apiserver                        4.14.0-0.nightly-2024-10-11-181710   True        False         False      131m
    openshift-controller-manager               4.14.0-0.nightly-2024-10-11-181710   True        False         False      138m
    openshift-samples                          4.14.0-0.nightly-2024-10-11-181710   True        False         False      52m
    operator-lifecycle-manager                 4.14.0-0.nightly-2024-10-11-181710   True        False         False      142m
    operator-lifecycle-manager-catalog         4.14.0-0.nightly-2024-10-11-181710   True        False         False      142m
    operator-lifecycle-manager-packageserver   4.14.0-0.nightly-2024-10-11-181710   True        False         False      52m
    service-ca                                 4.14.0-0.nightly-2024-10-11-181710   True        False         False      142m
    storage                                    4.14.0-0.nightly-2024-10-11-181710   True        False         False      142m

    liuhuali@Lius-MacBook-Pro huali-test % oc logs cloud-credential-operator-567bc97fb4-4lfq5 -n openshift-cloud-credential-operator -c cloud-credential-operator
    ...
    time="2024-10-12T10:42:59Z" level=error msg="RequestError: send request failed\ncaused by: Post \"https://iam.amazonaws.com/\": dial tcp 18.119.154.66:443: i/o timeout"
    time="2024-10-12T10:42:59Z" level=error msg="error determining whether a credentials update is needed" actuator=aws cr=openshift-cloud-credential-operator/openshift-machine-api-aws error="AWS Error: RequestError: send request failed\ncaused by: Post \"https://iam.amazonaws.com/\": dial tcp 18.119.154.66:443: i/o timeout"
    time="2024-10-12T10:42:59Z" level=error msg="error syncing credentials: error determining whether a credentials update is needed" controller=credreq cr=openshift-cloud-credential-operator/openshift-machine-api-aws secret=openshift-machine-api/aws-cloud-credentials
    time="2024-10-12T10:42:59Z" level=error msg="errored with condition: CredentialsProvisionFailure" controller=credreq cr=openshift-cloud-credential-operator/openshift-machine-api-aws secret=openshift-machine-api/aws-cloud-credentials
    time="2024-10-12T10:42:59Z" level=info msg="syncing credentials request" controller=credreq cr=openshift-cloud-credential-operator/aws-ebs-csi-driver-operator
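The condition exercised in step 1 can be checked before a DHCP option set is attached to a cluster VPC. The following is a minimal sketch, not part of the original report: the function name `check_domain_name` is hypothetical, and the pattern is the lowercase DNS-1123 subdomain form that Kubernetes uses for object names such as nodes. It only flags values like the one used here; it does not establish the root cause of the upgrade failure.

```python
import re
from typing import List

# Lowercase DNS-1123 subdomain pattern (alphanumerics, '-', and '.').
# Assumption: this mirrors the form Kubernetes expects for node names.
DNS1123_SUBDOMAIN = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def check_domain_name(domain: str) -> List[str]:
    """Return a list of problems found in a DHCP domain-name value."""
    problems = []
    if any(ch.isupper() for ch in domain):
        problems.append("contains uppercase characters")
    if len(domain) > 253:
        problems.append("longer than 253 characters")
    # Check the lowercased value separately so that case is reported
    # independently of any other syntax problem.
    if not DNS1123_SUBDOMAIN.fullmatch(domain.lower()):
        problems.append("not a valid DNS-1123 subdomain even after lowercasing")
    return problems


# The value from step 1 and its lowercased form.
print(check_domain_name("HUALI-Qe.exampleA.com"))
print(check_domain_name("huali-qe.examplea.com"))
```

Reporting the uppercase finding separately from the syntax check keeps the two questions apart: a value can be syntactically fine once lowercased and still differ in case from what the node reports.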
Actual results:
Upgrade failed
Expected results:
Upgrade succeeds
Additional info:
Must-gather https://drive.google.com/file/d/1v--I5ghJvVBVvnW9hwW3DfAxiHo-G33G/view?usp=sharing
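When reading the cloud-credential-operator logs from the must-gather, it can help to pull out which credentials requests are failing. The sketch below is an illustration only: `parse_logfmt` is a hypothetical helper, and using `shlex.split` is an approximation of the key=value log format quoted in the steps above, not the operator's own parser. The sample line is copied from this report.

```python
import shlex
from typing import Dict


def parse_logfmt(line: str) -> Dict[str, str]:
    """Split a key=value log line into a dict, honoring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


# One of the error lines from the cloud-credential-operator log above.
SAMPLE = (
    'time="2024-10-12T10:42:59Z" level=error '
    'msg="errored with condition: CredentialsProvisionFailure" '
    "controller=credreq "
    "cr=openshift-cloud-credential-operator/openshift-machine-api-aws "
    "secret=openshift-machine-api/aws-cloud-credentials"
)

fields = parse_logfmt(SAMPLE)
if fields.get("level") == "error":
    # Print the failing credentials request and its message.
    print(fields["cr"], "->", fields["msg"])
```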
- is blocked by
-
OCPCLOUD-2774 Impact statement request for OCPBUGS-43274 [AWS] cluster upgrade or install failed when using dhcp option with some domain-name
- Closed