-
Bug
-
Resolution: Not a Bug
-
Normal
-
4.14.z, 4.18
-
Quality / Stability / Reliability
-
False
-
Moderate
-
CLOUD Sprint 261
-
1
Description of problem:
[AWS] Cluster upgrade failed when using a DHCP option set with an uppercase domain-name
Version-Release number of selected component (if applicable):
I tested the following upgrade paths:
4.13.0-0.nightly-2024-10-10-221519 -> 4.14.0-0.nightly-2024-10-11-181710
4.13.48-x86_64 -> 4.14.36-x86_64
4.13.0-0.nightly-2024-10-10-221519 -> 4.14.0-0.ci.test-2024-10-14-013447-ci-ln-75t1mmb-latest
How reproducible:
Always
Steps to Reproduce:
1. Create a DHCP option set with an uppercase domain-name
liuhuali@Lius-MacBook-Pro huali-test % aws ec2 create-dhcp-options --dhcp-configurations '[{"Key":"domain-name-servers","Values":["AmazonProvidedDNS"]},{"Key":"domain-name","Values":["HUALI-Qe.exampleA.com"]}]'
DHCPOPTIONS dopt-085f8c5f0eb6dae21 301721915996
DHCPCONFIGURATIONS domain-name
VALUES HUALI-Qe.exampleA.com
DHCPCONFIGURATIONS domain-name-servers
VALUES AmazonProvidedDNS
2. Install an AWS cluster. I used the automated template:
ipi-on-aws/versioned-installer-ci
liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.13.0-0.nightly-2024-10-10-221519 True False 26m Cluster version is 4.13.0-0.nightly-2024-10-10-221519
3. In the AWS console, switch the cluster's VPC to the DHCP option set created in step 1
4. Create a new machineset or scale an existing machineset
liuhuali@Lius-MacBook-Pro huali-test % oc create -f ms1.yaml
machineset.machine.openshift.io/huliu-aws1012d-snmnz-worker-us-east-2aa created
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME PHASE TYPE REGION ZONE AGE
huliu-aws1012d-snmnz-master-0 Running m6i.xlarge us-east-2 us-east-2a 56m
huliu-aws1012d-snmnz-master-1 Running m6i.xlarge us-east-2 us-east-2b 56m
huliu-aws1012d-snmnz-master-2 Running m6i.xlarge us-east-2 us-east-2c 56m
huliu-aws1012d-snmnz-worker-us-east-2a-ksnlj Running m6i.xlarge us-east-2 us-east-2a 52m
huliu-aws1012d-snmnz-worker-us-east-2aa-qxsgk Running m6i.xlarge us-east-2 us-east-2a 10m
huliu-aws1012d-snmnz-worker-us-east-2b-88pjf Running m6i.xlarge us-east-2 us-east-2b 52m
huliu-aws1012d-snmnz-worker-us-east-2c-prp5h Running m6i.xlarge us-east-2 us-east-2c 52m
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME STATUS ROLES AGE VERSION
ip-10-0-135-249.us-east-2.compute.internal Ready control-plane,master 55m v1.26.15+53fd427
ip-10-0-143-246.us-east-2.compute.internal Ready worker 7m55s v1.26.15+53fd427
ip-10-0-146-224.us-east-2.compute.internal Ready worker 49m v1.26.15+53fd427
ip-10-0-168-215.us-east-2.compute.internal Ready worker 49m v1.26.15+53fd427
ip-10-0-189-48.us-east-2.compute.internal Ready control-plane,master 55m v1.26.15+53fd427
ip-10-0-197-11.us-east-2.compute.internal Ready control-plane,master 55m v1.26.15+53fd427
ip-10-0-203-123.us-east-2.compute.internal Ready worker 49m v1.26.15+53fd427
liuhuali@Lius-MacBook-Pro huali-test % oc get machine huliu-aws1012d-snmnz-worker-us-east-2aa-qxsgk -oyaml
...
status:
  addresses:
  - address: 10.0.143.246
    type: InternalIP
  - address: ip-10-0-143-246.us-east-2.compute.internal
    type: InternalDNS
  - address: ip-10-0-143-246.us-east-2.compute.internal
    type: Hostname
  - address: ip-10-0-143-246.HUALI-Qe.exampleA.com
    type: InternalDNS
...
5. Upgrade the cluster. The upgrade gets stuck on the cloud-credential operator
liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.13.0-0.nightly-2024-10-10-221519 True True 78m Unable to apply 4.14.0-0.nightly-2024-10-11-181710: wait has exceeded 40 minutes for these operators: cloud-credential
liuhuali@Lius-MacBook-Pro huali-test % oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.14.0-0.nightly-2024-10-11-181710 True False False 128m
baremetal 4.14.0-0.nightly-2024-10-11-181710 True False False 141m
cloud-controller-manager 4.14.0-0.nightly-2024-10-11-181710 True False False 144m
cloud-credential 4.14.0-0.nightly-2024-10-11-181710 True True True 144m 6 of 6 credentials requests are failing to sync.
cluster-autoscaler 4.14.0-0.nightly-2024-10-11-181710 True False False 141m
config-operator 4.14.0-0.nightly-2024-10-11-181710 True False False 142m
console 4.14.0-0.nightly-2024-10-11-181710 True False False 130m
control-plane-machine-set 4.14.0-0.nightly-2024-10-11-181710 True False False 136m
csi-snapshot-controller 4.14.0-0.nightly-2024-10-11-181710 True False False 142m
dns 4.13.0-0.nightly-2024-10-10-221519 True False False 141m
etcd 4.14.0-0.nightly-2024-10-11-181710 True False False 141m
image-registry 4.14.0-0.nightly-2024-10-11-181710 True False False 135m
ingress 4.14.0-0.nightly-2024-10-11-181710 True False False 136m
insights 4.14.0-0.nightly-2024-10-11-181710 True False False 136m
kube-apiserver 4.14.0-0.nightly-2024-10-11-181710 True False False 131m
kube-controller-manager 4.14.0-0.nightly-2024-10-11-181710 True False False 139m
kube-scheduler 4.14.0-0.nightly-2024-10-11-181710 True False False 139m
kube-storage-version-migrator 4.14.0-0.nightly-2024-10-11-181710 True False False 142m
machine-api 4.14.0-0.nightly-2024-10-11-181710 True False False 138m
machine-approver 4.14.0-0.nightly-2024-10-11-181710 True False False 142m
machine-config 4.13.0-0.nightly-2024-10-10-221519 True False False 141m
marketplace 4.14.0-0.nightly-2024-10-11-181710 True False False 141m
monitoring 4.14.0-0.nightly-2024-10-11-181710 True False False 135m
network 4.13.0-0.nightly-2024-10-10-221519 True False False 143m
node-tuning 4.14.0-0.nightly-2024-10-11-181710 True False False 51m
openshift-apiserver 4.14.0-0.nightly-2024-10-11-181710 True False False 131m
openshift-controller-manager 4.14.0-0.nightly-2024-10-11-181710 True False False 138m
openshift-samples 4.14.0-0.nightly-2024-10-11-181710 True False False 52m
operator-lifecycle-manager 4.14.0-0.nightly-2024-10-11-181710 True False False 142m
operator-lifecycle-manager-catalog 4.14.0-0.nightly-2024-10-11-181710 True False False 142m
operator-lifecycle-manager-packageserver 4.14.0-0.nightly-2024-10-11-181710 True False False 52m
service-ca 4.14.0-0.nightly-2024-10-11-181710 True False False 142m
storage 4.14.0-0.nightly-2024-10-11-181710 True False False 142m
liuhuali@Lius-MacBook-Pro huali-test % oc logs cloud-credential-operator-567bc97fb4-4lfq5 -n openshift-cloud-credential-operator -c cloud-credential-operator
...
time="2024-10-12T10:42:59Z" level=error msg="RequestError: send request failed\ncaused by: Post \"https://iam.amazonaws.com/\": dial tcp 18.119.154.66:443: i/o timeout"
time="2024-10-12T10:42:59Z" level=error msg="error determining whether a credentials update is needed" actuator=aws cr=openshift-cloud-credential-operator/openshift-machine-api-aws error="AWS Error: RequestError: send request failed\ncaused by: Post \"https://iam.amazonaws.com/\": dial tcp 18.119.154.66:443: i/o timeout"
time="2024-10-12T10:42:59Z" level=error msg="error syncing credentials: error determining whether a credentials update is needed" controller=credreq cr=openshift-cloud-credential-operator/openshift-machine-api-aws secret=openshift-machine-api/aws-cloud-credentials
time="2024-10-12T10:42:59Z" level=error msg="errored with condition: CredentialsProvisionFailure" controller=credreq cr=openshift-cloud-credential-operator/openshift-machine-api-aws secret=openshift-machine-api/aws-cloud-credentials
time="2024-10-12T10:42:59Z" level=info msg="syncing credentials request" controller=credreq cr=openshift-cloud-credential-operator/aws-ebs-csi-driver-operator
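When reproducing step 4, it can help to pick out the mixed-case entries from a machine's reported addresses without reading the full YAML. The following is only a sketch: `has_uppercase` and `list_uppercase_addresses` are hypothetical helper names, not part of `oc` or the AWS CLI, and the check only shows which addresses contain ASCII uppercase letters; it says nothing about why the upgrade stalls.

```shell
# Hypothetical helper: succeeds (exit 0) when the given name contains at
# least one ASCII uppercase letter. The letters are listed explicitly so
# the result does not depend on locale-specific range handling.
has_uppercase() {
  case "$1" in
    *[ABCDEFGHIJKLMNOPQRSTUVWXYZ]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: reads one address per line from stdin and prints
# only those that contain an uppercase letter.
list_uppercase_addresses() {
  while IFS= read -r addr; do
    if has_uppercase "$addr"; then
      printf '%s\n' "$addr"
    fi
  done
}

# Possible usage against a live cluster (machine name taken from step 4):
#   oc get machine huliu-aws1012d-snmnz-worker-us-east-2aa-qxsgk \
#     -o jsonpath='{range .status.addresses[*]}{.address}{"\n"}{end}' \
#     | list_uppercase_addresses
```

With the addresses shown in step 4, only ip-10-0-143-246.HUALI-Qe.exampleA.com would be printed.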
Actual results:
The upgrade fails.
Expected results:
The upgrade succeeds.
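One way to check the expected result after an upgrade attempt is to filter the `oc get co` listing for operators that report Degraded. This is a minimal sketch, assuming the default column order shown in step 5 (NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE); `degraded_operators` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `oc get co` output on stdin and prints the
# name of every operator whose DEGRADED column (5th field) is "True".
# The header row is skipped naturally because its 5th field is "DEGRADED".
degraded_operators() {
  awk '$5 == "True" { print $1 }'
}

# Possible usage against a live cluster:
#   oc get co | degraded_operators
```

For the listing captured in step 5, this would print only cloud-credential. An empty result would mean no operator reports Degraded, though operators may still be on the old version.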
Additional info:
Must-gather: https://drive.google.com/file/d/1v--I5ghJvVBVvnW9hwW3DfAxiHo-G33G/view?usp=sharing
Is blocked by: OCPCLOUD-2774 Impact statement request for OCPBUGS-43274 [AWS] cluster upgrade or install failed when using dhcp option with some domain-name (Closed)