Details
Bug
Resolution: Obsolete
Undefined
None
4.16
None
Important
Yes
CLOUD Sprint 251
1
Proposed
False
Description
Description of problem:
CAPI machines stuck in Pending on AWS
Version-Release number of selected component (if applicable):
4.16.0-0.nightly-2024-03-13-061822
It worked when I previously tested on 4.16.0-0.nightly-2024-03-09-163353; refer to https://issues.redhat.com/browse/OCPCLOUD-2441
How reproducible:
Always
Steps to Reproduce:
1. Create an aws tech preview cluster, we use automated template: ipi-on-aws/versioned-installer-techpreview-ci

liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.16.0-0.nightly-2024-03-13-061822   True        False         17m     Cluster version is 4.16.0-0.nightly-2024-03-13-061822

2. Create cluster, awscluster, awsmachinetemplate, capi MachineSet

liuhuali@Lius-MacBook-Pro huali-test % oc get cluster
NAME                  CLUSTERCLASS   PHASE         AGE   VERSION
huliu-aws319a-fg8jx                  Provisioned   21m

liuhuali@Lius-MacBook-Pro huali-test % oc get awscluster
NAME                  CLUSTER               READY   VPC   BASTION IP
huliu-aws319a-fg8jx   huliu-aws319a-fg8jx   true

liuhuali@Lius-MacBook-Pro huali-test % oc get awsmachinetemplate
NAME                  AGE
aws-machinetemplate   21m

liuhuali@Lius-MacBook-Pro huali-test % oc get machineset.cluster.x-k8s.io
NAME                    CLUSTER               REPLICAS   READY   AVAILABLE   AGE   VERSION
capi-machineset-51071   huliu-aws319a-fg8jx   1                              21m

liuhuali@Lius-MacBook-Pro huali-test % oc get machines.cluster.x-k8s.io
NAME                          CLUSTER               NODENAME   PROVIDERID   PHASE     AGE   VERSION
capi-machineset-51071-4dpbh   huliu-aws319a-fg8jx                           Pending   21m

liuhuali@Lius-MacBook-Pro huali-test % oc get machines.cluster.x-k8s.io capi-machineset-51071-4dpbh -oyaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Machine
metadata:
  creationTimestamp: "2024-03-19T01:47:14Z"
  finalizers:
  - machine.cluster.x-k8s.io
  generation: 1
  labels:
    cluster.x-k8s.io/cluster-name: huliu-aws319a-fg8jx
    cluster.x-k8s.io/set-name: capi-machineset-51071
    machine.openshift.io/cluster-api-cluster: huliu-aws319a-fg8jx
  name: capi-machineset-51071-4dpbh
  namespace: openshift-cluster-api
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: MachineSet
    name: capi-machineset-51071
    uid: 0697b819-a881-4549-bfdb-5df69043301c
  resourceVersion: "47110"
  uid: 85ff9834-1458-42a7-b2ee-33bac2026810
spec:
  bootstrap:
    dataSecretName: worker-user-data
  clusterName: huliu-aws319a-fg8jx
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: AWSMachine
    name: aws-machinetemplate-cq5gk
    namespace: openshift-cluster-api
    uid: 89e0acfa-0e36-445d-9104-91fc67450c25
  nodeDeletionTimeout: 10s
status:
  conditions:
  - lastTransitionTime: "2024-03-19T01:47:34Z"
    message: 0 of 2 completed
    reason: InstanceProvisionFailed
    severity: Error
    status: "False"
    type: Ready
  - lastTransitionTime: "2024-03-19T01:47:34Z"
    message: 0 of 2 completed
    reason: InstanceProvisionFailed
    severity: Error
    status: "False"
    type: InfrastructureReady
  - lastTransitionTime: "2024-03-19T01:47:14Z"
    reason: WaitingForNodeRef
    severity: Info
    status: "False"
    type: NodeHealthy
  lastUpdated: "2024-03-19T01:47:14Z"
  observedGeneration: 1
  phase: Pending

liuhuali@Lius-MacBook-Pro huali-test % oc get pod
NAME                                       READY   STATUS    RESTARTS      AGE
capa-controller-manager-686c794b55-qv97p   1/1     Running   7 (51m ago)   73m
capi-controller-manager-6bfff8b86d-7hqwk   1/1     Running   7 (51m ago)   73m
cluster-capi-operator-6446bb5f97-9gqk8     1/1     Running   3 (58m ago)   75m

liuhuali@Lius-MacBook-Pro huali-test % oc logs capa-controller-manager-686c794b55-qv97p
…
I0319 02:04:52.654851 1 awscontrolleridentity_controller.go:87] "IdentityRef is nil, skipping reconciliation" controller="awscluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AWSCluster" AWSCluster="openshift-cluster-api/huliu-aws319a-fg8jx" namespace="openshift-cluster-api" name="huliu-aws319a-fg8jx" reconcileID="3bdb043c-0563-4a18-b148-0f14ac0c9570" cluster="openshift-cluster-api/huliu-aws319a-fg8jx"
I0319 02:04:52.729568 1 awsmachine_controller.go:680] "Creating EC2 instance"
E0319 02:04:52.729637 1 awsmachine_controller.go:520] "unable to create instance" err="failed to resolve userdata: creating userdata object: requested object creation but bucket management is not enabled"
E0319 02:04:52.730014 1 controller.go:329] "Reconciler error" err="failed to resolve userdata: creating userdata object: requested object creation but bucket management is not enabled" controller="awsmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AWSMachine" AWSMachine="openshift-cluster-api/aws-machinetemplate-cq5gk" namespace="openshift-cluster-api" name="aws-machinetemplate-cq5gk" reconcileID="860c860a-188f-4563-a86e-37d489552e65"
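
For triage, the Machine status above can be reduced to just the conditions that block it. The following is a minimal sketch, not part of the original report: the helper name failing_conditions is hypothetical, and the status dict is hand-copied from the YAML output above rather than fetched from the cluster.

```python
# Hypothetical triage helper (not from the original report).
# The dict below is hand-copied from the Machine status shown in this report.

def failing_conditions(status):
    """Return (type, reason, message) for each condition that has
    status "False" and severity "Error"."""
    return [
        (c["type"], c.get("reason", ""), c.get("message", ""))
        for c in status.get("conditions", [])
        if c.get("status") == "False" and c.get("severity") == "Error"
    ]


machine_status = {
    "phase": "Pending",
    "conditions": [
        {"type": "Ready", "status": "False", "severity": "Error",
         "reason": "InstanceProvisionFailed", "message": "0 of 2 completed"},
        {"type": "InfrastructureReady", "status": "False", "severity": "Error",
         "reason": "InstanceProvisionFailed", "message": "0 of 2 completed"},
        # Info-level condition: the Machine is only waiting here, so it is skipped.
        {"type": "NodeHealthy", "status": "False", "severity": "Info",
         "reason": "WaitingForNodeRef"},
    ],
}

for ctype, reason, message in failing_conditions(machine_status):
    print(f"{ctype}: {reason} ({message})")
```

Filtering on severity keeps the Info-level NodeHealthy condition out of the result, since it only reflects that no node has joined yet.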
Actual results:
The CAPI Machine is stuck in Pending.
Expected results:
The CAPI Machine should reach the Running phase.
Additional info:
must-gather: https://drive.google.com/file/d/1LS8z8an10rggCHuSx_kWTkP0mhYxSmsA/view?usp=sharing
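
When scanning controller logs such as the capa-controller-manager output quoted in this report, it can help to pull out only the error lines and their err= text. The sketch below is an illustration under assumptions: the function name error_entries and the regular expression are hypothetical, written against the klog-style lines shown in this report, and the sample lines are abbreviated copies of them.

```python
import re

# Hypothetical pattern for klog-style error lines as they appear in this report:
#   E<mmdd> <time> <thread> <file>:<line>] "<message>" err="<error>" ...
ERR_RE = re.compile(
    r'^E\d{4}\s+\S+\s+\d+\s+(\S+):(\d+)\]\s+"([^"]*)"\s+err="([^"]*)"'
)


def error_entries(log_text):
    """Return (source_file, line_number, message, err) for each error line."""
    entries = []
    for line in log_text.splitlines():
        match = ERR_RE.match(line.strip())
        if match:
            src, lineno, message, err = match.groups()
            entries.append((src, int(lineno), message, err))
    return entries


# Abbreviated copies of the log lines quoted in this report.
sample_log = '''\
I0319 02:04:52.729568 1 awsmachine_controller.go:680] "Creating EC2 instance"
E0319 02:04:52.729637 1 awsmachine_controller.go:520] "unable to create instance" err="failed to resolve userdata: creating userdata object: requested object creation but bucket management is not enabled"
E0319 02:04:52.730014 1 controller.go:329] "Reconciler error" err="failed to resolve userdata: creating userdata object: requested object creation but bucket management is not enabled" controller="awsmachine"
'''

for src, lineno, message, err in error_entries(sample_log):
    print(f"{src}:{lineno} {message}: {err}")
```

Both error lines carry the same err text, which points at the userdata bucket management setting rather than at two separate failures.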