- Bug
- Resolution: Unresolved
- Undefined
- None
- 4.17
- Moderate
- None
- False
Description of problem:
A CAPI machine in a Local Zone is stuck in Pending, and the CAPA controller log reports “parameter groupId is invalid. The value cannot be empty”.
Version-Release number of selected component (if applicable):
4.17.0-0.nightly-2024-09-26-185948
How reproducible:
Always
Steps to Reproduce:
1. Install an AWS Local Zone or Wavelength Zone cluster. We have automated templates for this:
    versioned-installer-customer_vpc-ovn-local_zone
    versioned-installer-customer_vpc-ovn-local_zone-ci
    versioned-installer-customer_vpc-ovn-local_zone_day2
    versioned-installer-customer_vpc-ovn-wavelength_zone
    versioned-installer-customer_vpc-ovn-wavelength_zone_day2
   with feature_set: "TechPreviewNoUpgrade".
   Here I used the Prow job aws-ipi-localzone-byo-subnet-ovn-day2-f28-destructive and then enabled the feature gate.

    liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
    NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.17.0-0.nightly-2024-09-26-185948   True        False         130m    Cluster version is 4.17.0-0.nightly-2024-09-26-185948

2. Create the cluster.

    liuhuali@Lius-MacBook-Pro huali-test % oc create -f my-cluster926.yaml
    cluster.cluster.x-k8s.io/ci-op-rwbcqck1-1e8db-4wgn9 created
    liuhuali@Lius-MacBook-Pro huali-test % cat my-cluster926.yaml
    apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    metadata:
      name: ci-op-rwbcqck1-1e8db-4wgn9
      namespace: openshift-cluster-api
    spec:
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AWSCluster
        name: ci-op-rwbcqck1-1e8db-4wgn9
        namespace: openshift-cluster-api

3. Edit the AWSCluster to add subnets under network.

    network:
      subnets:
      - id: subnet-00b4125d7daff7135
        isPublic: true

4. Create the AWSMachineTemplate.

    liuhuali@Lius-MacBook-Pro huali-test % oc create -f awsmachinetemplate926.yaml
    awsmachinetemplate.infrastructure.cluster.x-k8s.io/aws-machinetemplate created
    liuhuali@Lius-MacBook-Pro huali-test % oc get awsmachinetemplate
    NAME                  AGE
    aws-machinetemplate   8s
    liuhuali@Lius-MacBook-Pro huali-test % cat awsmachinetemplate926.yaml
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: AWSMachineTemplate
    metadata:
      name: aws-machinetemplate
      namespace: openshift-cluster-api
    spec:
      template:
        spec:
          uncompressedUserData: true
          iamInstanceProfile: ci-op-rwbcqck1-1e8db-4wgn9-worker-profile
          instanceType: c5.2xlarge
          failureDomain: ap-northeast-1-tpe-1a
          ignition:
            storageType: UnencryptedUserData
            version: "3.2"
          ami:
            id: ami-0d7d4b329e5403cfb
          additionalSecurityGroups:
          - filters:
            - name: tag:Name
              values:
              - ci-op-rwbcqck1-1e8db-4wgn9-node
          - filters:
            - name: tag:Name
              values:
              - ci-op-rwbcqck1-1e8db-4wgn9-lb
          subnet:
            id: subnet-00b4125d7daff7135
          publicIP: true

5. Create the CAPI MachineSet.

    liuhuali@Lius-MacBook-Pro huali-test % oc create -f capimachineset926.yaml
    machineset.cluster.x-k8s.io/capi-machineset created
    liuhuali@Lius-MacBook-Pro huali-test % cat capimachineset926.yaml
    apiVersion: cluster.x-k8s.io/v1beta1
    kind: MachineSet
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: ci-op-rwbcqck1-1e8db-4wgn9
      name: capi-machineset
      namespace: openshift-cluster-api
    spec:
      clusterName: ci-op-rwbcqck1-1e8db-4wgn9
      deletePolicy: Newest
      replicas: 1
      selector:
        matchLabels:
          cluster.x-k8s.io/cluster-name: ci-op-rwbcqck1-1e8db-4wgn9
          machine.openshift.io/cluster-api-cluster: ci-op-rwbcqck1-1e8db-4wgn9
      template:
        metadata:
          labels:
            cluster.x-k8s.io/cluster-name: ci-op-rwbcqck1-1e8db-4wgn9
            machine.openshift.io/cluster-api-cluster: ci-op-rwbcqck1-1e8db-4wgn9
        spec:
          bootstrap:
            dataSecretName: worker-user-data
          clusterName: ci-op-rwbcqck1-1e8db-4wgn9
          infrastructureRef:
            apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
            kind: AWSMachineTemplate
            name: aws-machinetemplate

6. The machine is stuck in Pending.

    liuhuali@Lius-MacBook-Pro huali-test % oc get machine.c
    NAME                    CLUSTER                      NODENAME   PROVIDERID   PHASE     AGE   VERSION
    capi-machineset-fwkl5   ci-op-rwbcqck1-1e8db-4wgn9                           Pending   15m
    liuhuali@Lius-MacBook-Pro huali-test % oc logs capa-controller-manager-757bc857fc-dk2lt
    …
    I0927 07:51:40.189091 1 awsmachine_controller.go:710] "Creating EC2 instance"
    E0927 07:51:40.537149 1 awsmachine_controller.go:529] "unable to create instance" err=<
        failed to create AWSMachine instance: failed to run instance: InvalidParameterValue: Value () for parameter groupId is invalid. The value cannot be empty
        status code: 400, request id: b998d3c9-960e-497e-af7b-631b20e37c38
     >
    E0927 07:51:40.554513 1 controller.go:329] "Reconciler error" err=<
        failed to create AWSMachine instance: failed to run instance: InvalidParameterValue: Value () for parameter groupId is invalid. The value cannot be empty
        status code: 400, request id: b998d3c9-960e-497e-af7b-631b20e37c38
     > controller="awsmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AWSMachine" AWSMachine="openshift-cluster-api/capi-machineset-fwkl5" namespace="openshift-cluster-api" name="capi-machineset-fwkl5" reconcileID="3e65be90-0974-4102-9ec5-37716639c39b"

I am not sure what groupId refers to; I could not find it in the AWSCluster or AWSMachineTemplate CRDs.
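
For triage, the EC2 error embedded in the CAPA log can be pulled apart into its fields (error code, parameter name, rejected value, request id). The snippet below is only a sketch: the regular expression is fitted to the single log line quoted above, and the names `EC2_ERROR` and `parse_ec2_error` are hypothetical helpers, not part of CAPA or the AWS SDK. One unconfirmed assumption is that groupId here refers to an EC2 security group ID, in which case the empty value would suggest that an empty security group ID was passed in the RunInstances request.

```python
import re

# Pattern fitted to the error text quoted in this report; other EC2 errors
# may be worded differently, so treat this as an illustration only.
EC2_ERROR = re.compile(
    r"(?P<code>[A-Za-z]+): Value \((?P<value>[^)]*)\) "
    r"for parameter (?P<param>\w+) is invalid\."
    r"\s*(?P<reason>.*?)\s*status code: (?P<status>\d+), "
    r"request id: (?P<request_id>[0-9a-f-]+)",
    re.DOTALL,
)


def parse_ec2_error(log_text):
    """Return the fields of the first matching EC2 error, or None."""
    match = EC2_ERROR.search(log_text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    return fields


# Log line copied from the capa-controller-manager output in step 6.
sample = (
    'E0927 07:51:40.537149 1 awsmachine_controller.go:529] '
    '"unable to create instance" err=< '
    "failed to create AWSMachine instance: failed to run instance: "
    "InvalidParameterValue: Value () for parameter groupId is invalid. "
    "The value cannot be empty status code: 400, "
    "request id: b998d3c9-960e-497e-af7b-631b20e37c38 >"
)

info = parse_ec2_error(sample)
print(info)
```

An empty `value` together with `param == "groupId"` is the signature of this bug, so the same check could be used to filter a must-gather for affected machines.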
Actual results:
The CAPI machine in the Local Zone stays stuck in Pending.
Expected results:
The CAPI machine in the Local Zone should reach the Running phase.
Additional info:
Must-gather: https://drive.google.com/file/d/1jiNMfB1FfGdDjHoS05zQS7nFGQOpNBaP/view?usp=sharing