Bug
Resolution: Unresolved
4.21
Moderate
Description of problem:
The awsmachine goes into a re-create loop and the machine reports a sync error when securityGroups or subnet contain an id.
Version-Release number of selected component (if applicable):
4.21.0-0.nightly-2025-10-15-162146
How reproducible:
always
Steps to Reproduce:
1. Install an AWS private TechPreview cluster (its securityGroups and subnet contain an id). We use the automated template ipi-on-aws/versioned-installer-private_cluster-ci with the parameter feature_set: "TechPreviewNoUpgrade". The cluster installs successfully.
2. Observe sync errors on the machine, and observe the awsmachine going into a re-create loop:

liuhuali@Lius-MacBook-Pro huali-test % oc get machine -n openshift-machine-api -oyaml
...
- apiVersion: machine.openshift.io/v1beta1
  kind: Machine
  metadata:
    annotations:
      machine.openshift.io/instance-state: running
    creationTimestamp: "2025-10-17T04:13:43Z"
    finalizers:
    - sync.machine.openshift.io/finalizer
    - machine.machine.openshift.io
    generateName: huliu-aws1017c-zkdqq-worker-us-east-2a-
    generation: 2
    labels:
      machine.openshift.io/cluster-api-cluster: huliu-aws1017c-zkdqq
      machine.openshift.io/cluster-api-machine-role: worker
      machine.openshift.io/cluster-api-machine-type: worker
      machine.openshift.io/cluster-api-machineset: huliu-aws1017c-zkdqq-worker-us-east-2a
      machine.openshift.io/instance-type: m6i.xlarge
      machine.openshift.io/region: us-east-2
      machine.openshift.io/zone: us-east-2a
    name: huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz
    namespace: openshift-machine-api
    ownerReferences:
    - apiVersion: machine.openshift.io/v1beta1
      blockOwnerDeletion: true
      controller: true
      kind: MachineSet
      name: huliu-aws1017c-zkdqq-worker-us-east-2a
      uid: 49582657-0521-4c38-9191-a78707e3377e
    resourceVersion: "199579"
    uid: a39dadcc-d2ad-467e-888c-dd106a063ecb
  spec:
    authoritativeAPI: MachineAPI
    lifecycleHooks: {}
    metadata: {}
    providerID: aws:///us-east-2a/i-0fe7455cb532722bc
    providerSpec:
      value:
        ami:
          id: ami-082a55a580d5538ed
        apiVersion: machine.openshift.io/v1beta1
        blockDevices:
        - ebs:
            encrypted: true
            iops: 0
            kmsKey:
              arn: ""
            volumeSize: 120
            volumeType: gp3
        capacityReservationId: ""
        credentialsSecret:
          name: aws-cloud-credentials
        deviceIndex: 0
        iamInstanceProfile:
          id: huliu-aws1017c-zkdqq-worker-profile
        instanceType: m6i.xlarge
        kind: AWSMachineProviderConfig
        metadata:
          creationTimestamp: null
        metadataServiceOptions: {}
        placement:
          availabilityZone: us-east-2a
          region: us-east-2
        securityGroups:
        - filters:
          - name: tag:Name
            values:
            - huliu-aws1017c-zkdqq-node
        - filters:
          - name: tag:Name
            values:
            - huliu-aws1017c-zkdqq-lb
        - id: sg-0b5ca4c09a70e5d09
        subnet:
          id: subnet-08b46039fcd2c66bc
        tags:
        - name: kubernetes.io/cluster/huliu-aws1017c-zkdqq
          value: owned
        userDataSecret:
          name: worker-user-data
  status:
    addresses:
    - address: 10.0.50.13
      type: InternalIP
    - address: ip-10-0-50-13.us-east-2.compute.internal
      type: InternalDNS
    - address: ip-10-0-50-13.us-east-2.compute.internal
      type: Hostname
    authoritativeAPI: MachineAPI
    conditions:
    - lastTransitionTime: "2025-10-17T04:14:08Z"
      status: "True"
      type: Drainable
    - lastTransitionTime: "2025-10-17T04:14:22Z"
      status: "True"
      type: InstanceExists
    - lastTransitionTime: "2025-10-17T04:14:08Z"
      message: The AuthoritativeAPI status is set to 'MachineAPI'
      reason: AuthoritativeAPIMachineAPI
      severity: Info
      status: "False"
      type: Paused
    - lastTransitionTime: "2025-10-17T05:48:21Z"
      message: 'failed to remove finalizer for deleting Cluster API infra machine: Operation cannot be fulfilled on awsmachines.infrastructure.cluster.x-k8s.io "huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz": the object has been modified; please apply your changes to the latest version and try again'
      reason: FailedToUpdateCAPIInfraMachine
      severity: Error
      status: "False"
      type: Synchronized
    - lastTransitionTime: "2025-10-17T04:14:08Z"
      status: "True"
      type: Terminable
    lastUpdated: "2025-10-17T05:48:20Z"
    nodeRef:
      kind: Node
      name: ip-10-0-50-13.us-east-2.compute.internal
      uid: f8a71d2a-7774-42e7-9251-f68a0f7d23c9
    phase: Running
    providerStatus:
      conditions:
      - lastTransitionTime: "2025-10-17T04:14:15Z"
        message: Machine successfully created
        reason: MachineCreationSucceeded
        status: "True"
        type: MachineCreation
      instanceId: i-0fe7455cb532722bc
      instanceState: running
    synchronizedGeneration: 2
...

liuhuali@Lius-MacBook-Pro huali-test % oc get awsmachine -n openshift-cluster-api
NAME                                           CLUSTER                STATE   READY   INSTANCEID                              MACHINE
huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz   huliu-aws1017c-zkdqq                   aws:///us-east-2a/i-0fe7455cb532722bc   huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz
huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn   huliu-aws1017c-zkdqq                   aws:///us-east-2a/i-020b35563810f5fd2   huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn
huliu-aws1017c-zkdqq-worker-us-east-2b-2l8vn   huliu-aws1017c-zkdqq                   aws:///us-east-2b/i-0d1ff395da901eeb2   huliu-aws1017c-zkdqq-worker-us-east-2b-2l8vn

liuhuali@Lius-MacBook-Pro huali-test % oc get awsmachine -n openshift-cluster-api
NAME                                           CLUSTER                STATE   READY   INSTANCEID                              MACHINE
huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz   huliu-aws1017c-zkdqq                   aws:///us-east-2a/i-0fe7455cb532722bc   huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz
huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn   huliu-aws1017c-zkdqq                   aws:///us-east-2a/i-020b35563810f5fd2   huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn

liuhuali@Lius-MacBook-Pro huali-test % oc get awsmachine -n openshift-cluster-api
NAME                                           CLUSTER                STATE   READY   INSTANCEID                              MACHINE
huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn   huliu-aws1017c-zkdqq                   aws:///us-east-2a/i-020b35563810f5fd2   huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn
huliu-aws1017c-zkdqq-worker-us-east-2b-2l8vn   huliu-aws1017c-zkdqq                   aws:///us-east-2b/i-0d1ff395da901eeb2   huliu-aws1017c-zkdqq-worker-us-east-2b-2l8vn

liuhuali@Lius-MacBook-Pro huali-test % oc get awsmachine -n openshift-cluster-api
NAME                                           CLUSTER                STATE   READY   INSTANCEID                              MACHINE
huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz   huliu-aws1017c-zkdqq                   aws:///us-east-2a/i-0fe7455cb532722bc   huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz
huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn   huliu-aws1017c-zkdqq                   aws:///us-east-2a/i-020b35563810f5fd2   huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn
huliu-aws1017c-zkdqq-worker-us-east-2b-2l8vn   huliu-aws1017c-zkdqq                   aws:///us-east-2b/i-0d1ff395da901eeb2   huliu-aws1017c-zkdqq-worker-us-east-2b-2l8vn

liuhuali@Lius-MacBook-Pro huali-test % oc get awsmachine -n openshift-cluster-api
NAME                                           CLUSTER                STATE   READY   INSTANCEID                              MACHINE
huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz   huliu-aws1017c-zkdqq                   aws:///us-east-2a/i-0fe7455cb532722bc   huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz
huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn   huliu-aws1017c-zkdqq                   aws:///us-east-2a/i-020b35563810f5fd2   huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn
liuhuali@Lius-MacBook-Pro huali-test %

liuhuali@Lius-MacBook-Pro huali-test % oc logs cluster-capi-operator-78bd56b648-5llff -c machine-api-migration
...
I1017 05:31:03.795937 1 machine_sync_controller.go:816] "Successfully updated Cluster API machine" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api/huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" namespace="openshift-machine-api" name="huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" reconcileID="9f6c68e6-afbc-489d-9b6c-0326441aa48b"
I1017 05:31:03.796046 1 machine_sync_controller.go:691] "Deleting the corresponding Cluster API infra machine as it is out of date, it will be recreated" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api/huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" namespace="openshift-machine-api" name="huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" reconcileID="9f6c68e6-afbc-489d-9b6c-0326441aa48b" diff="map[.spec:[AdditionalSecurityGroups.slice[2].Filters: <nil slice> != [] Subnet.Filters: <nil slice> != []]]"
E1017 05:31:03.864591 1 machine_sync_controller.go:710] "Failed to remove finalizer for deleting Cluster API infra machine" err="Operation cannot be fulfilled on awsmachines.infrastructure.cluster.x-k8s.io \"huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz\": the object has been modified; please apply your changes to the latest version and try again" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api/huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" namespace="openshift-machine-api" name="huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" reconcileID="9f6c68e6-afbc-489d-9b6c-0326441aa48b"
E1017 05:31:03.875550 1 controller.go:347] "Reconciler error" err="unable to ensure Cluster API infra machine: failed to remove finalizer for deleting Cluster API infra machine: Operation cannot be fulfilled on awsmachines.infrastructure.cluster.x-k8s.io \"huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz\": the object has been modified; please apply your changes to the latest version and try again" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api/huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" namespace="openshift-machine-api" name="huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" reconcileID="9f6c68e6-afbc-489d-9b6c-0326441aa48b"
I1017 05:31:03.920950 1 machine_sync_controller.go:818] "No changes detected for Cluster API machine" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api/huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn" namespace="openshift-machine-api" name="huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn" reconcileID="61e8f786-b916-4787-a944-e4b46b54e0f9"
I1017 05:31:03.968823 1 machine_sync_controller.go:654] "Successfully created Cluster API infra machine" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api/huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn" namespace="openshift-machine-api" name="huliu-aws1017c-zkdqq-worker-us-east-2a-cf2zn" reconcileID="61e8f786-b916-4787-a944-e4b46b54e0f9"
I1017 05:31:03.982149 1 machine_sync_controller.go:818] "No changes detected for Cluster API machine" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api/huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" namespace="openshift-machine-api" name="huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" reconcileID="619e9d68-a4b4-44ae-b5d9-29ee7be3caf4"
I1017 05:31:03.982254 1 machine_sync_controller.go:691] "Deleting the corresponding Cluster API infra machine as it is out of date, it will be recreated" controller="MachineSyncController" controllerGroup="machine.openshift.io" controllerKind="Machine" Machine="openshift-machine-api/huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" namespace="openshift-machine-api" name="huliu-aws1017c-zkdqq-worker-us-east-2a-bj8gz" reconcileID="619e9d68-a4b4-44ae-b5d9-29ee7be3caf4" diff="map[.spec:[AdditionalSecurityGroups.slice[2].Filters: <nil slice> != [] Subnet.Filters: <nil slice> != []]]"
Actual results:
The awsmachine goes into a re-create loop and the machine reports a sync error.
Expected results:
The awsmachine should not go into a re-create loop, and the machine should sync successfully.
Additional info:
must-gather: https://drive.google.com/file/d/1-5d_8A4bDR3AogvCDVmG5GJJ6Zjx4PjI/view?usp=sharing
Found during new feature testing for https://issues.redhat.com//browse/OCPCLOUD-2709, but the issue does not seem to be related to it.