Bug
Resolution: Unresolved
Major
None
4.16
Description of problem:
In Azure, in the CentralUSEUAP region, worker machines fail to be created when creating an OCP cluster (this applies to ARO too). Judging from the error message, the underlying Availability Set creation fails with:
AvailabilitySet "<somethingsomething>" with platformFaultDomainCount = 1 can only support platformUpdateDomainCount = 1
This error is related to https://issues.redhat.com/browse/OCPBUGS-45663. As I understand the MAPI code at
https://github.com/openshift/machine-api-provider-azure/blob/5a6516188d4ec33734e1a069da2acc7a469657dc/pkg/cloud/azure/services/availabilitysets/availabilitysets.go#L48
the fix for OCPBUGS-45663 now computes platformFaultDomainCount dynamically, which yields 1 for this special region. However, platformUpdateDomainCount is still hardcoded to 5, which appears to be incompatible with platformFaultDomainCount set to 1 (Azure seems to accept only platformUpdateDomainCount = 1 in that case).
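To illustrate the kind of change that might address this, here is a minimal sketch that derives the update domain count from the fault domain count instead of hardcoding it. The helper name updateDomainCountFor and the constant defaultUpdateDomainCount are hypothetical and are not taken from the MAPI code. The rule "fault domain count of 1 implies update domain count of 1" is inferred only from the Azure error message above, and the default of 5 is the hardcoded value described in this report.

```go
package main

import "fmt"

// defaultUpdateDomainCount mirrors the hardcoded value of 5 described in this
// report. The constant name is hypothetical.
const defaultUpdateDomainCount int32 = 5

// updateDomainCountFor is a hypothetical helper, not part of the MAPI code.
// Assumption (inferred from the Azure error message only): an availability set
// with platformFaultDomainCount = 1 accepts only platformUpdateDomainCount = 1.
// For any larger fault domain count, the existing default is kept.
func updateDomainCountFor(faultDomainCount int32) int32 {
	if faultDomainCount <= 1 {
		return 1
	}
	return defaultUpdateDomainCount
}

func main() {
	// Print the update domain count that would be requested for a few
	// fault domain counts.
	for _, fd := range []int32{1, 2, 3} {
		fmt.Printf("platformFaultDomainCount=%d -> platformUpdateDomainCount=%d\n",
			fd, updateDomainCountFor(fd))
	}
}
```

Whether Azure enforces other constraints between the two values is not covered here; the sketch only handles the single case reported by the error.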
Version-Release number of selected component (if applicable):
observed 4.16, 4.17, 4.18
How reproducible:
systematic
Steps to Reproduce:
1. Create an OCP cluster on Azure (or an ARO cluster) in CentralUSEUAP, using any version that contains the fix for https://issues.redhat.com/browse/OCPBUGS-45663.
2. Worker Machine creation fails.
Actual results:
MAPI does not create the underlying worker VM. The error reads: AvailabilitySet "<somethingsomething>" with platformFaultDomainCount = 1 can only support platformUpdateDomainCount = 1
Expected results:
Worker VMs are created and the Machines reach the Running phase.
Additional info:
I am not certain whether the incompatibility between platformFaultDomainCount = 1 and platformUpdateDomainCount = 5 is a recent change on the Azure side or has always been there.