  1. OpenShift Bugs
  2. OCPBUGS-4405

Azure UPI installation failed to scale up worker nodes using machinesets


Details

    • Moderate
    • 2
    • Sprint 229
    • 1
    • False

    Description

      Description of problem:
      Cannot scale up worker nodes after deploying an OCP 4.11.1 cluster via UPI on Azure.

      5h2m Warning FailedCreate machine/pokus-2knkh-worker-northeurope1-f6kc4 InvalidConfiguration: failed to reconcile machine "pokus-2knkh-worker-northeurope1-f6kc4": failed to create vm pokus-2knkh-worker-northeurope1-f6kc4: failure sending request for machine pokus-2knkh-worker-northeurope1-f6kc4: cannot create vm: compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=404 - Original Error: Code="NotFound" Message="The Image '/subscriptions/e639e479-2737-4b3d-b338-f1928f6429a1/resourceGroups/mlpipe-2163-azpln-rg/providers/Microsoft.Compute/images/pokus-2knkh-gen2' cannot be found in 'northeurope' region."
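
      To see exactly which image the Machine API requested from Azure, the resource ID in the event can be split into its parts. The following is a minimal Python sketch, not part of the original report: the helper name `parse_image_id` is hypothetical, and the resource ID is copied verbatim from the event above.

```python
import re

# Resource ID copied verbatim from the FailedCreate event.
IMAGE_ID = (
    "/subscriptions/e639e479-2737-4b3d-b338-f1928f6429a1"
    "/resourceGroups/mlpipe-2163-azpln-rg"
    "/providers/Microsoft.Compute/images/pokus-2knkh-gen2"
)

# Pattern for an Azure managed-image resource ID (subscription, group, image).
_PATTERN = re.compile(
    r"^/subscriptions/(?P<subscription>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.Compute/images/(?P<image>[^/]+)$"
)


def parse_image_id(resource_id: str) -> dict:
    """Hypothetical helper: split a managed-image resource ID into fields."""
    match = _PATTERN.match(resource_id)
    if match is None:
        raise ValueError(f"not a managed image resource ID: {resource_id!r}")
    return match.groupdict()


parts = parse_image_id(IMAGE_ID)
print(parts["resource_group"])  # resource group the image is looked up in
print(parts["image"])           # image name the machine expects to exist
```

      The image name extracted here ends in -gen2, so VM creation can only succeed if an image with exactly that name exists in the listed resource group.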
      

      The customer wants the installer to create machine sets as part of the initial installation; therefore, the Kubernetes manifest files that define the worker machines were not removed during the installation.

      Highlights:
      Could you please help verify whether these are the correct steps to have the initial installation create and manage the worker machines?
      Is there an explanation of how changing the image name to -gen2 in [concat(parameters('baseName'),'-gen2')] in the 02_storage.json template can resolve the problem?
      Version-Release number of selected component (if applicable):

      Environment:
      OCP 4.11.1 UPI install on Azure using ARM
      VM size:
      bootstrap: Standard_D4s_v3
      master: Standard_D4s_v3

      How reproducible:
      Always

      Steps to Reproduce:
      Follow the steps described in the document "Installing a cluster on Azure using ARM templates".

      In the install-config.yaml, worker replicas were set to 0

      compute:
      - architecture: amd64
        hyperthreading: Enabled
        name: worker
        platform: {}
        replicas: 3   
      controlPlane:
        architecture: amd64
        hyperthreading: Enabled
        name: master
        platform: {}
        replicas: 3
      

      After creating the manifests as described in the step "Creating the Kubernetes manifest and Ignition config files", only the control plane machine manifests were removed; the worker machine manifests remained untouched.

      After three masters and three worker nodes were created by the ARM templates, additional workers were added using machine sets with the command:

      oc scale --replicas=1 machineset cluster-g7rzv-worker-francecentral1 -n openshift-machine-api

      Actual results:
      No additional node is visible in `oc get nodes`, and the following error occurs:

      5h2m Warning FailedCreate machine/pokus-2knkh-worker-northeurope1-f6kc4 InvalidConfiguration: failed to reconcile machine "pokus-2knkh-worker-northeurope1-f6kc4": failed to create vm pokus-2knkh-worker-northeurope1-f6kc4: failure sending request for machine pokus-2knkh-worker-northeurope1-f6kc4: cannot create vm: compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=404 - Original Error: Code="NotFound" Message="The Image '/subscriptions/e639e479-2737-4b3d-b338-f1928f6429a1/resourceGroups/mlpipe-2163-azpln-rg/providers/Microsoft.Compute/images/pokus-2knkh-gen2' cannot be found in 'northeurope' region."
      

      The customer found that this can be resolved by changing -image to -gen2, giving [concat(parameters('baseName'),'-gen2')], in the 02_storage.json template.
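
      The name mismatch behind this workaround can be sketched in Python. This is an illustration under stated assumptions, not a confirmed root cause: it assumes the machine references an image named <baseName>-gen2 (as the error message shows), that the unmodified template names the image <baseName>-image (as the customer's change implies), and that baseName is pokus-2knkh. The function `arm_concat` is a hypothetical stand-in for the ARM template concat() function, not part of any SDK.

```python
def arm_concat(*parts: str) -> str:
    """Hypothetical stand-in for the ARM template concat() string function."""
    return "".join(parts)


# Assumed baseName parameter, inferred from the names in the error message.
base_name = "pokus-2knkh"

# Image name the machine asked Azure for, per the FailedCreate event.
requested = "pokus-2knkh-gen2"

# Assumed original expression: concat(parameters('baseName'), '-image')
original = arm_concat(base_name, "-image")

# Expression after the customer's change: concat(parameters('baseName'), '-gen2')
patched = arm_concat(base_name, "-gen2")

print(f"original template image: {original}, matches: {original == requested}")
print(f"patched template image:  {patched}, matches: {patched == requested}")
```

      Under these assumptions, only the patched expression produces the image name that the machine requests, which would be consistent with the 404 disappearing after the change.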

      Expected results:
      The installer should be able to create and manage machine sets.

      Additional info:
      SFDC case #03304526

      Slack discussion: this might be due to the MAO not being able to support UPI on Azure (Thread1, Thread2).

          People

            rdossant Rafael Fonseca dos Santos
            rhn-support-kwwong Jaime Wong
            May Xu May Xu