OpenShift Hive / HIVE-2360

Machine Pool scaling doesn't work for OpenStack cluster

    • Critical

      Description of problem:

      After provisioning an OpenStack cluster on 4.13.11 and on 4.14.0-rc.0, the machine pools are shown at 0, and scaling actions have no effect.

      Scaling the machine pool up to 4 does not start any provisioning on the cluster.

      Version-Release number of selected component (if applicable):

      ACM 2.9.0-DOWNSTREAM-2023-09-11-15-47-23

      How reproducible:

      Steps to Reproduce:

      1. Deploy an OpenStack cluster on 4.14.0-rc.0.
      2. Check the machine pools for the cluster.
      3. ...

      Actual results:

      Expected results:

      Additional info:

      CD (hub):

      apiVersion: hive.openshift.io/v1
      kind: ClusterDeployment
      metadata:
        annotations:
          open-cluster-management.io/user-group: c3lzdGVtOmNsdXN0ZXItYWRtaW5zLHN5c3RlbTphdXRoZW50aWNhdGVk
          open-cluster-management.io/user-identity: a3ViZTphZG1pbg==
        creationTimestamp: "2023-09-12T15:41:47Z"
        finalizers:
        - hive.openshift.io/deprovision
        generation: 3
        labels:
          cloud: OpenStack
          cluster.open-cluster-management.io/clusterset: default
          hive.openshift.io/cluster-platform: openstack
          hive.openshift.io/cluster-region: unknown
          hive.openshift.io/version: 4.14.0-rc.0
          hive.openshift.io/version-major: "4"
          hive.openshift.io/version-major-minor: "4.14"
          hive.openshift.io/version-major-minor-patch: 4.14.0
          vendor: OpenShift
        name: clc-auto-psi
        namespace: clc-auto-psi
        resourceVersion: "803680"
        uid: af6af255-ff7c-42d1-812d-7717eb9b40d5
      spec:
        baseDomain: dev09.red-chesterfield.com
        clusterMetadata:
          adminKubeconfigSecretRef:
            name: clc-auto-psi-0-nr48s-admin-kubeconfig
          adminPasswordSecretRef:
            name: clc-auto-psi-0-nr48s-admin-password
          clusterID: be7d71b6-f1f1-4d60-a20c-fefb7f45e157
          infraID: clc-auto-psi-cz4qp
        clusterName: clc-auto-psi
        controlPlaneConfig:
          servingCertificates: {}
        installAttemptsLimit: 1
        installed: true
        platform:
          openstack:
            certificatesSecretRef:
              name: clc-auto-psi-openstack-trust
            cloud: openstack
            credentialsSecretRef:
              name: clc-auto-psi-openstack-creds
        provisioning:
          imageSetRef:
            name: img4.14.0-rc.0-multi
          installConfigSecretRef:
            name: clc-auto-psi-install-config
          sshPrivateKeySecretRef:
            name: clc-auto-psi-ssh-private-key
        pullSecretRef:
          name: clc-auto-psi-pull-secret
      status:
        apiURL: https://api.clc-auto-psi.dev09.red-chesterfield.com:6443
        cliImage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:c6fde16873a3def595063f2ae2a7ea786207d548fae3f4a174aab181cfd8207c
        conditions:
        - lastProbeTime: "2023-09-12T16:21:43Z"
          lastTransitionTime: "2023-09-12T16:21:43Z"
          message: 'Unsupported platform: no actuator to handle it'
          reason: Unsupported
          status: "False"
          type: Hibernating
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Platform credentials passed authentication check
          reason: PlatformAuthSuccess
          status: "False"
          type: AuthenticationFailure
        - lastProbeTime: "2023-09-12T16:21:43Z"
          lastTransitionTime: "2023-09-12T16:21:43Z"
          message: Control plane certificates are present
          reason: ControlPlaneCertificatesFound
          status: "False"
          type: ControlPlaneCertificateNotFound
        - lastProbeTime: "2023-09-12T15:42:00Z"
          lastTransitionTime: "2023-09-12T15:42:00Z"
          message: Images required for cluster deployment installations are resolved
          reason: ImagesResolved
          status: "False"
          type: InstallImagesNotResolved
        - lastProbeTime: "2023-09-12T15:42:24Z"
          lastTransitionTime: "2023-09-12T15:42:24Z"
          message: Successfully launched install pod
          reason: InstallLaunchSuccessful
          status: "False"
          type: InstallLaunchError
        - lastProbeTime: "2023-09-12T15:41:56Z"
          lastTransitionTime: "2023-09-12T15:41:56Z"
          message: InstallerImage is resolved.
          reason: InstallerImageResolved
          status: "False"
          type: InstallerImageResolutionFailed
        - lastProbeTime: "2023-09-12T16:21:43Z"
          lastTransitionTime: "2023-09-12T16:21:43Z"
          message: Provision clc-auto-psi-0-nr48s succeeded.
          reason: ProvisionSucceeded
          status: "False"
          type: ProvisionFailed
        - lastProbeTime: "2023-09-12T15:42:00Z"
          lastTransitionTime: "2023-09-12T15:42:00Z"
          message: Provision is not stopped
          reason: ProvisionNotStopped
          status: "False"
          type: ProvisionStopped
        - lastProbeTime: "2023-09-12T16:21:43Z"
          lastTransitionTime: "2023-09-12T16:21:43Z"
          message: Cluster is provisioned
          reason: Provisioned
          status: "True"
          type: Provisioned
        - lastProbeTime: "2023-09-12T16:21:43Z"
          lastTransitionTime: "2023-09-12T16:21:43Z"
          message: No power state actuator -- assuming running
          reason: Running
          status: "True"
          type: Ready
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: no ClusterRelocates match
          reason: NoMatchingRelocates
          status: "False"
          type: RelocationFailed
        - lastProbeTime: "2023-09-12T15:42:00Z"
          lastTransitionTime: "2023-09-12T15:42:00Z"
          message: All pre-provision requirements met
          reason: AllRequirementsMet
          status: "True"
          type: RequirementsMet
        - lastProbeTime: "2023-09-12T16:21:45Z"
          lastTransitionTime: "2023-09-12T16:21:45Z"
          message: SyncSet apply is successful
          reason: SyncSetApplySuccess
          status: "False"
          type: SyncSetFailed
        - lastProbeTime: "2023-09-12T18:21:43Z"
          lastTransitionTime: "2023-09-12T16:21:43Z"
          message: cluster is reachable
          reason: ClusterReachable
          status: "False"
          type: Unreachable
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: AWSPrivateLinkFailed
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: AWSPrivateLinkReady
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: ActiveAPIURLOverride
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: ClusterInstallCompleted
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: ClusterInstallFailed
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: ClusterInstallRequirementsMet
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: ClusterInstallStopped
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: DNSNotReady
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: DeprovisionLaunchError
        - lastProbeTime: "2023-09-12T15:41:48Z"
          lastTransitionTime: "2023-09-12T15:41:48Z"
          message: Condition Initialized
          reason: Initialized
          status: Unknown
          type: IngressCertificateNotFound
        installStartedTimestamp: "2023-09-12T15:42:00Z"
        installVersion: 4.14.0-rc.0
        installedTimestamp: "2023-09-12T16:21:43Z"
        installerImage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e4aa8f7b1caf1a4674d463e5d96987711cd77d9e83f33b912b02441b2cc15d13
        powerState: Running
        provisionRef:
          name: clc-auto-psi-0-nr48s
        webConsoleURL: https://console-openshift-console.apps.clc-auto-psi.dev09.red-chesterfield.com

      MachinePool (hub):

      apiVersion: v1
      items:
      - apiVersion: hive.openshift.io/v1
        kind: MachinePool
        metadata:
          creationTimestamp: "2023-09-12T15:41:47Z"
          finalizers:
          - hive.openshift.io/remotemachineset
          generation: 2
          name: clc-auto-psi-worker
          namespace: clc-auto-psi
          resourceVersion: "697731"
          uid: c3f267be-1996-41ec-b3c8-8eaffd130215
        spec:
          clusterDeploymentRef:
            name: clc-auto-psi
          name: worker
          platform:
            openstack:
              flavor: ocp-master-large
          replicas: 4
        status:
          conditions:
          - lastProbeTime: "2023-09-12T15:41:47Z"
            lastTransitionTime: "2023-09-12T15:41:47Z"
            message: Condition Initialized
            reason: Initialized
            status: Unknown
            type: NotEnoughReplicas
          - lastProbeTime: "2023-09-12T15:41:47Z"
            lastTransitionTime: "2023-09-12T15:41:47Z"
            message: Condition Initialized
            reason: Initialized
            status: Unknown
            type: NoMachinePoolNameLeasesAvailable
          - lastProbeTime: "2023-09-12T15:41:47Z"
            lastTransitionTime: "2023-09-12T15:41:47Z"
            message: Condition Initialized
            reason: Initialized
            status: Unknown
            type: InvalidSubnets
          - lastProbeTime: "2023-09-12T15:41:47Z"
            lastTransitionTime: "2023-09-12T15:41:47Z"
            message: Condition Initialized
            reason: Initialized
            status: Unknown
            type: UnsupportedConfiguration
      kind: List
      metadata:
        resourceVersion: ""
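
In the MachinePool above, every condition still carries its initial value (reason `Initialized`, status `Unknown`) even though `spec.replicas` is 4. That pattern may indicate that the pool was never fully reconciled, although the dump alone does not prove it. The following is a minimal sketch of how the pattern could be detected, assuming the object has been loaded into a Python dict (for example from JSON output). The helper name `unreconciled_conditions` is hypothetical and is not part of Hive.

```python
# Trimmed copy of the MachinePool from the report; only the fields used below.
machine_pool = {
    "spec": {"name": "worker", "replicas": 4},
    "status": {
        "conditions": [
            {"type": "NotEnoughReplicas", "reason": "Initialized", "status": "Unknown"},
            {"type": "NoMachinePoolNameLeasesAvailable", "reason": "Initialized", "status": "Unknown"},
            {"type": "InvalidSubnets", "reason": "Initialized", "status": "Unknown"},
            {"type": "UnsupportedConfiguration", "reason": "Initialized", "status": "Unknown"},
        ]
    },
}


def unreconciled_conditions(pool):
    """Return the types of conditions that still hold their initial value.

    Hypothetical helper: treats status "Unknown" together with reason
    "Initialized" as "never updated since creation".
    """
    conditions = pool.get("status", {}).get("conditions", [])
    return [
        c["type"]
        for c in conditions
        if c.get("status") == "Unknown" and c.get("reason") == "Initialized"
    ]


stale = unreconciled_conditions(machine_pool)
total = len(machine_pool["status"]["conditions"])
print(f"{len(stale)} of {total} conditions still at their initial value: {stale}")
```

If every condition is reported as stale while the desired replica count has already been changed, the hive-controllers log is the next place to look for errors related to this pool.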

      MachineSet (cluster):

      NAME                          DESIRED   CURRENT   READY   AVAILABLE   AGE
      clc-auto-psi-cz4qp-worker-0   3         3         3       3           179m
      
      apiVersion: v1
      items:
      - apiVersion: machine.openshift.io/v1beta1
        kind: MachineSet
        metadata:
          annotations:
            machine.openshift.io/memoryMb: "16384"
            machine.openshift.io/vCPU: "16"
          creationTimestamp: "2023-09-12T15:52:02Z"
          generation: 1
          labels:
            machine.openshift.io/cluster-api-cluster: clc-auto-psi-cz4qp
            machine.openshift.io/cluster-api-machine-role: worker
            machine.openshift.io/cluster-api-machine-type: worker
          name: clc-auto-psi-cz4qp-worker-0
          namespace: openshift-machine-api
          resourceVersion: "31044"
          uid: 79555ee5-a675-4007-81c8-c70452c192a5
        spec:
          replicas: 3
          selector:
            matchLabels:
              machine.openshift.io/cluster-api-cluster: clc-auto-psi-cz4qp
              machine.openshift.io/cluster-api-machineset: clc-auto-psi-cz4qp-worker-0
          template:
            metadata:
              labels:
                machine.openshift.io/cluster-api-cluster: clc-auto-psi-cz4qp
                machine.openshift.io/cluster-api-machine-role: worker
                machine.openshift.io/cluster-api-machine-type: worker
                machine.openshift.io/cluster-api-machineset: clc-auto-psi-cz4qp-worker-0
            spec:
              lifecycleHooks: {}
              metadata: {}
              providerSpec:
                value:
                  apiVersion: machine.openshift.io/v1alpha1
                  cloudName: openstack
                  cloudsSecret:
                    name: openstack-cloud-credentials
                    namespace: openshift-machine-api
                  flavor: ocp-master-large
                  image: clc-auto-psi-cz4qp-rhcos
                  kind: OpenstackProviderSpec
                  metadata:
                    creationTimestamp: null
                  networks:
                  - filter: {}
                    subnets:
                    - filter:
                        name: clc-auto-psi-cz4qp-nodes
                        tags: openshiftClusterID=clc-auto-psi-cz4qp
                  securityGroups:
                  - filter: {}
                    name: clc-auto-psi-cz4qp-worker
                  serverGroupName: clc-auto-psi-cz4qp-worker
                  serverMetadata:
                    Name: clc-auto-psi-cz4qp-worker
                    openshiftClusterID: clc-auto-psi-cz4qp
                  tags:
                  - openshiftClusterID=clc-auto-psi-cz4qp
                  trunk: true
                  userDataSecret:
                    name: worker-user-data
        status:
          availableReplicas: 3
          fullyLabeledReplicas: 3
          observedGeneration: 1
          readyReplicas: 3
          replicas: 3
      kind: List
      metadata:
        resourceVersion: ""

      Machine (cluster):

      oc get machines.machine.openshift.io -n openshift-machine-api
      NAME                                PHASE     TYPE               REGION      ZONE   AGE
      clc-auto-psi-cz4qp-master-0         Running   ocp-master-large   regionOne   nova   3h
      clc-auto-psi-cz4qp-master-1         Running   ocp-master-large   regionOne   nova   3h
      clc-auto-psi-cz4qp-master-2         Running   ocp-master-large   regionOne   nova   3h
      clc-auto-psi-cz4qp-worker-0-8hl6f   Running   ocp-master-large   regionOne   nova   171m
      clc-auto-psi-cz4qp-worker-0-g48lm   Running   ocp-master-large   regionOne   nova   171m
      clc-auto-psi-cz4qp-worker-0-rbfz9   Running   ocp-master-large   regionOne   nova   171m
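
Putting the hub and cluster values side by side makes the mismatch explicit: the MachinePool on the hub requests 4 replicas, while the MachineSet on the cluster still specifies 3 and three worker Machines are running. The sketch below recomputes that difference from values copied out of the dumps above. The function names `count_workers` and `replica_drift` are hypothetical helpers introduced only for illustration.

```python
# Values copied from the report above.
machine_pool_replicas = 4  # MachinePool spec.replicas on the hub
machine_sets = {"clc-auto-psi-cz4qp-worker-0": 3}  # MachineSet spec.replicas on the cluster

machines_table = """\
NAME                                PHASE     TYPE               REGION      ZONE   AGE
clc-auto-psi-cz4qp-master-0         Running   ocp-master-large   regionOne   nova   3h
clc-auto-psi-cz4qp-master-1         Running   ocp-master-large   regionOne   nova   3h
clc-auto-psi-cz4qp-master-2         Running   ocp-master-large   regionOne   nova   3h
clc-auto-psi-cz4qp-worker-0-8hl6f   Running   ocp-master-large   regionOne   nova   171m
clc-auto-psi-cz4qp-worker-0-g48lm   Running   ocp-master-large   regionOne   nova   171m
clc-auto-psi-cz4qp-worker-0-rbfz9   Running   ocp-master-large   regionOne   nova   171m
"""


def count_workers(table, infra_id):
    """Count rows whose machine name starts with '<infra_id>-worker-'."""
    prefix = f"{infra_id}-worker-"
    rows = table.strip().splitlines()[1:]  # skip the header row
    return sum(1 for row in rows if row.split()[0].startswith(prefix))


def replica_drift(desired, sets):
    """Desired replicas on the hub minus the replicas the MachineSets specify."""
    return desired - sum(sets.values())


workers = count_workers(machines_table, "clc-auto-psi-cz4qp")
drift = replica_drift(machine_pool_replicas, machine_sets)
print(f"worker machines running: {workers}")
print(f"replicas requested on hub but not reflected in MachineSets: {drift}")
```

A non-zero drift that persists after the hub-side change indicates that the new replica count has not been propagated to the cluster's MachineSet, which matches the behavior described in this report.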

        1. hive-controllers-78f7d6666-brp96-manager.log
          2.03 MB
          David Huynh
        2. image (1).png
          103 kB
          Atif Shafi
        3. image-2023-09-12-09-26-52-416.png
          42 kB
          David Huynh
        4. image-2023-09-12-09-27-31-597.png
          56 kB
          David Huynh
        5. image-2023-09-12-09-27-56-320.png
          113 kB
          David Huynh
        6. image-2023-10-03-15-12-49-442.png
          41 kB
          David Huynh
        7. image-2023-11-01-14-55-07-637.png
          108 kB
          Atif Shafi
        8. image-2023-11-13-11-10-01-022.png
          310 kB
          Atif Shafi
        9. image-2023-11-14-13-09-01-726.png
          390 kB
          Atif Shafi
        10. image-2023-11-14-13-11-03-473.png
          318 kB
          Atif Shafi
        11. mihaung1314-invalidCA.yaml
          7 kB
          Mingxia Huang
        12. mihuang1041-3-2h67k-provision-sc47s.log
          451 kB
          Mingxia Huang
        13. mihuang1413-0-4d8p6-provision-56tk6-invalidCA.log
          5 kB
          Mingxia Huang
        14. mihuang1413-0-4d8p6-provision-56tk6-invalidCA-1.log
          5 kB
          Mingxia Huang
        15. mihuang1413-0-zs2q5-provision-l2tjs-error-account.log
          5 kB
          Mingxia Huang
        16. mihuang1413-0-zs2q5-provision-l2tjs-error-account.yaml
          5 kB
          Mingxia Huang
        17. mihuang1413-error-account.yaml
          7 kB
          Mingxia Huang
        18. mihuang934-0-kmqjx-provision-brr2t-error-cert.log
          5 kB
          Mingxia Huang
        19. mihuang934-cd-conditions.yaml
          7 kB
          Mingxia Huang
        20. mihuang934-machinepool-worker-conditions.yaml
          2 kB
          Mingxia Huang
        21. mihuang-error-cert.log
          5 kB
          Mingxia Huang
        22. mihuang-error-cert-1.log
          5 kB
          Mingxia Huang
        23. mihuang-error-cert-2.log
          5 kB
          Mingxia Huang
        24. no-cert-clc-auto-psi-0-9zcvf-provision-r2kt2-hive.log
          299 kB
          Atif Shafi
        25. with-cert-clc-auto-psi-0-2hng2-provision-69hwg-hive.log
          296 kB
          Atif Shafi

              Unassigned Unassigned
              rhn-support-dhuynh David Huynh
              Mingxia Huang Mingxia Huang
              ACM QE Team