OpenShift Hive / HIVE-1343

ClusterClaim Getting Stuck Despite Installed ClusterDeployment


    • Type: Bug
    • Resolution: Obsolete
    • Priority: Major
    • Quality / Stability / Reliability
    • Hive Sprint 197

      Reported by ACM teams in https://coreos.slack.com/archives/CE3ETN3J8/p1610374927110400

      The ClusterClaim creates a ClusterDeployment, which installs successfully, but the ClusterClaim never shows a Pending status and never reports that the cluster is ready. The cluster is not hibernated, nor is it cleaned up after the lifetime set on the claim has elapsed.

      apiVersion: hive.openshift.io/v1                         
      kind: ClusterClaim                         
      metadata:                           
        creationTimestamp: "2021-01-11T13:40:06Z"
        finalizers:                                                                                                      
        - hive.openshift.io/claim                                                                                                                                                                                                           
        generation: 3      
        name: dhaiduce-grc-cp-v460                                                                                                                                                                                                          
        namespace: acm-grc-security   
        resourceVersion: "71339652"                            
        selfLink: /apis/hive.openshift.io/v1/namespaces/acm-grc-security/clusterclaims/dhaiduce-grc-cp-v460                         
        uid: f5b7b541-8446-4624-b29f-7f272dfbfc5c
      spec:                                                    
        clusterPoolName: grc-cp-v460                                                                                                                                                                                                        
        lifetime: 8h55m                                                                                                  
        namespace: grc-cp-v460-j292h                                 
        subjects:                                                    
        - apiGroup: rbac.authorization.k8s.io                        
          kind: Group                                                
          name: policy-grc             
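
As a rough check of the timeline, the sketch below works out when the claim's 8h55m lifetime would have run out and compares that with the timestamp of the controller log entries quoted later in this report. It assumes the lifetime is counted from the claim's creationTimestamp, which may not match how Hive actually measures it. The helpers parse_go_duration and parse_ts are illustrative names, not Hive code.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_go_duration(text: str) -> timedelta:
    """Parse a simple Go-style duration such as '8h55m' (h, m, s units only)."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject anything the simple pattern did not fully consume.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    units = {"h": "hours", "m": "minutes", "s": "seconds"}
    return sum((timedelta(**{units[u]: int(n)}) for n, u in parts), timedelta())


def parse_ts(text: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp without fractional seconds."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# Values copied from the ClusterClaim above and the controller log below.
created = parse_ts("2021-01-11T13:40:06Z")   # metadata.creationTimestamp
lifetime = parse_go_duration("8h55m")        # spec.lifetime
observed = parse_ts("2021-01-14T12:42:45Z")  # log entry time, truncated to seconds

# Assumption: the lifetime starts at creation time.
expiry = created + lifetime
overdue = observed - expiry

print("expected expiry:", expiry.isoformat())
print("overdue by:", overdue)
```

Under that assumption the claim should have expired on 2021-01-11, well before the controller was still reconciling it on 2021-01-14.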
      

      ClusterDeployment:

      apiVersion: hive.openshift.io/v1
      kind: ClusterDeployment                        
      metadata:                                            
        creationTimestamp: "2021-01-08T13:57:34Z"                
        finalizers:                      
        - hive.openshift.io/deprovision    
        generation: 4                                
        labels:                  
          hive.openshift.io/cluster-platform: aws
          hive.openshift.io/cluster-region: us-east-1   
          hive.openshift.io/version-major: "4"        
          hive.openshift.io/version-major-minor: "4.6"
          hive.openshift.io/version-major-minor-patch: 4.6.0
        name: grc-cp-v460-j292h
        namespace: grc-cp-v460-j292h                                                            
        resourceVersion: "74499312"                                                             
        selfLink: /apis/hive.openshift.io/v1/namespaces/grc-cp-v460-j292h/clusterdeployments/grc-cp-v460-j292h                                                                            
        uid: 200b4d7c-9a48-4042-8812-9cecb103d56e                                               
      spec:                                                                                     
        baseDomain: dev08.red-chesterfield.com                                                  
        clusterMetadata:                                                                        
          adminKubeconfigSecretRef:                                                             
            name: grc-cp-v460-j292h-0-6pzkk-admin-kubeconfig                                    
          adminPasswordSecretRef:                                                               
            name: grc-cp-v460-j292h-0-6pzkk-admin-password                                      
          clusterID: 8077f71f-657b-4bab-996c-a24cbff3f6a2                                       
          infraID: grc-cp-v460-j292h-vr8fs                                                      
        clusterName: grc-cp-v460-j292h                                                          
        clusterPoolRef:                                                                         
          claimName: dhaiduce-grc-cp-v460                                                       
          namespace: acm-grc-security                                                           
          poolName: grc-cp-v460                                                                 
        controlPlaneConfig:                                                                     
          servingCertificates: {}                                                               
        installed: true                                                                         
        platform:                                                                               
          aws:                                                                                  
            credentialsSecretRef:                                                               
              name: grc-cp-v460-j292h-aws-creds                                                 
            region: us-east-1                                                                   
        powerState: Running                                                                     
        provisioning:                                                                           
          imageSetRef:                                                                          
            name: img4.6.0-x86-64-appsub                                                        
          installConfigSecretRef:                                                               
            name: grc-cp-v460-j292h-install-config                                              
        pullSecretRef:                                                                          
          name: grc-cp-v460-j292h-pull-secret                                                   
      status:                                                                                   
        apiURL: https://api.grc-cp-v460-j292h.dev08.red-chesterfield.com:6443                                                                                                             
        cliImage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6aa4bb97adf2142b0e74ccae7fd3661ada73cbaac803b86bb8712261e916d66d                                                                                                                                                                                                                                       
        conditions:                                                                             
        - lastProbeTime: "2021-01-11T13:44:33Z"                                                 
          lastTransitionTime: "2021-01-11T13:44:33Z"                                            
          message: All machines are started and nodes are ready                                 
          reason: Running                                                                       
          status: "False"                                                                       
          type: Hibernating                                                                     
        - lastProbeTime: "2021-01-08T14:28:31Z"                                                 
          lastTransitionTime: "2021-01-08T14:28:31Z"                                            
          message: SyncSet apply is successful                                                  
          reason: SyncSetApplySuccess                                                           
          status: "False"                                                                       
          type: SyncSetFailed                                                                   
        - lastProbeTime: "2021-01-14T11:44:12Z"                                                 
          lastTransitionTime: "2021-01-11T13:44:12Z"                                            
          message: cluster is reachable                                                         
          reason: ClusterReachable                                                              
          status: "False"                                                                       
          type: Unreachable                                                                     
        installedTimestamp: "2021-01-08T14:28:31Z"                                              
        installerImage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:3f206c2ca0472d318ed03d164c7c1502796974da881136060677154bc5432415                                                                                                                                                                                                                                 
        provisionRef:                                                                           
          name: grc-cp-v460-j292h-0-6pzkk                                                       
        webConsoleURL: https://console-openshift-console.apps.grc-cp-v460-j292h.dev08.red-chesterfield.com                            
      

      ClusterPool:

      apiVersion: hive.openshift.io/v1
      kind: ClusterPool
      metadata:
        creationTimestamp: "2021-01-05T15:27:35Z"
        finalizers:
        - hive.openshift.io/clusters
        generation: 1
        name: grc-cp-v460
        namespace: acm-grc-security
        resourceVersion: "71343101"
        selfLink: /apis/hive.openshift.io/v1/namespaces/acm-grc-security/clusterpools/grc-cp-v460
        uid: d66221e4-d414-4684-b650-fd2b20ed0a86
      spec:
        baseDomain: dev08.red-chesterfield.com
        imageSetRef:
          name: img4.6.0-x86-64-appsub
        platform:
          aws:
            credentialsSecretRef:
              name: policy-grc-aws-creds
            region: us-east-1
        pullSecretRef:
          name: policy-grc-ocp-pull-secret
        size: 1
      status:
        ready: 1
        size: 1
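
The three resources above can be checked against one another to rule out a simple mismatch of names or namespaces. The sketch below copies only the relevant fields into plain dictionaries (an illustrative subset, not the full objects) and compares the cross-references. It is a hand-written consistency check, not anything the Hive controller is known to run.

```python
# Relevant fields copied from the pasted resources; everything else is omitted.
claim = {
    "metadata": {"name": "dhaiduce-grc-cp-v460", "namespace": "acm-grc-security"},
    "spec": {"clusterPoolName": "grc-cp-v460", "namespace": "grc-cp-v460-j292h"},
    # Note: the claim as pasted has no status stanza at all.
}
cluster_deployment = {
    "metadata": {"name": "grc-cp-v460-j292h", "namespace": "grc-cp-v460-j292h"},
    "spec": {
        "installed": True,
        "powerState": "Running",
        "clusterPoolRef": {
            "claimName": "dhaiduce-grc-cp-v460",
            "namespace": "acm-grc-security",
            "poolName": "grc-cp-v460",
        },
    },
}
pool = {"metadata": {"name": "grc-cp-v460", "namespace": "acm-grc-security"}}

pool_ref = cluster_deployment["spec"]["clusterPoolRef"]
checks = {
    "claim names the pool": claim["spec"]["clusterPoolName"] == pool["metadata"]["name"],
    "claim lives in the pool namespace": claim["metadata"]["namespace"] == pool["metadata"]["namespace"],
    "claim points at the CD namespace": claim["spec"]["namespace"] == cluster_deployment["metadata"]["namespace"],
    "CD points back at the claim": pool_ref["claimName"] == claim["metadata"]["name"],
    "CD points at the pool": (pool_ref["namespace"], pool_ref["poolName"])
    == (pool["metadata"]["namespace"], pool["metadata"]["name"]),
    "CD is installed": cluster_deployment["spec"]["installed"] is True,
}

for description, ok in checks.items():
    print(f"{'OK ' if ok else 'BAD'} {description}")
print("claim has status:", "status" in claim)
```

Every cross-reference lines up, so the assignment itself looks intact; what is missing is any status on the claim.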
      

      The logs suggest the controller short-circuits without doing anything with the claim, and without logging why.

      ❯ klf hive-controllers-d5c84dd8d-2ds57 | grep dhaiduce-grc-cp-v460
      time="2021-01-14T12:42:45.899Z" level=info msg="reconciling cluster claim" clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim
      time="2021-01-14T12:42:45.899Z" level=debug msg="checking whether lifetime of ClusterClaim has elapsed" cluster=grc-cp-v460-j292h clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim lifetime="&Duration{Duration:8h55m0s,}"
      time="2021-01-14T12:42:45.903Z" level=debug msg="claim has existing cluster assignment" cluster=grc-cp-v460-j292h clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim
      time="2021-01-14T12:42:45.903Z" level=debug msg="resource is up-to-date" cluster=grc-cp-v460-j292h clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim resource=grc-cp-v460-j292h/hive-claim-owner
      time="2021-01-14T12:42:45.903Z" level=debug msg="resource is up-to-date" cluster=grc-cp-v460-j292h clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim resource=grc-cp-v460-j292h/hive-claim-owner
      time="2021-01-14T12:42:45.903Z" level=info msg="reconcile complete" cluster=grc-cp-v460-j292h clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim elapsed=3.695225ms
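
To make the reconcile sequence easier to scan, the key=value log lines can be split into fields. The sketch below uses Python's shlex to handle the quoted values and prints only the time, level, and message of each entry. The sample lines are copied from the output above; parse_logfmt is an illustrative helper name.

```python
import shlex

# Four of the log lines from the output above, copied verbatim.
LOG_LINES = [
    'time="2021-01-14T12:42:45.899Z" level=info msg="reconciling cluster claim" clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim',
    'time="2021-01-14T12:42:45.899Z" level=debug msg="checking whether lifetime of ClusterClaim has elapsed" cluster=grc-cp-v460-j292h clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim lifetime="&Duration{Duration:8h55m0s,}"',
    'time="2021-01-14T12:42:45.903Z" level=debug msg="claim has existing cluster assignment" cluster=grc-cp-v460-j292h clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim',
    'time="2021-01-14T12:42:45.903Z" level=info msg="reconcile complete" cluster=grc-cp-v460-j292h clusterClaim=acm-grc-security/dhaiduce-grc-cp-v460 controller=clusterclaim elapsed=3.695225ms',
]


def parse_logfmt(line: str) -> dict:
    """Split one key=value log line into a dict, honouring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip any stray token without '='
            fields[key] = value
    return fields


records = [parse_logfmt(line) for line in LOG_LINES]
messages = [record["msg"] for record in records]

for record in records:
    print(record["time"], record["level"].ljust(5), record["msg"])
```

Read this way, the lifetime check is followed directly by "claim has existing cluster assignment" and then "reconcile complete", with no entry recording the outcome of the lifetime check or any status update.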
      

      Appears to be running: registry.redhat.io/rhacm2/openshift-hive-rhel7@sha256:df159a8db1548356af4e462b11de0691ec6d3b9fa23d1217bb18012db0b3cf81
      ACM reports this was Hive at the head of the ocm-2.1 branch (commit 4e67605a9a5c5d5648fc11488e5374b6db6b525b).

              jdiaz@redhat.com Joel Diaz (Inactive)
              rhn-engineering-dgoodwin Devan Goodwin