  1. OpenShift Bugs
  2. OCPBUGS-3706

openshift-install agent wait-for install-complete errors out before the cluster installation completes


    • Moderate
    • None
    • Agent Sprint 228, Agent Sprint 229, Agent Sprint 230
    • 3
    • Rejected
    • False
    • 12/21: removing this from the 4.12 GA list (this one's for 4.13) - the fix with target 4.12 is OCPBUGS-4962
      12/15: Green per latest comment, PR is ready to merge, waiting on CI
      12/7: Red as the previous fix was deemed insufficient and the bug status has moved to NEW.
      12/5: Green as the fix is posted and is waiting on a successful CI run.
      11/30: lowered Telco rank/bucket to 3, keeping it on the Telco-Grade OCP 4.12 list due to the potential impact to automation
      11/28: Red as no triage has been done yet.
      11/28: added to the 4.12 gating list

      Description of problem:

      While running ./openshift-install agent wait-for install-complete --dir billi --log-level debug on a real bare metal dual-stack compact cluster installation, the command errors out with the following message while the installation is still progressing:

      ERROR Attempted to gather ClusterOperator status after wait failure: Listing ClusterOperator objects: Get "https://api.kni-qe-0.lab.eng.rdu2.redhat.com:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp [2620:52:0:11c::10]:6443: connect: connection refused
      
      DEBUG Uploaded logs for host openshift-master-1 cluster d8b0979d-3d69-4e65-874a-d1f7da79e19e 
      DEBUG Host: openshift-master-1, reached installation stage Rebooting 
      DEBUG Host: openshift-master-1, reached installation stage Configuring 
      DEBUG Host: openshift-master-2, reached installation stage Configuring 
      DEBUG Host: openshift-master-2, reached installation stage Joined 
      DEBUG Host: openshift-master-1, reached installation stage Joined 
      DEBUG Host: openshift-master-0, reached installation stage Waiting for bootkube 
      DEBUG Host openshift-master-1: updated status from installing-in-progress to installed (Done) 
      DEBUG Host: openshift-master-1, reached installation stage Done 
      DEBUG Host openshift-master-2: updated status from installing-in-progress to installed (Done) 
      DEBUG Host: openshift-master-2, reached installation stage Done 
      DEBUG Host: openshift-master-0, reached installation stage Waiting for controller: waiting for controller pod ready event 
      ERROR Attempted to gather ClusterOperator status after wait failure: Listing ClusterOperator objects: Get "https://api.kni-qe-0.lab.eng.rdu2.redhat.com:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp [2620:52:0:11c::10]:6443: connect: connection refused 
      ERROR Cluster initialization failed because one or more operators are not functioning properly. 
      ERROR 				The cluster should be accessible for troubleshooting as detailed in the documentation linked below, 
      ERROR 				https://docs.openshift.com/container-platform/latest/support/troubleshooting/troubleshooting-installations.html 
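
To check from a captured log whether hosts were still progressing when the error appeared, the debug output above can be scanned for the last installation stage each host reported. The following is a minimal sketch, assuming the "DEBUG Host: <name>, reached installation stage <stage>" line format shown in this report; the function names latest_stages and hosts_not_done are hypothetical helpers and not part of openshift-install.

```python
import re

# Matches lines such as:
#   DEBUG Host: openshift-master-1, reached installation stage Rebooting
# The format is taken from the log in this report; other versions may differ.
STAGE_RE = re.compile(
    r"^DEBUG Host: (?P<host>[^,]+), reached installation stage (?P<stage>.+)$"
)


def latest_stages(log_text):
    """Return {host: last stage reported} from wait-for debug output."""
    stages = {}
    for line in log_text.splitlines():
        match = STAGE_RE.match(line.strip())
        if match:
            # Later lines overwrite earlier ones, so the last stage wins.
            stages[match.group("host")] = match.group("stage")
    return stages


def hosts_not_done(log_text):
    """Return the sorted hosts whose last reported stage is not 'Done'."""
    return sorted(h for h, s in latest_stages(log_text).items() if s != "Done")


SAMPLE = "\n".join([
    "DEBUG Host: openshift-master-1, reached installation stage Joined",
    "DEBUG Host: openshift-master-0, reached installation stage Waiting for bootkube",
    "DEBUG Host: openshift-master-1, reached installation stage Done",
    "DEBUG Host: openshift-master-2, reached installation stage Done",
    "DEBUG Host: openshift-master-0, reached installation stage "
    "Waiting for controller: waiting for controller pod ready event",
])

print(hosts_not_done(SAMPLE))
```

For the log in this report, such a scan would show openshift-master-0 still in the "Waiting for controller" stage at the time the ERROR lines were printed.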

      Version-Release number of selected component (if applicable):

      4.12.0-rc.0

      How reproducible:

      100%

      Steps to Reproduce:

      1. ./openshift-install agent create image --dir billi --log-level debug 
      2. Mount the resulting ISO image and reboot the nodes via iLO
      3. ./openshift-install agent wait-for install-complete --dir billi --log-level debug 

      Actual results:

       ERROR Attempted to gather ClusterOperator status after wait failure: Listing ClusterOperator objects: Get "https://api.kni-qe-0.lab.eng.rdu2.redhat.com:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp [2620:52:0:11c::10]:6443: connect: connection refused 
      
      The cluster installation is not yet complete and needs more time to finish.

      Expected results:

      The command waits until the cluster installation completes.
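
The expected behavior can be illustrated with a generic polling loop that treats a refused connection as transient instead of as a terminal failure. This is only a sketch of the idea, not the installer's actual code (openshift-install is written in Go, and the real fix may be structured differently); wait_for, fake_check, and all parameters are hypothetical names used for illustration.

```python
import time


def wait_for(check, timeout, interval=5.0, clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or `timeout` seconds elapse.

    A ConnectionError (which includes ConnectionRefusedError) is treated as
    transient: the API endpoint may be briefly unreachable while the
    installation is still progressing.
    """
    deadline = clock() + timeout
    last_error = None
    while True:
        try:
            if check():
                return True
            last_error = None
        except ConnectionError as exc:
            last_error = exc  # remember the error, but keep waiting
        if clock() >= deadline:
            break
        sleep(interval)
    if last_error is not None:
        raise TimeoutError(f"timed out; last error: {last_error}")
    raise TimeoutError("timed out before the check succeeded")


# Usage: a fake check that refuses the connection twice, then succeeds.
attempts = []


def fake_check():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionRefusedError("connection refused")
    return True


result = wait_for(fake_check, timeout=60, interval=0, sleep=lambda _: None)
print(result, len(attempts))
```

The design choice here is that only the overall deadline ends the wait; an individual connection failure is recorded for the final error message but does not abort the loop.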

      Additional info:

      The cluster installation eventually completes successfully if it is left to run after the error.
      
      Attaching install-config.yaml and agent-config.yaml

        1. agent-config.yaml
          3 kB
          Marius Cornea
        2. install-config.yaml
          3 kB
          Marius Cornea

              zabitter Zane Bitter
              mcornea@redhat.com Marius Cornea
              zhenying niu zhenying niu
              pawan pinjarkar
