OpenShift Bugs / OCPBUGS-4249

openshift-install agent wait-for install-complete errors out before the cluster installation completes

    • None
    • Agent Sprint 228
    • 1
    • Rejected
    • False
    • 11/28: Red as no triage has been done yet.
      11/28: Added to the 4.12 gating list.

      Description of problem:

      While running ./openshift-install agent wait-for install-complete --dir billi --log-level debug on a real bare-metal, dual-stack compact cluster installation, the command errors out with the following message even though the installation is still progressing:

      ERROR Attempted to gather ClusterOperator status after wait failure: Listing ClusterOperator objects: Get "https://api.kni-qe-0.lab.eng.rdu2.redhat.com:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp [2620:52:0:11c::10]:6443: connect: connection refused
      
      DEBUG Uploaded logs for host openshift-master-1 cluster d8b0979d-3d69-4e65-874a-d1f7da79e19e 
      DEBUG Host: openshift-master-1, reached installation stage Rebooting 
      DEBUG Host: openshift-master-1, reached installation stage Configuring 
      DEBUG Host: openshift-master-2, reached installation stage Configuring 
      DEBUG Host: openshift-master-2, reached installation stage Joined 
      DEBUG Host: openshift-master-1, reached installation stage Joined 
      DEBUG Host: openshift-master-0, reached installation stage Waiting for bootkube 
      DEBUG Host openshift-master-1: updated status from installing-in-progress to installed (Done) 
      DEBUG Host: openshift-master-1, reached installation stage Done 
      DEBUG Host openshift-master-2: updated status from installing-in-progress to installed (Done) 
      DEBUG Host: openshift-master-2, reached installation stage Done 
      DEBUG Host: openshift-master-0, reached installation stage Waiting for controller: waiting for controller pod ready event 
      ERROR Attempted to gather ClusterOperator status after wait failure: Listing ClusterOperator objects: Get "https://api.kni-qe-0.lab.eng.rdu2.redhat.com:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp [2620:52:0:11c::10]:6443: connect: connection refused 
      ERROR Cluster initialization failed because one or more operators are not functioning properly. 
      ERROR 				The cluster should be accessible for troubleshooting as detailed in the documentation linked below, 
      ERROR 				https://docs.openshift.com/container-platform/latest/support/troubleshooting/troubleshooting-installations.html 
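The debug output above can be summarized per host to show which node was still installing when the error appeared. The following is a minimal sketch for reading such a log, not part of the installer; the function names and the regular expression are hypothetical and assume the "DEBUG Host: <name>, reached installation stage <stage>" line format seen in this report.

```python
import re

# Matches installer debug lines of the form seen in this report, e.g.:
#   DEBUG Host: openshift-master-1, reached installation stage Rebooting
STAGE_RE = re.compile(
    r"DEBUG Host: (?P<host>[^,]+), reached installation stage (?P<stage>.+?)\s*$"
)


def latest_stage_per_host(log_text):
    """Return a dict mapping each host to the last stage it reported."""
    stages = {}
    for line in log_text.splitlines():
        match = STAGE_RE.search(line)
        if match:
            # Later lines overwrite earlier ones, so the last stage wins.
            stages[match.group("host")] = match.group("stage")
    return stages


def hosts_not_done(log_text):
    """Return the hosts whose last reported stage is not 'Done', sorted by name."""
    stages = latest_stage_per_host(log_text)
    return sorted(host for host, stage in stages.items() if stage != "Done")


# Stage lines copied from the log in this report.
SAMPLE_LOG = """\
DEBUG Host: openshift-master-1, reached installation stage Rebooting
DEBUG Host: openshift-master-1, reached installation stage Configuring
DEBUG Host: openshift-master-2, reached installation stage Configuring
DEBUG Host: openshift-master-2, reached installation stage Joined
DEBUG Host: openshift-master-1, reached installation stage Joined
DEBUG Host: openshift-master-0, reached installation stage Waiting for bootkube
DEBUG Host openshift-master-1: updated status from installing-in-progress to installed (Done)
DEBUG Host: openshift-master-1, reached installation stage Done
DEBUG Host openshift-master-2: updated status from installing-in-progress to installed (Done)
DEBUG Host: openshift-master-2, reached installation stage Done
DEBUG Host: openshift-master-0, reached installation stage Waiting for controller: waiting for controller pod ready event
"""
```

Calling hosts_not_done(SAMPLE_LOG) on these lines reports only openshift-master-0, which was still at the "Waiting for controller" stage when the command gave up.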

      Version-Release number of selected component (if applicable):

      4.12.0-rc.0

      How reproducible:

      100%

      Steps to Reproduce:

      1. ./openshift-install agent create image --dir billi --log-level debug 
      2. Mount the resulting ISO image and reboot the nodes via iLO.
      3. ./openshift-install agent wait-for install-complete --dir billi --log-level debug

      Actual results:

       ERROR Attempted to gather ClusterOperator status after wait failure: Listing ClusterOperator objects: Get "https://api.kni-qe-0.lab.eng.rdu2.redhat.com:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp [2620:52:0:11c::10]:6443: connect: connection refused 
      
      The cluster installation is not complete at this point and needs more time to finish.

      Expected results:

      The command waits until the cluster installation completes.
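As an illustration of the expected behavior only, the wait could be modeled as a polling loop that treats a refused connection to the API as transient until an overall deadline passes. This is a sketch under assumptions and not the installer's actual (Go) implementation; wait_for_install_complete, check_complete, and the timeout and interval parameters are hypothetical names introduced here.

```python
import time


def wait_for_install_complete(check_complete, timeout_s, interval_s=30.0,
                              clock=time.monotonic, sleep=time.sleep):
    """Poll check_complete() until it returns True or timeout_s elapses.

    check_complete is a hypothetical callable that queries the cluster API.
    A ConnectionRefusedError is treated as transient (the API may be
    restarting while the last node joins), so polling continues until the
    deadline instead of failing on the first refused connection.
    """
    deadline = clock() + timeout_s
    last_error = None
    while True:
        try:
            if check_complete():
                return True
            last_error = None
        except ConnectionRefusedError as err:
            # Remember the error, but keep waiting until the deadline.
            last_error = err
        if clock() >= deadline:
            break
        sleep(interval_s)
    if last_error is not None:
        raise TimeoutError(f"API still unreachable at deadline: {last_error}")
    raise TimeoutError("installation did not complete before the deadline")
```

The clock and sleep parameters are injectable so the loop can be exercised without real waiting; only the deadline, not a single refused connection, ends the wait with an error.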

      Additional info:

      The cluster installation eventually completes successfully if it is left to run after the error.
      
      Attaching install-config.yaml and agent-config.yaml
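Because the installation finishes on its own after the error, one possible stopgap is to re-run the same wait-for command until it succeeds. The sketch below assumes the command exits with a non-zero status when it reports the error, which this report does not state; rerun_until_success, max_attempts, and delay_s are hypothetical names and values chosen for illustration.

```python
import subprocess
import time

# The exact command from the steps to reproduce in this report.
WAIT_FOR_CMD = [
    "./openshift-install", "agent", "wait-for", "install-complete",
    "--dir", "billi", "--log-level", "debug",
]


def rerun_until_success(cmd, max_attempts=5, delay_s=60.0,
                        run=subprocess.run, sleep=time.sleep):
    """Re-run cmd until it exits with status 0; return the attempts used.

    Assumption: a premature failure of the command shows up as a
    non-zero exit status.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Give the cluster time to progress before trying again.
            sleep(delay_s)
    raise RuntimeError(f"{cmd[0]} still failing after {max_attempts} attempts")
```

Calling rerun_until_success(WAIT_FOR_CMD) from the directory that contains the openshift-install binary would repeat the wait up to five times. This only works around the symptom; the command itself should keep waiting as described under the expected results.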

            ppinjark@redhat.com pawan pinjarkar
            mcornea@redhat.com Marius Cornea
            zhenying niu zhenying niu