
assisted-installer-controller job does not complete properly


Details

    • Important
    • 8
    • Agent Sprint 232, Agent Sprint 233, Sprint 235, Sprint 240
    • 4
    • Rejected
    • False
    • Previously, the assisted-installer-controller on the installed cluster would run continuously even after the cluster had completed installation. Because assisted-service runs on the bootstrap node and not in the cloud, and because assisted-service goes offline after the bootstrap node reboots to join the cluster, the assisted-installer-controller was unable to communicate with assisted-service to post updates and upload logs, so it looped indefinitely. In this version, the bug fix modifies the assisted-installer-controller to check the cluster installation without using assisted-service, and to exit when the cluster installation is complete. (link:https://issues.redhat.com/browse/OCPBUGS-4240[*OCPBUGS-4240*])
    • Bug Fix
    • Done
    • 6/28: pending next steps re: KCS publication and/or plan to resolve issue

    Description

      Description of problem:

      After a cluster installed from the agent installer ISO finishes installing, the assisted-installer-controller job keeps running instead of completing.

      Version-Release number of selected component (if applicable):

      4.12

      How reproducible:

      Generate a valid ISO image using the agent installer. All topologies (compact/HA/SNO) and configurations are affected by this problem.

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

      $ oc get jobs -n assisted-installer
      NAME                            COMPLETIONS   DURATION   AGE
      assisted-installer-controller   0/1           102m       102m

      Expected results:

      oc get jobs -n assisted-installer should not return any job
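
The actual and expected results above can be turned into a rough automated check. The following is a minimal sketch, assuming the plain-text table that `oc get jobs` prints as shown in this report; the function name `unfinished_jobs` and the `SAMPLE` constant are hypothetical and are not part of any OpenShift tooling. It only flags jobs whose COMPLETIONS column shows fewer successes than desired, so it treats both "no job listed" and "job listed as 1/1" as healthy.

```python
from typing import List


def unfinished_jobs(oc_output: str) -> List[str]:
    """Return names of jobs whose COMPLETIONS value is below the desired count."""
    unfinished = []
    for line in oc_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows without an "x/y" value.
        if len(fields) < 2 or fields[0] == "NAME" or "/" not in fields[1]:
            continue
        done, _, wanted = fields[1].partition("/")
        if int(done) < int(wanted):
            unfinished.append(fields[0])
    return unfinished


# Output copied from the "Actual results" section of this report.
SAMPLE = """\
NAME                            COMPLETIONS   DURATION   AGE
assisted-installer-controller   0/1           102m       102m
"""

print(unfinished_jobs(SAMPLE))  # ['assisted-installer-controller']
print(unfinished_jobs(""))      # []
```

An empty result corresponds to the expected behaviour; a non-empty result reproduces the symptom reported here.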

      Additional info:

      It looks like the assisted-installer-controller was designed on the assumption that Assisted Service (AS) is always available and reachable. This is not necessarily true when using the agent installer, since the AS initially running on the rendezvous node is no longer available after that node reboots.
      
      The assisted-installer-controller performs a number of different tasks internally, and according to the logs not all of them complete successfully (which is a condition for the job to terminate).
      It could be useful to troubleshoot the ApproveCsrs task in more depth, as it is one that does not terminate properly.
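
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the controller's actual code (the real controller is a separate project and its internals are not shown in this report): `Task`, `poll`, `run_until_done`, `ask_assisted_service`, `ask_cluster`, and the task name `OtherTask` are all hypothetical. It models a job that can exit only when every task has finished, and a task whose completion signal comes either from an unreachable service or from the cluster itself.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    step: Callable[[], bool]  # returns True once the task has finished


def poll(signal: Callable[[], bool]) -> Callable[[], bool]:
    """Wrap a completion signal so that connection failures mean 'retry later'."""
    def step() -> bool:
        try:
            return signal()
        except ConnectionError:
            return False  # not finished; try again on the next round
    return step


def run_until_done(tasks: List[Task], max_rounds: int) -> List[str]:
    """Poll unfinished tasks each round; return names still pending at the end."""
    pending = {t.name: t for t in tasks}
    for _ in range(max_rounds):
        for name in list(pending):
            if pending[name].step():
                del pending[name]
        if not pending:
            break
    return sorted(pending)


def ask_assisted_service() -> bool:
    # Assumption: models the rendezvous node after reboot, when AS is gone.
    raise ConnectionError("assisted-service unreachable")


def ask_cluster() -> bool:
    # Assumption: models a completion check made against the cluster itself.
    return True


before = run_until_done(
    [Task("ApproveCsrs", poll(ask_assisted_service)), Task("OtherTask", poll(lambda: True))],
    max_rounds=5,
)
after = run_until_done(
    [Task("ApproveCsrs", poll(ask_cluster)), Task("OtherTask", poll(lambda: True))],
    max_rounds=5,
)
print(before)  # ['ApproveCsrs']
print(after)   # []
```

In the first run the task that depends on the unreachable service never finishes, so the job as a whole cannot terminate; in the second run the completion signal no longer depends on the service and every task finishes.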

       

       

       


            People

              rwsu1@redhat.com Richard Su
              afasano@redhat.com Andrea Fasano
              zhenying niu zhenying niu