OpenShift Installer / CORS-2522

Improve bootstrapping/installation progress & error reporting


    • Type: Spike
    • Resolution: Unresolved
    • Priority: Undefined
    • Improvement

      The goal of this spike is to create an epic to improve user visibility into the installation process after infrastructure creation, particularly to improve feedback in the bootkube process.

      At the moment, we simply wrap bootkube in timeouts and fail if we exceed the timeout. This presents problems in both directions:

      • In some cases the installation is making progress but needs longer than the timeout allows. This comes up in baremetal installs, where reboots can take a significant amount of time. In these cases we would ideally monitor progress rather than rely on a hard timeout.
      • Conversely, an error such as a failure to pull the release image is unrecoverable and occurs immediately, yet we still wait out the full timeout (X minutes) before returning. It would be better to report such failures as soon as they occur.
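
      As a rough illustration of the two cases above, here is a minimal sketch of a wait loop that replaces the single hard timeout with a stall window and lets unrecoverable errors surface immediately. It is not the installer's actual code: Status, UnrecoverableError, StalledError, wait_with_progress and the notion of a numeric progress counter are all hypothetical names introduced for this example, and how bootkube would expose progress is exactly what the spike would need to determine.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class UnrecoverableError(Exception):
    """Hypothetical marker for failures that waiting cannot fix."""


class StalledError(Exception):
    """Raised when no progress is observed within the stall window."""


@dataclass
class Status:
    done: bool     # True once bootstrapping has completed
    progress: int  # hypothetical monotonic counter, e.g. completed steps


def wait_with_progress(
    poll: Callable[[], Status],
    stall_timeout: float,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Wait until poll() reports done, failing only if progress stalls.

    The deadline restarts whenever the progress counter advances, so a slow
    but healthy install is not cut off. An UnrecoverableError raised by
    poll() propagates at once instead of waiting for any timeout.
    """
    last_progress: Optional[int] = None
    last_change = clock()
    while True:
        status = poll()  # assumption: raises UnrecoverableError on fatal errors
        if status.done:
            return
        now = clock()
        if last_progress is None or status.progress > last_progress:
            last_progress = status.progress
            last_change = now  # progress observed: restart the stall window
        elif now - last_change >= stall_timeout:
            raise StalledError(
                f"no progress for {stall_timeout} seconds "
                f"(stuck at step {last_progress})"
            )
        sleep(interval)
```

      The clock and sleep parameters exist only so the loop can be exercised without real waiting. The design choice worth discussing is the stall window: it still bounds how long a truly hung install can run, but the bound applies to the time since the last observed progress rather than to the whole installation.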

       

      We have also seen another class of problem: manifests that fail to apply. These may be user-provided or, in some cases, may have slipped in from OpenShift components. Such failures are hard to identify and currently require inspecting the bootkube logs.
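
      One possible direction, sketched here under stated assumptions, is to collect a per-manifest result and print a short summary so that failures do not have to be dug out of the logs. ApplyResult, summarize_failures, the "user" and "component" origin labels, and the file names in the usage note are hypothetical; this does not describe how bootkube records apply errors today.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ApplyResult:
    manifest: str         # file name of the manifest
    origin: str           # hypothetical label: "user" or "component"
    error: Optional[str]  # None when the manifest applied cleanly


def summarize_failures(results: List[ApplyResult]) -> str:
    """Return a short report listing every manifest that failed to apply."""
    failed = [r for r in results if r.error is not None]
    if not failed:
        return "all manifests applied"
    lines = [f"{len(failed)} of {len(results)} manifests failed to apply:"]
    # Group by origin so user-provided and component manifests are easy to tell apart.
    for r in sorted(failed, key=lambda r: (r.origin, r.manifest)):
        lines.append(f"  [{r.origin}] {r.manifest}: {r.error}")
    return "\n".join(lines)
```

      For example, passing one clean result and two failures (say, a user-provided custom-machineconfig.yaml and a component-provided operator-rbac.yaml, both made-up names) would yield a three-line report with the component failure listed before the user one.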

              Assignee: Unassigned
              Reporter: Patrick Dillon (padillon)