OCPBUGS-4998

wait-for command doesn't handle installing-pending-user-action


    • Moderate
    • None
    • Agent Sprint 232, Agent Sprint 233, Sprint 235
    • 3
    • Rejected
    • False
    • During an installation, when the cluster status is `installing-pending-user-action`, the installation does not complete until the status is resolved. Previously, running the `openshift-install agent wait-for bootstrap-complete` command gave no indication of how to resolve the problem causing this status. With this update, the command output includes a message indicating which actions the user must take to resolve the issue.

      As an example, the `wait-for` output when an invalid boot disk is used is now:
      [source,terminal]
      ----
      "level=info msg=Cluster has hosts requiring user input
      level=debug msg=Host master-1 Expected the host to boot from disk, but it booted the installation image - please reboot and fix boot order to boot from disk QEMU_HARDDISK drive-scsi0-0-0-0 (sda, /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0)
      level=debug msg=Host master-2 Expected the host to boot from disk, but it booted the installation image - please reboot and fix boot order to boot from disk QEMU_HARDDISK drive-scsi0-0-0-0 (sda, /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0)
      level=info msg=cluster has stopped installing... working to recover installation"
      ----
      (link:https://issues.redhat.com/browse/OCPBUGS-4998[*OCPBUGS-4998*])
    • Bug Fix
    • Done

      If the cluster enters the installing-pending-user-action state in assisted-service, it will not recover without user action.
      One way to reproduce this is to set the wrong boot order on the host, so that it reboots into the agent ISO again instead of the CoreOS installed on disk. (I managed this in dev-scripts by setting a root device hint that pointed to a secondary disk, and only creating that disk once the VM was up. This does not add the new disk to the boot order list, and even if you set it manually, it does not take effect until after a full shutdown of the VM; a soft reboot doesn't count.)

      Currently we report:

      cluster has stopped installing... working to recover installation

      in a loop. This is not accurate (unlike in, e.g., the install-failed state): in this state the installation cannot be recovered automatically.
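
To illustrate the distinction, here is a minimal Go sketch that picks the log line based on the cluster status. It is not the installer's actual code: `messageFor` is a hypothetical helper, the status strings are taken from this issue's wording, and the default branch is an assumption. The two message texts are the ones quoted in this issue.

```go
package main

import "fmt"

// messageFor is a hypothetical helper that maps a cluster status to the
// line that should be logged. Status names follow this issue's wording.
func messageFor(status string) string {
	switch status {
	case "installing-pending-user-action":
		// Cannot recover automatically: tell the user that input is needed.
		return "Cluster has hosts requiring user input"
	case "install-failed":
		// The existing recovery message is described as accurate here.
		return "cluster has stopped installing... working to recover installation"
	default:
		// Assumption: other statuses are simply echoed.
		return "cluster status: " + status
	}
}

func main() {
	fmt.Println(messageFor("installing-pending-user-action"))
	fmt.Println(messageFor("install-failed"))
}
```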

      Also, we should report this, or any other, status only once, when the status changes, and not continuously in a loop.
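
Reporting only on a status change could be sketched as follows. This is an illustration under stated assumptions, not the installer's implementation: `statusReporter` and its `report` method are hypothetical names, and the polled status values in `main` are placeholders (only installing-pending-user-action comes from this issue).

```go
package main

import "fmt"

// statusReporter (hypothetical) remembers the last status it reported.
type statusReporter struct {
	last string
	seen bool
}

// report calls log only when status differs from the previously reported
// status, and returns true if it logged.
func (r *statusReporter) report(status string, log func(string)) bool {
	if r.seen && r.last == status {
		return false // unchanged since the last poll: stay quiet
	}
	r.last, r.seen = status, true
	log("cluster status: " + status)
	return true
}

func main() {
	r := &statusReporter{}
	// Simulated polling results; each distinct run of values is logged once.
	polled := []string{
		"installing", "installing",
		"installing-pending-user-action", "installing-pending-user-action",
	}
	for _, s := range polled {
		r.report(s, func(m string) { fmt.Println(m) })
	}
}
```

With this shape the polling loop can keep running at its usual interval while the log only grows when something actually changes.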

              bfournie@redhat.com Robert Fournier
              zabitter Zane Bitter
              zhenying niu zhenying niu