OpenShift Hive / HIVE-1997

Assisted Installer fails to deploy spoke compact cluster

Details

    • Bug
    • Resolution: Done

    Description

      Description of the problem:

      As part of an Altiostar deployment, deploying the spoke compact cluster (3 masters only) fails with a timeout. Investigation shows that the hive-operator pod is crashing and the nodes are reported as unreachable. If the agents are manually restarted and the hive-operator pod is re-created, a new deployment attempt starts and runs until hive-operator crashes again.

      How reproducible:

      Occasionally. First noticed with 4.8.32 two weeks ago; now also happening on 4.8.43.

      ACM 2.4.6

      Steps to reproduce:

      1. Start deploying a compact cluster

      2. Wait for agents to register

      Actual results:

      Deployment does not proceed, and the cluster reports:

      • The cluster has hosts that are not ready to install.

      hive-operator crashes with the following in its log:
      time="2022-08-10T11:29:30Z" level=info msg="reconcile complete" controller=hive elapsedMillis=620 elapsedMillisGT=0 outcome=unspecified
      panic: runtime error: invalid memory address or nil pointer dereference
      [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x123e5fc]
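
      The fault address in the panic above (addr=0x8) is consistent with reading a field at offset 8 through a nil struct pointer, although the log excerpt alone does not identify the code path. The minimal Go sketch below reproduces this class of failure and shows a guard against it. The type and function names (clusterStatus, unsafeHosts, safeHosts, recovered) are hypothetical and are not taken from Hive's source; the offsets assume a 64-bit platform.

```go
package main

import "fmt"

// clusterStatus is a hypothetical type, not taken from Hive's source.
// On 64-bit platforms the second field sits at offset 8.
type clusterStatus struct {
	Generation int64  // offset 0
	Hosts      *int64 // offset 8
}

// unsafeHosts reads a field without checking s. With s == nil the
// runtime raises "invalid memory address or nil pointer dereference".
func unsafeHosts(s *clusterStatus) *int64 {
	return s.Hosts
}

// safeHosts guards both the struct pointer and the field before use.
func safeHosts(s *clusterStatus) (int64, bool) {
	if s == nil || s.Hosts == nil {
		return 0, false
	}
	return *s.Hosts, true
}

// recovered runs f and returns the panic message, or "" if f did not panic.
func recovered(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	msg := recovered(func() {
		p := unsafeHosts(nil)
		fmt.Println(p) // not reached: the call above panics
	})
	fmt.Println("unguarded:", msg)

	_, ok := safeHosts(nil)
	fmt.Println("guarded ok:", ok)
}
```

      In the sketch the panic is recovered so that both paths can be shown in one run. In the operator the panic is not recovered, so the process exits and the pod restarts.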
       

      Expected results:

      Deployment continues

      Attachments

        1. logs.tar.gz
          14 kB
          Alexander Gurenko
        2. must-gather.local.7202066316332306603.tar.xz
          28.61 MB
          Alexander Gurenko
        3. Screenshot_20220810_134502.png
          54 kB
          Alexander Gurenko

          People

            efried.openshift Eric Fried
            agurenko@redhat.com Alexander Gurenko