  1. Red Hat Advanced Cluster Management
  2. ACM-23918

4.20-ec5 Spoke Cluster Installation fails with "ostree-prepare-root.service: Failed with result 'exit-code'. Failed to start OSTree Prepare OS/"

    • Quality / Stability / Reliability
    • False
    • False
    • Moderate
    • None

      Description of problem: 

      Our partner is trying to install a 4.20-ec5 spoke cluster using the ZTP/GitOps approach in a disconnected environment. The ACM version is 2.14 and the MCE version is 2.9.0-212. When installation of the master nodes starts, the nodes complete the "writing to disk" step and then get stuck at the "configuring" step. On the BMC console of an affected node we see the message "ostree-prepare-root.service: Failed with result 'exit-code'. Failed to start OSTree Prepare OS/." and the node reboots repeatedly. We cannot SSH into the failing master nodes because they never finish booting. We restarted the spoke cluster installation a couple of times, but each time one master node, seemingly at random, failed at the same step.
      
      Later, we also tried to reproduce the issue from an OCP 4.18 hub with ACM 2.13 and MCE 2.8, again deploying a 4.20-ec5 spoke cluster, and saw a similar "ostree" issue on some of the nodes.
      
      I am attaching the clusterinstance.yaml used during these spoke cluster deployment attempts, along with some screenshots from the BMC console and some related logs.
      
      Clusterinstance.yaml: https://drive.google.com/file/d/1vXbI1epzvstvW0wfmQd40-2w-DXPZ6ZB/view?usp=sharing 
      

      Version-Release number of selected component (if applicable):

      Spoke cluster:
      OCP 4.20-ec5
       
      Hub cluster:
      OCP 4.20-ec5
      ACM 2.14
      MCE 2.9.0-212
      

      How reproducible:

      Deploy an OCP 4.20-ec5 spoke cluster using the ZTP/GitOps approach in a disconnected environment. The hub cluster can be either 4.18 or 4.20, with either ACM 2.13 or ACM 2.14.

      Steps to Reproduce:

      1. Create a ClusterInstance for a spoke cluster that uses OCP 4.20-ec5.
      2. Deploy the cluster with the ZTP/GitOps procedure.
      3. Observe that in each installation attempt, some of the master nodes (in some cases worker nodes) fail at the "configuring" step just after the "writing to disk" step.
      4. Check the BMC console of the node and observe "ostree-prepare-root.service: Failed".
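
      For anyone triaging several nodes at once, the check in step 4 can be sketched as a small script. This is only an illustration, assuming the BMC console output of each node has already been captured as plain text by some other means. The function name, the node names, and the sample text are hypothetical; the only thing taken from this report is the failure message itself.

```python
import re

# Failure signature quoted in this report. The exact console formatting is an
# assumption; adjust the pattern if the captured text differs.
OSTREE_FAILURE = re.compile(
    r"ostree-prepare-root\.service: Failed with result '([^']+)'"
)


def find_ostree_failures(console_logs):
    """Return {node_name: [result, ...]} for nodes whose console text
    contains the ostree-prepare-root failure message.

    console_logs is a hypothetical mapping of node name to captured
    console text (str).
    """
    failures = {}
    for node, text in console_logs.items():
        results = OSTREE_FAILURE.findall(text)
        if results:
            failures[node] = results
    return failures


if __name__ == "__main__":
    # Hypothetical sample data, not real logs from the affected cluster.
    sample = {
        "master-0": (
            "ostree-prepare-root.service: Failed with result 'exit-code'.\n"
            "Failed to start OSTree Prepare OS/.\n"
        ),
        "master-1": "Reached target Multi-User System.\n",
    }
    for node, results in find_ostree_failures(sample).items():
        print(f"{node}: ostree-prepare-root failed ({', '.join(results)})")
```

      Counting how often the message repeats per node may also help confirm the reboot loop described above.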

      Actual results:

      Spoke cluster deployment fails at the "Configuring" step on the ACM console, and the BMC console of the affected nodes shows the "ostree" error.

      Expected results:

      Spoke cluster installation should be completed successfully.

      Additional info:

        1. boot-failed.png
          45 kB
          Sarp Koksal
        2. ostree.png
          584 kB
          Sarp Koksal

              Unassigned
              Sarp Koksal (skoksal@redhat.com)
              Vladislav Kolodny