OpenShift Bugs / OCPBUGS-73622

SNO power-on failure at 10+ parallel SNO deployments

    • Important

      Description of problem:

      
      During parallel deployments, specifically with hypervisor-based SNOs, some systems hang partway through the deployment. This looks like a potential race condition involving the BareMetalHost (BMH): metal3 never powers on the SNO after virtual media is attached. The workaround is to patch "online" to "true" on the impacted BMH CRs, but doing this by hand is not feasible for deployment scale testing.
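
The manual workaround described above could be scripted roughly as follows. This is a minimal sketch, not a fix for the underlying race: it assumes the impacted hosts can be identified simply by `spec.online` not being `true`, which may also match hosts that are intentionally powered off. The helper names (`find_offline_hosts`, `build_patch_command`, `main`) are hypothetical and exist only for illustration; the sketch shells out to `oc` and assumes a logged-in session with permission to patch BareMetalHost resources.

```python
import json
import subprocess


def find_offline_hosts(bmh_list):
    """Return (namespace, name) pairs for BMHs whose spec.online is not true.

    Assumption: "impacted" means spec.online is false or unset. Review the
    result before applying, since hosts may be offline on purpose.
    """
    hosts = []
    for item in bmh_list.get("items", []):
        if item.get("spec", {}).get("online") is not True:
            meta = item["metadata"]
            hosts.append((meta["namespace"], meta["name"]))
    return hosts


def build_patch_command(namespace, name):
    """Build the `oc patch` argv that sets spec.online to true on one BMH."""
    patch = json.dumps({"spec": {"online": True}})
    return ["oc", "patch", "bmh", name, "-n", namespace,
            "--type", "merge", "-p", patch]


def main(apply_changes=False):
    """Print the patch command for each offline BMH; run it only if asked."""
    # List BareMetalHosts across all namespaces as JSON.
    result = subprocess.run(
        ["oc", "get", "bmh", "-A", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    for namespace, name in find_offline_hosts(json.loads(result.stdout)):
        cmd = build_patch_command(namespace, name)
        print(" ".join(cmd))
        if apply_changes:
            subprocess.run(cmd, check=True)
```

Calling `main()` only prints the commands it would run (a dry run); `main(apply_changes=True)` executes them. Defaulting to a dry run is deliberate, so the list of hosts can be checked before anything is powered on.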
      
          

      Version-Release number of selected component (if applicable):

      
      OpenShift 4.18.13
      
      
          

      How reproducible:

      
          

      Steps to Reproduce:

          1. Deploy SNO clusters in parallel.
          2. Once the scale reaches 10 or more parallel deployments, some SNOs are never powered on.
          

      Actual results:

       Roughly 80% of the SNOs deploy successfully; the rest require a manual power-on using the workaround.
      
          

      Expected results:

       All SNOs power on without issues or manual intervention.
      
          

      Additional info:

      
      Ref - SF 04307628
      
          

              jpoulin Jeremy Poulin
              dacarpen@redhat.com Darren Carpenter
              Rama Kasturi Narra