OpenShift Bugs / OCPBUGS-3752

[4.10] Restart of a master resulted in creation of a new firmwareschema without removing the old ones or assigning the new one to any host's HFS


      Description of problem:

      After a restart of one of the master hosts, the metal3 pod is restarted as well. A new firmwareschema object is created, none of the old ones are removed, and the new schema is not referenced by any host's HFS (HostFirmwareSettings).

      $ oc get pods -n openshift-machine-api -o wide
      NAME                                           READY   STATUS    RESTARTS   AGE   IP             NODE                 NOMINATED NODE   READINESS GATES
      cluster-autoscaler-operator-76f76d6747-mdhw4   2/2     Running   0          15h   10.129.0.9     openshift-master-0   <none>           <none>
      cluster-baremetal-operator-c4bc97f9-7zvsg      2/2     Running   0          15h   10.129.0.5     openshift-master-0   <none>           <none>
      machine-api-controllers-64f578fff7-xq8fk       7/7     Running   0          15h   10.128.0.10    openshift-master-2   <none>           <none>
      machine-api-operator-d676654f5-ff6jk           2/2     Running   0          15h   10.129.0.7     openshift-master-0   <none>           <none>
      metal3-b56d755df-wwzmq                         9/9     Running   0          30m   10.46.29.130   openshift-master-1   <none>           <none>
      metal3-image-cache-hbw58                       1/1     Running   0          15h   10.46.29.131   openshift-master-2   <none>           <none>
      metal3-image-cache-mz4sw                       1/1     Running   1          15h   10.46.29.130   openshift-master-1   <none>           <none>
      metal3-image-cache-p8nr9                       1/1     Running   0          15h   10.46.29.129   openshift-master-0   <none>           <none>
      metal3-image-customization-666c6584bc-95bs9    1/1     Running   0          15h   10.129.0.40    openshift-master-0   <none>           <none>

      $ oc get hfs -n openshift-machine-api
      NAME                 AGE
      openshift-master-0   15h
      openshift-master-1   15h
      openshift-master-2   15h
      openshift-worker-0   15h
      openshift-worker-1   15h

      $ oc get firmwareschema -n openshift-machine-api
      NAME              AGE
      schema-49c0740e   15h
      schema-4f53cda1   61m
      schema-5179a96a   15h

      $ oc describe hfs openshift-master-1 -n openshift-machine-api |grep schem
              f:schema:
          Name:       schema-49c0740e
      $ oc describe hfs openshift-master-0 -n openshift-machine-api |grep schem
              f:schema:
          Name:       schema-49c0740e
      $ oc describe hfs openshift-master-2 -n openshift-machine-api |grep schem
              f:schema:
          Name:       schema-49c0740e
      $ oc describe hfs openshift-worker-0 -n openshift-machine-api |grep schem
              f:schema:
          Name:       schema-49c0740e
      $ oc describe hfs openshift-worker-1 -n openshift-machine-api |grep schem
              f:schema:
          Name:       schema-5179a96a
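
      To see at a glance which schema each host's HFS references (and thus which schema is orphaned), the per-host describe/grep above can be collapsed into a single jsonpath query. A sketch, assuming the reference is exposed at .status.schema.name, as the describe output suggests:

      $ oc get hfs -n openshift-machine-api -o jsonpath='{range .items[*]}{.metadata.name}{" -> "}{.status.schema.name}{"\n"}{end}'

      On this cluster it should print schema-49c0740e for every host except openshift-worker-1 (schema-5179a96a), confirming that the new schema-4f53cda1 is referenced by nothing.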

      Version-Release number of selected component (if applicable):

      4.10.0-0.nightly-2022-11-10-124234

      How reproducible:

      ~25%

      Steps to Reproduce:

      1. Deploy a cluster. Check how many firmwareschema objects were created (see the count one-liner below).
      2. Restart one of the masters.
      3. Check the number of schemas again after both the master and the metal3 pod are back in the Running state.
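
      A minimal sketch of the count used in steps 1 and 3, piping standard oc output through wc:

      $ oc get firmwareschema -n openshift-machine-api --no-headers | wc -l

      Per the AGE column above, the count here went from 2 at deployment time to 3 after the restart.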
      

      Actual results:

      A new firmwareschema is created that is not referenced by any host's HFS, and the old schemas are left in place.

      Expected results:

      No change in the number of schemas, OR the new schema should replace one of the old schemas.

      Additional info:

      1. Test was run on a real bare-metal setup (HP hardware).
      2. Will add must-gather.
      3. Going to re-run the same test to check how often it happens.
      4. Thinking about testing on other versions.
      5. Waited for about an hour in the hope the orphaned schema would be removed by some kind of garbage collection; it was not.
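
      As a possible manual cleanup (a sketch, not a fix confirmed in this bug), the orphaned schema could presumably be deleted by hand once it is verified that no HFS references it, assuming the operator does not immediately recreate it:

      $ oc delete firmwareschema schema-4f53cda1 -n openshift-machine-api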

      Assignee: Himanshu Roy (hroy@redhat.com)
      Reporter: Lubov Shilin (lshilin)
      QA Contact: Pedro Jose Amoedo Martinez
      Votes: 0
      Watchers: 5