CNV-18786

Test combination of VMs and Node Self Remediation on a Compact Cluster

    • compact-cluster-operations
      1. Cover a 3-node compact OCP cluster with CNV
      2. Define action items for the pitfalls found (docs or code changes)

    • Yellow
    • To Do
    • CNV-25892 - Fencing - Compact & FAR
    • 100
    • dev-ready, doc-ready, po-ready, px-ready, qe-ready, ux-ready
      2023-04-17: more work than expected; might slip into 4.14, but does not block 4.13.

      The epic is yellow because we expect more testing around these areas once they are fixed.

      In a 3-node cluster, when a node goes down, I see connectivi...


      Goal

      Test Node Health Check (NHC) and Self Node Remediation (SNR) on a compact cluster before our customers do.
      We should start testing as soon as the operator is available to us, even before it is released.

      Identify pitfalls that arise in a compact cluster due to differences between control plane and worker nodes.
      For example:

      • NHC/SNR cannot fence control plane nodes today
      • Are there any special considerations for networking between workers and control plane nodes?
      • Is there anything to consider for affinity due to the different node pools?
      • What are the implications of the different node pools for the update flow?

      User Stories

      • As an RHV cluster owner, I want to run OCP with CNV on a similar bare-metal (BM) footprint so that I do not need more, or more expensive, hardware.
      • As an RHV cluster owner, I want to have HA for VMs on my compact cluster so that I get functionality comparable to RHV.
      • As an RHV cluster owner, I would like to minimize the downtime of any of my VMs if a node fails.
      • As an RHV cluster owner, I would like to understand the different timeouts I can set, what the "safe" values are, and what the risks are when selecting timeouts lower than the "safe" ones.
      • As an RHV cluster owner, I would like to understand how to calculate the minimal values that are hardware dependent.
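      The timeout stories above could be explored with a small budget calculation. The following is only a rough sketch: it assumes that VM downtime after a node failure is approximately the sum of a detection time, a fencing wait, and a VM restart time. The class, the field names, and every number used with it are hypothetical placeholders, not values or settings taken from NHC, SNR, or this epic.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class FailoverTimeouts:
    """Hypothetical timeout budget (seconds) for a single node failure."""
    detection_s: float    # assumed: time until the node is declared unhealthy
    fencing_s: float      # assumed: wait until the node can be treated as down
    vm_restart_s: float   # assumed: time to reschedule and boot the VM elsewhere

    def estimated_downtime_s(self) -> float:
        # Assumption: the three phases run strictly one after another.
        return self.detection_s + self.fencing_s + self.vm_restart_s


def timeouts_below_safe(configured: FailoverTimeouts,
                        safe: FailoverTimeouts) -> List[str]:
    """Return the names of timeouts configured lower than the 'safe' baseline."""
    return [
        f.name
        for f in fields(FailoverTimeouts)
        if getattr(configured, f.name) < getattr(safe, f.name)
    ]
```

      Comparing a configured budget against a measured baseline for the same hardware would show which timeouts are riskier than the "safe" ones and how much downtime the configuration is expected to cost.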

      On a compact cluster with the Node Remediation (poison pill) operator installed and shared storage:

      • As a VM owner, I would like my VM to restart on another node within the same amount of time it takes on a non-compact cluster if the node it is running on fails.

      Non-Requirements

      • List of things not included in this epic, to alleviate any doubt raised during the grooming process.

      Notes

      • Any additional details or decisions made/needed

      Done Checklist

      Who   What                      Reference
      QE    Test plans in Polarion    https://polarion.engineering.redhat.com/polarion/#/project/CNV/workitem?id=CNV-7092
      QE    Automated tests merged    https://code.engineering.redhat.com/gerrit/c/cnv-tests/+/422251

      Dominik Holler (dholler@redhat.com)
      Fabian Deutsch (fdeutsch@redhat.com)
      Geetika Kapoor