Type: Epic
Resolution: Done
Priority: Critical
Fix Version/s: CNV v4.13.0
Affects Version/s: None
Component/s: CNV Infrastructure, CNV Install, Upgrade and Operators
Labels:

Epic Name:
compact-cluster-operations
Activity Type:
Product / Portfolio Work
Blocked:
False
Blocked Reason:
None
Ready:
False
Acceptance Criteria:
1. Cover a 3-node compact OCP cluster with CNV
2. Action items on the pitfalls (whether docs or code changes)
Current Status:
Yellow
Epic Status:
To Do
Feature Link:
VIRTSTRAT-305 - Fencing - Compact & FAR
Parent Link:
VIRTSTRAT-305 - Fencing - Compact & FAR
Hierarchy Progress Bar:

0% To Do, 0% In Progress, 100% Done
Ready-Ready:

dev-ready, doc-ready, po-ready, px-ready, qe-ready, ux-ready
Status Summary:
2023-04-17: More work than expected; might slip into 4.14, but does not block 4.13.

The epic is yellow because we expect more testing around these areas once they are fixed.

In a 3-node cluster, when a node goes down, I see connectivi...

Target Version:

CNV v4.13.0

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Goal

Test NHC/SNR on a compact cluster before our customers do.
We should start testing as soon as the operator is available to us, even before it is released.

Identify pitfalls that arise in a compact cluster due to differences between control plane and worker nodes.
For example:

NHC/SNR cannot fence control plane nodes today.
Are there any special considerations for networking between worker and control plane nodes?
Is there anything to consider for affinity due to the different node pools?
What are the implications of the different node pools for the update flow?
…

User Stories

As an RHV cluster owner, I want to run OCP with CNV on a similar BM footprint so that I do not need additional or more expensive hardware.
As an RHV cluster owner, I want HA for VMs on my compact cluster so that I get functionality comparable to RHV.
As an RHV cluster owner, I would like to minimize the downtime of any of my VMs if a node fails.
As an RHV cluster owner, I would like to understand the different timeouts I can set, which values are "safe", and what the risks are of selecting timeouts lower than the "safe" ones.
As an RHV cluster owner, I would like to understand how to calculate the minimal values that are HW dependent.

On a compact cluster with Node Remediation (poison pill) operator installed and shared storage:

As a VM owner, I would like my VM to restart on another node within the same amount of time it takes on a non-compact cluster if the node it is running on fails.

Non-Requirements

List of things not included in this epic, to alleviate any doubt raised during the grooming process.

Notes

Any additional details or decisions made/needed

Done Checklist

Who	What	Reference
QE	Test plans in Polarion	https://polarion.engineering.redhat.com/polarion/#/project/CNV/workitem?id=CNV-7092
QE	Automated tests merged	https://code.engineering.redhat.com/gerrit/c/cnv-tests/+/422251

is blocked by

CNV-26864 Watchdog unable to reboot node completely in combination to SNR/NHC

Planning

is depended on by

CNV-25645 Target additional testing for NHC and SNR with compact clusters

Closed

is related to

OCPBUGS-11277 Must gather is unable to collect all the data for Compact cluster if node is down

Closed

links to

Add fencing test cases for Nodehealthche

Node fencing and workload recovery delays

Assignee: Dominik Holler

Reporter: Fabian Deutsch

QA Contact: Geetika Kapoor

Votes: 0

Watchers: 6

Created: 2022/05/31 8:21 PM

Updated: 2025/08/07 8:50 PM

Resolved: 2023/05/15 8:55 AM
