SDN-4196: Impact of OCPBUGS-22293 [4.13] CNO fails to apply ovnkube-master daemonset during upgrade


    • Type: Spike
    • Resolution: Done
    • Priority: Critical

      We're asking the following questions to evaluate whether or not OCPBUGS-22293 warrants changing update recommendations from either the previous X.Y or X.Y.Z. The ultimate goal is to avoid recommending an update which introduces new risk or reduces cluster functionality in any way. In the absence of a declared update risk (the status quo), there is some risk that the existing fleet updates into the at-risk releases. Depending on the bug and estimated risk, leaving the update risk undeclared may be acceptable.

      Which 4.y.z to 4.y'.z' updates increase vulnerability?

      Clusters that have upgraded from 4.10 to 4.11 are vulnerable and will be affected by this issue when/if they
      eventually upgrade to 4.12.41+, 4.13.16+, or 4.14+.

      Which types of clusters?

      • clusters using the OVNKubernetes CNI that were previously upgraded from 4.10 to 4.11
        (a quick check is sketched after this list)
      • clusters that were initially installed with 4.11 or newer are not affected
      • no reports of this issue are known for clusters using the OpenShiftSDN CNI, although specific
        testing has not been done at this point
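
      A quick way to check whether a cluster matches this profile is sketched below; the commands are
      standard oc queries shown for illustration and are not taken from the bug report. Affected clusters
      report OVNKubernetes as the network type and show both 4.10.z and 4.11.z entries in their upgrade
      history.

      # confirm the cluster uses the OVNKubernetes CNI

        oc get network.config/cluster -o jsonpath='{.spec.networkType}'

      # list every version the cluster has run; look for 4.10.z and 4.11.z entries

        oc get clusterversion version -o jsonpath='{.status.history[*].version}'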

      What is the impact? Is it serious enough to warrant removing update recommendations?

      Upgrading clusters that are susceptible to this issue will become stuck during the network operator rollout,
      and the only known solution to allow the upgrade to progress is a two-step manual process.

      This will occur on all versions beyond those that introduced the changes.
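
      A stuck upgrade of this kind is visible on the network cluster operator and on the overall cluster
      version status. The checks below are a generic sketch (the exact conditions and messages vary by
      cluster) rather than output reproduced from this bug:

        oc get clusterversion

        oc get co network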

      How involved is remediation?

      Resolving this is a two-step manual process:

      • edit the ovnkube-master daemonset and remove the "lifecycle" section from the ovnkube-master
        container (which only includes the preStop hook); a scripted variant of this step is sketched
        below
      • restart the cluster-version-operator:
        'oc rollout restart deployment cluster-version-operator -n openshift-cluster-version'
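
      The first step can also be applied non-interactively with a JSON patch. This is a sketch only: the
      container index is an assumption to be confirmed on the cluster, not a value documented in the bug
      report.

      # list the container names to find the position of the ovnkube-master container

        oc get daemonset -n openshift-ovn-kubernetes ovnkube-master -o jsonpath='{.spec.template.spec.containers[*].name}'

      # remove its lifecycle section, substituting the position found above for <index>

        oc patch daemonset -n openshift-ovn-kubernetes ovnkube-master --type='json' -p='[{"op": "remove", "path": "/spec/template/spec/containers/<index>/lifecycle"}]'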

      This can be done as a pre-upgrade step instead of waiting for the upgrade to become stuck:

      # mark the network operator as unmanaged and remove the preStop hooks for the 'sbdb' and 'nbdb' containers:
      
        oc patch Network.operator.openshift.io cluster --type='merge'  -p='{"spec":{"managementState":"Unmanaged"}}'
      
        oc patch daemonset -nopenshift-ovn-kubernetes ovnkube-master --type='json' -p='[{"op": "remove", "path": "/spec/template/spec/containers/1/lifecycle/preStop"}, {"op": "remove", "path": "/spec/template/spec/containers/3/lifecycle/preStop"}]'
      
      
      # this will cause the network operator to update. wait for it to rollout
      
        oc wait co network --for='condition=PROGRESSING=False' --timeout=600s
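
      # optional check (a sketch, not part of the documented steps): confirm the preStop hooks are gone from
      # containers 1 and 3 (the indices patched above); empty output means the patch took effect

        oc get daemonset -nopenshift-ovn-kubernetes ovnkube-master -o jsonpath='{.spec.template.spec.containers[1].lifecycle.preStop} {.spec.template.spec.containers[3].lifecycle.preStop}'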
      

      Initiate the upgrade after this. Upon completion, the network operator will be moved back
      to "Managed" automatically as part of the upgrade.

      Is this a regression?

      Yes.

            Assignee: Jamo Luhrsen (jluhrsen)
            Reporter: Petr Muller (afri@afri.cz)