Type: Bug
Resolution: Done
Priority: Normal
Fix Version/s: 4.20
Affects Version/s: 4.15, 4.16, 4.17, 4.18, 4.19
Component/s: Documentation / CORENET
Labels:
None

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
5
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
OSDOCS Sprint 272, OSDOCS Sprint 273
sprint_count:
2

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description:

This is not a functional bug but a documentation gap that leads to customer confusion and potential operational impact due to unexpected reboots.

During OpenShift Container Platform (OCP) upgrades on clusters with IPsec enabled, nodes may reboot twice in succession instead of the typical single reboot. This was observed when upgrading from OCP 4.16 to 4.17, specifically from 4.16.36 to 4.17.27, and is confirmed to be expected behavior.

The first reboot occurs when the Cluster Network Operator (CNO) rolls out the IPsec machine configs. The second reboot occurs when the Machine Config Operator (MCO), after its own upgrade, rolls out the latest set of machine configs.

The double reboot was also observed during the 4.14 to 4.15 upgrade (when IPsec moved to host-based), but not during the 4.15 to 4.16 upgrade. Whether it occurs depends on whether the libreswan and NetworkManager-libreswan versions in the target OCP release differ from those in the source OCP release. If they differ, the CNO must update its machine configs, which triggers the additional reboot.
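The condition described above can be sketched as a small check. This is only an illustration of the reasoning in this issue, not CNO code: the function name, the dictionary layout, and the version strings in the usage note are hypothetical, and only the two package names come from the description.

```python
from typing import Mapping

# Package names taken from the issue description.
IPSEC_PACKAGES = ("libreswan", "NetworkManager-libreswan")


def expects_double_reboot(
    ipsec_enabled: bool,
    source_versions: Mapping[str, str],
    target_versions: Mapping[str, str],
) -> bool:
    """Return True when an upgrade is expected to reboot each node twice.

    Assumption: a double reboot happens only on IPsec-enabled clusters,
    and only when at least one IPsec package version differs between the
    source and target releases.
    """
    if not ipsec_enabled:
        return False
    return any(
        source_versions.get(pkg) != target_versions.get(pkg)
        for pkg in IPSEC_PACKAGES
    )
```

For example, calling `expects_double_reboot(True, {"libreswan": "A", "NetworkManager-libreswan": "X"}, {"libreswan": "B", "NetworkManager-libreswan": "X"})` with placeholder version strings returns `True`, because the libreswan version changes between releases.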

Customers have expressed surprise at this behavior because it does not occur on every upgrade, and every node reboot is a significant event for some customers.

Proposed Resolution:

Update the OpenShift Container Platform documentation to state clearly that double node reboots are expected behavior during upgrades of IPsec-enabled clusters, particularly when the libreswan and NetworkManager-libreswan packages are updated.

Specifically, add this information to the networking documentation under the IPsec configuration section. The suggested location is:

https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/networking/network-security#configuring-ipsec-ovn

This should be done in the OCP documentation for versions 4.15 and later.

Workaround (to achieve a single reboot):

Users can pause the worker MachineConfigPools during the cluster upgrade and unpause them at the end. This allows the CNO and MCO changes to be applied together, resulting in only a single reboot for the worker nodes.
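The pause and unpause steps could be scripted roughly as follows. This is a minimal sketch that only builds and prints the `oc patch` command lines and does not run them. It assumes the default pool name `worker` and that setting `spec.paused` on the MachineConfigPool is the intended way to pause it. The helper names are hypothetical.

```python
import json
import shlex
from typing import List


def pause_pool_command(pool: str, paused: bool) -> List[str]:
    """Build the oc command that sets spec.paused on a MachineConfigPool."""
    patch = json.dumps({"spec": {"paused": paused}})
    return [
        "oc", "patch", f"machineconfigpool/{pool}",
        "--type", "merge",
        "--patch", patch,
    ]


def single_reboot_plan(pool: str = "worker") -> List[List[str]]:
    """Return the pause and unpause commands, in order.

    The cluster upgrade itself runs between the two commands and is
    deliberately not modeled here.
    """
    return [pause_pool_command(pool, True), pause_pool_command(pool, False)]


if __name__ == "__main__":
    for argv in single_reboot_plan():
        # shlex.join quotes the JSON patch so the line can be pasted into a shell.
        print(shlex.join(argv))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere. An operator would review the output and run each command at the appropriate point in the upgrade.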

Additional Information:

A KCS article summarizing this behavior has been created: https://access.redhat.com/solutions/7124386

links to

openshift/openshift-docs#94844: OCPBUGS-57365: Documented IPSec node reboots

Assignee: Darragh Fitzmaurice

Reporter: David Coronel

Need Info From: None

Contributors: None

QA Contact: Huiran Wang

Doc Contact: None

Votes: 0

Watchers: 4

Created: 2025/06/12 4:22 AM

Updated: 2025/07/15 10:35 AM

Resolved: 2025/07/04 8:34 AM
