Type: Bug
Resolution: Won't Do
Priority: Minor
Fix Version/s: None
Affects Version/s: 4.14
Component/s: Networking / ovn-kubernetes
Labels:
- SDN-Bug-Backlog-Reduction-Lack-Of-Team-Cycles
- ipsec

Severity:
Important
Regression:
No
Release Blocker:
Rejected
Blocked:
False
Blocked Reason:

None
Release Note Type:
Known Issue

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:
PX Priority Data:

Description of problem:

After enabling and then disabling IPsec at runtime, the cluster is no longer in a healthy state. oc get clusterversion reports: Error while reconciling 4.14.0-0.nightly-2023-09-15-233408: an unknown error has occurred: MultipleErrors

Version-Release number of selected component (if applicable):

4.14.0-0.nightly-2023-09-15-233408

How reproducible:

Most times

Steps to Reproduce:

1. Install a GCP cluster without IPsec.
2. Enable IPsec in the cluster:
oc patch networks.operator.openshift.io cluster --type=merge -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":{ }}}}}'
3. Create some test pods.
4. Disable IPsec in the cluster:
oc patch networks.operator.openshift.io cluster --type=merge -p '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":null}}}}'
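
The enable and disable steps differ only in the merge-patch payload. As a minimal sketch, assuming a POSIX shell, the two payloads from the steps above can be kept in one helper so that repeated toggling uses exactly the same JSON each time. The function name ipsec_patch_payload is hypothetical and is not part of oc.

```shell
#!/bin/sh
# Hypothetical helper: print the merge-patch payload for the requested
# IPsec state. The JSON strings are copied from the reproduction steps.
ipsec_patch_payload() {
  case "$1" in
    enable)
      printf '%s\n' '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":{ }}}}}'
      ;;
    disable)
      printf '%s\n' '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"ipsecConfig":null}}}}'
      ;;
    *)
      echo "usage: ipsec_patch_payload enable|disable" >&2
      return 1
      ;;
  esac
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc patch networks.operator.openshift.io cluster --type=merge \
#     -p "$(ipsec_patch_payload enable)"
```

The helper only builds the payload; it does not wait for the network operator to finish rolling out between the enable and disable steps.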

Actual results:

# From 4.14
[weliang@weliang Test]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.14.0-0.nightly-2023-09-15-233408   True        False         5h8m    Error while reconciling 4.14.0-0.nightly-2023-09-15-233408: an unknown error has occurred: MultipleErrors
[weliang@weliang Test]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.14.0-0.nightly-2023-09-15-233408   True        False         5h8m    Error while reconciling 4.14.0-0.nightly-2023-09-15-233408: an unknown error has occurred: MultipleErrors
[weliang@weliang Test]$ oc get co   --no-headers | grep -v '.True.*False.*False'  
authentication                             4.14.0-0.nightly-2023-09-15-233408   True    False   True    12s     OAuthServerServiceEndpointsEndpointAccessibleControllerDegraded: Get "https://10.128.0.59:6443/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
console                                    4.14.0-0.nightly-2023-09-15-233408   False   False   False   20s     RouteHealthAvailable: console route is not admitted
monitoring                                 4.14.0-0.nightly-2023-09-15-233408   False   True    True    2m44s   reconciling Thanos Querier Route failed: updating Route object failed: the server is currently unable to handle the request (put routes.route.openshift.io thanos-querier), deleting UserWorkload federate Route failed: the server is currently unable to handle the request (delete routes.route.openshift.io federate), reconciling Prometheus Federate Route failed: updating Route object failed: the server is currently unable to handle the request (put routes.route.openshift.io prometheus-k8s-federate)
[weliang@weliang Test]$ 

# From 4.13
[weliang@weliang verification-tests]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.13   True        False         5h11m   Error while reconciling 4.13.13: an unknown error has occurred: MultipleErrors
[weliang@weliang verification-tests]$ oc get co   --no-headers | grep -v '.True.*False.*False'  
authentication                             4.13.13   True    False   True    29s     OAuthServerServiceEndpointsEndpointAccessibleControllerDegraded: Get "https://10.128.0.40:6443/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
monitoring                                 4.13.13   False   True    True    25s     reconciling Alertmanager Route failed: updating Route object failed: the server is currently unable to handle the request (put routes.route.openshift.io alertmanager-main), deleting UserWorkload federate Route failed: the server is currently unable to handle the request (delete routes.route.openshift.io federate), reconciling Prometheus Federate Route failed: updating Route object failed: the server is currently unable to handle the request (put routes.route.openshift.io prometheus-k8s-federate)
[weliang@weliang verification-tests]$
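
The grep filter used above matches the pattern anywhere in the row, so it could in principle be misled by text in the message column. A stricter variant, sketched here as an assumption rather than the reporter's method, checks the AVAILABLE, PROGRESSING, and DEGRADED columns by position (fields 3 to 5 of oc get co --no-headers). The function name unhealthy_operators is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get co --no-headers` output on stdin and
# print name, AVAILABLE, PROGRESSING, DEGRADED for every operator that is
# not exactly True / False / False.
unhealthy_operators() {
  awk '$3 != "True" || $4 != "False" || $5 != "False" { print $1, $3, $4, $5 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get co --no-headers | unhealthy_operators
```

An empty result means every operator reports Available=True, Progressing=False, Degraded=False.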

Expected results:

The cluster should remain in a healthy state.

Additional info:

The issue occurs on both 4.14 and 4.13.

must-gather:
https://people.redhat.com/~weliang/must-gather-4.14.tar.gz
https://people.redhat.com/~weliang/must-gather-4.13.tar.gz

Assignee: Yuval Kashtan

Reporter: Weibin Liang

QA Contact: Weibin Liang

Need Info From: Weibin Liang

Votes: 0

Watchers: 7

Created: 2023/09/19 6:45 PM

Updated: 2024/01/26 11:08 AM

Resolved: 2024/01/24 5:36 PM
