Bug
Resolution: Unresolved
Critical
4.19.0
Quality / Stability / Reliability
False
Important
Rejected
Done
Release Note Not Required
N/A
Description of problem:
The e2e-aws-ovn-ipsec-upgrade CI lane does not pass consistently. Most failures show the API server pod failing to connect to a metrics API server endpoint for some period during the upgrade. This looks like a pod-to-pod connectivity issue between two particular nodes while the nodes are rebooting. The journal and pluto logs appear clean. We need to investigate whether any ip xfrm state or policy is missing during that period, or whether the libreswan 5.2 bump introduced a regression.

Failing test:
[sig-instrumentation] disruption/metrics-api connection/new should be available throughout the test 0s
{ backend-disruption-name/metrics-api-new-connections connection/new disruption/openshift-tests was unreachable during disruption: for at least 1m3s (maxAllowed=6s):
P99 from historical data for similar jobs over past 3 weeks: 0s
rounded P99 up to always allow one second
added an additional 5s of grace

Apr 22 23:18:29.808 - 1s E backend-disruption-name/metrics-api-new-connections connection/new disruption/openshift-tests reason/DisruptionBegan request-audit-id/0c1fe229-54bc-4a00-9938-339b49dd08f6 backend-disruption-name/metrics-api-new-connections connection/new disruption/openshift-tests stopped responding to GET requests over new connections: error running request: 503 Service Unavailable: error trying to reach service: context deadline exceeded
Apr 22 23:18:38.809 - 999ms E backend-disruption-name/metrics-api-new-connections connection/new disruption/openshift-tests reason/DisruptionBegan request-audit-id/e729d243-f23f-4d85-9052-f500e75e4e9c backend-disruption-name/metrics-api-new-connections connection/new disruption/openshift-tests stopped responding to GET requests over new connections: error running request: 503 Service Unavailable: error trying to reach service: context deadline exceeded
....
Apr 22 23:23:22.808 - 1s E backend-disruption-name/metrics-api-new-connections connection/new disruption/openshift-tests reason/DisruptionBegan request-audit-id/20ea7c5c-bb98-499f-8856-58462e2dbdbf backend-disruption-name/metrics-api-new-connections connection/new disruption/openshift-tests stopped responding to GET requests over new connections: error running request: 503 Service Unavailable: error trying to reach service: context deadline exceeded
Apr 22 23:23:33.809 - 2s E backend-disruption-name/metrics-api-new-connections connection/new disruption/openshift-tests reason/DisruptionBegan request-audit-id/b4eb2400-79ec-40aa-b840-8af28f42bebf backend-disruption-name/metrics-api-new-connections connection/new disruption/openshift-tests stopped responding to GET requests over new connections: error running request: 503 Service Unavailable: error trying to reach service: context deadline exceeded}
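To line up the disruption intervals above with node reboot times, it may help to total them per job run. The following is a minimal sketch, not part of the CI tooling: it assumes each interval line has the shape "<timestamp> - <duration> <level> <message>" as in the output above, and the helper names (parse_duration, parse_interval, total_disruption_seconds) are hypothetical. Because the pasted output is elided ("...."), the sum of the lines shown here will not reach the reported 1m3s.

```python
import re

# Seconds per unit for Go-style durations such as "999ms", "2s", "1m3s".
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
# "ms" is listed before "m" so that "999ms" is not read as minutes.
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
# Assumed line shape: "Apr 22 23:18:29.808 - 1s E <message>".
_INTERVAL = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d+) - (\S+) (\w) (.*)$")


def parse_duration(text):
    """Convert a duration like '1m3s' or '999ms' to seconds."""
    parts = _DURATION_PART.findall(text)
    # Reject input that is not made up entirely of number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(n) * _UNIT_SECONDS[u] for n, u in parts)


def parse_interval(line):
    """Return a dict for one interval line, or None if the line does not match."""
    match = _INTERVAL.match(line.strip())
    if match is None:
        return None
    start, duration, level, message = match.groups()
    try:
        seconds = parse_duration(duration)
    except ValueError:
        return None
    return {"start": start, "seconds": seconds, "level": level, "message": message}


def total_disruption_seconds(lines):
    """Sum the durations of all lines that parse as disruption intervals."""
    intervals = (parse_interval(line) for line in lines)
    return sum(i["seconds"] for i in intervals if i is not None)


if __name__ == "__main__":
    # Two interval lines from the report, with the message shortened for width.
    sample = [
        "Apr 22 23:18:29.808 - 1s E backend-disruption-name/metrics-api-new-connections reason/DisruptionBegan",
        "Apr 22 23:18:38.809 - 999ms E backend-disruption-name/metrics-api-new-connections reason/DisruptionBegan",
        "....",
    ]
    print(f"total disruption: {total_disruption_seconds(sample):.3f}s")
```

The start timestamps returned by parse_interval can then be compared against the reboot window of the two affected nodes, which is where a missing ip xfrm state or policy would be expected to show up.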
Version-Release number of selected component (if applicable):
https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_cluster-network-operator/2573/pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-ipsec-upgrade/1914778546035757056
https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_cluster-network-operator/2674/pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-ipsec-upgrade/1914778428616216576
https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_release/63667/rehearse-63667-periodic-ci-openshift-release-master-nightly-4.19-e2e-aws-ovn-ipsec-upgrade/1911787277483249664
https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_release/63904/rehearse-63904-periodic-ci-openshift-release-master-nightly-4.19-e2e-aws-ovn-ipsec-upgrade/1912386358903574528
How reproducible:
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info:
- blocks: OCPBUGS-55809 e2e-aws-ovn-ipsec-upgrade job is failing with disruptive events (Closed)
- is cloned by: OCPBUGS-55809 e2e-aws-ovn-ipsec-upgrade job is failing with disruptive events (Closed)
- links to