-
Bug
-
Resolution: Can't Do
-
Major
-
None
-
4.18
-
None
-
None
-
Rejected
-
False
-
This surfaces in the following tests:
[sig-node] node-lifecycle detects unexpected not ready node
[sig-node] node-lifecycle detects unreachable state on node
In this job, we see:
node/ci-op-jxk5kmn5-d8a6e-dgknq-master-2 - reason/UnexpectedUnreachable unexpected node unreachable at from: 2024-11-18 05:39:24.934825905 +0000 UTC m=+4531.474291883 - to: 2024-11-18 05:39:24.934825905 +0000 UTC m=+4531.474291883}
Fifteen seconds later, we also see an event reporting that the API is unreachable.
We see similar problems in https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-upgrade/1859160753831940096
In this job, we see:
node/ci-op-c53845j5-431b2-9596s-worker-centralus2-2k8sd - reason/UnexpectedNotReady unexpected node not ready at from: 2024-11-20 10:50:50.757173918 +0000 UTC m=+601.172900086 - to: 2024-11-20 10:50:50.757173918 +0000 UTC m=+601.172900086}
At 10:50, we also see kubelet lease failures and apiserver unavailability. We would like to determine whether the issue lies in the kubelet, the apiserver, or the load balancer.
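
To help line up these node events with the kubelet lease failures and apiserver unavailability, a small script along the following lines could extract the node, reason, and timestamp from each interval message and compute a time window to search in the other logs. This is only a sketch: the names parse_interval and correlation_window and the 30-second default are hypothetical and not part of any existing tooling, and the regular expression assumes the message format quoted above with UTC timestamps.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed message shape, taken from the excerpts quoted in this ticket:
#   node/<name> - reason/<Reason> <text> from: <YYYY-MM-DD HH:MM:SS.nnnnnnnnn> +0000 UTC ...
INTERVAL_RE = re.compile(
    r"node/(?P<node>\S+) - reason/(?P<reason>\S+) .*?"
    r"from: (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.(?P<frac>\d+))? \+0000 UTC"
)


def parse_interval(line):
    """Return node, reason, and start time for one interval message, or None."""
    m = INTERVAL_RE.search(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
    # The log carries nanoseconds; datetime only holds microseconds, so truncate.
    frac = (m.group("frac") or "0")[:6].ljust(6, "0")
    ts = ts.replace(microsecond=int(frac), tzinfo=timezone.utc)
    return {"node": m.group("node"), "reason": m.group("reason"), "from": ts}


def correlation_window(ts, seconds=30):
    """Return (start, end) around ts; the 30 s default is an arbitrary choice."""
    delta = timedelta(seconds=seconds)
    return ts - delta, ts + delta


if __name__ == "__main__":
    sample = (
        "node/ci-op-jxk5kmn5-d8a6e-dgknq-master-2 - reason/UnexpectedUnreachable "
        "unexpected node unreachable at from: 2024-11-18 05:39:24.934825905 +0000 UTC "
        "m=+4531.474291883 - to: 2024-11-18 05:39:24.934825905 +0000 UTC m=+4531.474291883}"
    )
    event = parse_interval(sample)
    start, end = correlation_window(event["from"])
    print(event["node"], event["reason"], start.isoformat(), end.isoformat())
```

The resulting window can then be used to filter kubelet, apiserver, and load-balancer logs to the seconds around each event, which should make it easier to see which component failed first.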