OpenShift Bugs / OCPBUGS-44757

[sig-node] node-lifecycle detects unexpected not ready node failing on azure serial and upgrade jobs


    • Type: Bug
    • Resolution: Can't Do
    • Priority: Major
    • 4.18
    • Rejected

      Surfaces in tests:
      [sig-node] node-lifecycle detects unexpected not ready node
      [sig-node] node-lifecycle detects unreachable state on node

      https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-techpreview-serial/1858341879754526720

      In this job, we see:

      node/ci-op-jxk5kmn5-d8a6e-dgknq-master-2 - reason/UnexpectedUnreachable unexpected node unreachable at from: 2024-11-18 05:39:24.934825905 +0000 UTC m=+4531.474291883 - to: 2024-11-18 05:39:24.934825905 +0000 UTC m=+4531.474291883}
       
      Fifteen seconds later, we also see an event reporting that the API is unreachable.

      We see similar problems in https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-upgrade/1859160753831940096

      In this job, we see:

      node/ci-op-c53845j5-431b2-9596s-worker-centralus2-2k8sd - reason/UnexpectedNotReady unexpected node not ready at from: 2024-11-20 10:50:50.757173918 +0000 UTC m=+601.172900086 - to: 2024-11-20 10:50:50.757173918 +0000 UTC m=+601.172900086}

      At 10:50, we also see kubelet lease failures and apiserver unavailability. We would like to determine whether the fault lies in the kubelet, the apiserver, or the load balancer.
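
      To line these events up against kubelet and apiserver logs, it can help to pull the node, reason, and start time out of each event line. The following is a minimal sketch, not part of the CI tooling: the regular expression and the `parse_event` helper are hypothetical and assume only the line format quoted in the two jobs above. Sub-second precision is dropped.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern, written against the two event lines quoted in this
# report; it is an assumption, not the monitor's documented output format.
EVENT_RE = re.compile(
    r"node/(?P<node>\S+) - reason/(?P<reason>\S+) .*? at from: "
    r"(?P<start>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.\d+)? \+0000 UTC"
)


def parse_event(line):
    """Return node, reason, and UTC start time (whole seconds), or None."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    start = datetime.strptime(match.group("start"), "%Y-%m-%d %H:%M:%S")
    return {
        "node": match.group("node"),
        "reason": match.group("reason"),
        "start": start.replace(tzinfo=timezone.utc),
    }


# The event line from the upgrade job, shortened after the first timestamp.
line = (
    "node/ci-op-c53845j5-431b2-9596s-worker-centralus2-2k8sd - "
    "reason/UnexpectedNotReady unexpected node not ready at from: "
    "2024-11-20 10:50:50.757173918 +0000 UTC m=+601.172900086"
)
event = parse_event(line)
```

      The parsed `start` value can then be compared with timestamps from kubelet lease errors or apiserver availability events in the same minute.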

              Ben Bennett (bbennett@redhat.com)
              Kevin Hannon (rh-ee-kehannon)
              Cameron Meadors