OpenShift Bugs / OCPBUGS-55667

aws/edge failing e2e test: [sig-network][Feature:tap] should create a pod with a tap interface


      This is a clone of issue OCPBUGS-54700. The following is the description of the original issue:

      Description of problem:

      The following e2e test fails permanently in AWS edge zones: "[sig-network][Feature:tap] should create a pod with a tap interface". This affects CI jobs, especially those for edge zones [0].
      
      The same failure is also observed in the partner ecosystem while validating OpenShift clusters on platform type External [1].
      
      What the two environments have in common is that the first worker node in the list has a NoSchedule taint. The test ignores the taint and targets that hard-coded [1] worker node, so the pod is never scheduled.

      Version-Release number of selected component (if applicable):

          4.15+

      How reproducible:

          Always in AWS edge zone installations, or in any cluster where the first worker node in the list has a NoSchedule taint.

      Steps to Reproduce:

          1. Install a cluster on AWS.
          2. Apply a NoSchedule taint to the first worker node in the list.
          3. Run the e2e test "[sig-network][Feature:tap] should create a pod with a tap interface [apigroup:k8s.cni.cncf.io] [Suite:openshift/conformance/parallel]".

      Actual results:

      The test pod is never scheduled:
                      lastTransitionTime: "2025-04-06T01:47:26Z"
                      message: '0/8 nodes are available: 2 node(s) had untolerated taint {node-role.kubernetes.io/edge:
                        }, 3 node(s) didn''t match Pod''s node affinity/selector, 3 node(s) had untolerated
                        taint {node-role.kubernetes.io/master: }. preemption: 0/8 nodes are available:
                        8 Preemption is not helpful for scheduling..'
      

      Expected results:

          The test selects a node that the pod can be scheduled on, instead of always using the first node in the list.
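
      A minimal sketch of the expected selection logic follows. It is not the actual test code: `Node`, `Taint`, and `firstSchedulableWorker` are hypothetical, trimmed-down stand-ins for the Kubernetes node types, and the sketch assumes that workers carry the `node-role.kubernetes.io/worker` label. It skips any worker with a NoSchedule or NoExecute taint and ignores other conditions such as cordoned nodes.

```go
package main

import "fmt"

// Taint and Node are hypothetical, simplified stand-ins for the
// Kubernetes core/v1 types; only the fields needed here are modeled.
type Taint struct {
	Key    string
	Effect string // "NoSchedule", "PreferNoSchedule", or "NoExecute"
}

type Node struct {
	Name   string
	Labels map[string]string
	Taints []Taint
}

// Assumption: worker nodes are identified by this role label.
const workerLabel = "node-role.kubernetes.io/worker"

// blocksScheduling reports whether a taint keeps a pod that has no
// matching toleration off the node.
func blocksScheduling(t Taint) bool {
	return t.Effect == "NoSchedule" || t.Effect == "NoExecute"
}

// firstSchedulableWorker returns the first worker node that has no
// blocking taint, instead of blindly returning the first worker.
func firstSchedulableWorker(nodes []Node) (Node, bool) {
	for _, n := range nodes {
		if _, isWorker := n.Labels[workerLabel]; !isWorker {
			continue
		}
		schedulable := true
		for _, t := range n.Taints {
			if blocksScheduling(t) {
				schedulable = false
				break
			}
		}
		if schedulable {
			return n, true
		}
	}
	return Node{}, false
}

func main() {
	nodes := []Node{
		{
			Name:   "edge-a",
			Labels: map[string]string{workerLabel: ""},
			Taints: []Taint{{Key: "node-role.kubernetes.io/edge", Effect: "NoSchedule"}},
		},
		{
			Name:   "master-0",
			Labels: map[string]string{"node-role.kubernetes.io/master": ""},
			Taints: []Taint{{Key: "node-role.kubernetes.io/master", Effect: "NoSchedule"}},
		},
		{
			Name:   "worker-b",
			Labels: map[string]string{workerLabel: ""},
		},
	}

	if n, ok := firstSchedulableWorker(nodes); ok {
		fmt.Println("selected node:", n.Name) // prints "selected node: worker-b"
	} else {
		fmt.Println("no schedulable worker node found")
	}
}
```

      With a first-in-list selection, `edge-a` would be chosen and the pod would stay Pending; filtering on taints picks `worker-b` here. An alternative would be to add tolerations to the test pod, but that changes where the test runs rather than which node it picks.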

      Additional info:

      [0] Permanently failing aws-edge job, caused by a single test with static worker selection logic:
      https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-ovn-shared-vpc-edge-zones/1908671470251282432
      
      https://prow.ci.openshift.org/job-history/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-ovn-shared-vpc-edge-zones?buildId=
      
      [1] Experiments in partner enablement that identified the root cause: https://issues.redhat.com/browse/OPCT-277
      
      More CI failures: https://search.dptools.openshift.org/?search=should+create+a+pod+with+a+tap+interface&maxAge=48h&context=1&type=bug%2Bissue%2Bjunit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job 

              rhn-support-mrbraga Marco Braga
              openshift-crt-jira-prow OpenShift Prow Bot
              Anurag Saxena Anurag Saxena