OpenShift Bugs / OCPBUGS-37651

pod network availability poller should note if openshift-tests version is mismatched


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • Affects Version/s: 4.17
    • Component/s: Test Framework
    • Quality / Stability / Reliability

      The pod-network-availability tests do not pass on HyperShift, at least not on ROSA HCP.

       

      Some additional context is in this Slack thread: https://redhat-internal.slack.com/archives/C02LM9FABFW/p1719504170756849?thread_ts=1715974242.700929&cid=C02LM9FABFW

       

      Reproducer:

      • Ask cluster-bot for a ROSA cluster (rosa create 4.15 6h)
      • Run tests like the commands below and note that pod-network-availability monitor collection fails
      # Create a service account and a token-based kubeconfig for the cluster
      $ TMPDIR=$(mktemp -d)
      $ cd $TMPDIR
      $ oc create serviceaccount cni-conformance -n default
      $ oc adm policy add-cluster-role-to-user cluster-admin -z cni-conformance -n default
      $ KUBECONFIG=$(pwd)/kubeconfig.yaml oc login --token="$(oc create token cni-conformance)" --server=$(oc config view --minify --output jsonpath="{.clusters[*].cluster.server}") --insecure-skip-tls-verify
      # Resolve the tests image from a release payload (a 4.16 tests image run against the 4.15 cluster)
      $ oc adm release info --image-for=tests registry.ci.openshift.org/ocp/release:4.16.0-0.nightly-2024-07-27-075008
      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:398a8756d59ff44f730bd2dd6b62e57ab2507b762442aba34b817d36e76f86e2
      # Run the third-party network conformance suite from that image
      $ podman run --authfile=$HOME/pull.json -v "$(pwd):/data:z" -w /data --rm -it quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:398a8756d59ff44f730bd2dd6b62e57ab2507b762442aba34b817d36e76f86e2 sh -c "KUBECONFIG=/data/kubeconfig.yaml /usr/bin/openshift-tests run openshift/network/third-party -o /data/results.txt"
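
      Per the summary, one way to make this easier to diagnose would be for the poller to log its own build version at startup, so the openshift-tests binary that collects poller logs could flag when the poller image and the test binary come from different releases (e.g. a 4.16 tests binary against a 4.15 cluster, as in the reproducer). A minimal sketch of that idea follows; the function names and log format are illustrative assumptions, not the current poller code.

      // Hypothetical sketch: log the poller binary's build metadata at startup so
      // a version skew with the driving openshift-tests binary shows up in the
      // collected pod logs. Names and log wording here are illustrative.
      package main

      import (
              "fmt"
              "runtime/debug"
      )

      // logPollerVersion prints the module version and VCS revision of the running
      // binary; the log-collection test could compare this against its own version
      // instead of only reporting that sampler output was missing.
      func logPollerVersion() {
              info, ok := debug.ReadBuildInfo()
              if !ok {
                      fmt.Println("poller build info unavailable")
                      return
              }
              revision := "unknown"
              for _, setting := range info.Settings {
                      if setting.Key == "vcs.revision" {
                              revision = setting.Value
                      }
              }
              fmt.Printf("poller version: module=%s revision=%s\n", info.Main.Version, revision)
      }

      func main() {
              logPollerVersion()
              // ... start the existing PollService controller here ...
      }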

      Logs:

      <testcase name="[sig-network] can collect pod-to-service poller pod logs" time="0">
          <failure message="">
              2 pods lacked sampler output: [pod-network-to-service-disruption-poller-7dfc77c96d-5lfkm, pod-network-to-service-disruption-poller-7dfc77c96d-dl5np]
          </failure>
          <system-out>
              Logs for -n e2e-pod-network-disruption-test-zvv8h pod/pod-network-to-service-disruption-poller-7dfc77c96d-5lfkm
              Initializing to watch clusterIP 172.30.190.103:80
              Initializing to watch clusterIP 172.30.190.103:80
              Watching configmaps...
              I0627 15:35:34.507299 1 service_controller.go:168] "Starting PollService controller"
              I0627 15:35:34.507468 1 shared_informer.go:311] Waiting for caches to sync for ServicePoller
              I0627 15:35:34.608352 1 shared_informer.go:318] Caches are synced for ServicePoller
              Adding and starting: http://172.30.190.103:80 on node/ip-10-0-1-228.ec2.internal
              Successfully started: http://172.30.190.103:80 on node/ip-10-0-1-228.ec2.internal
              Stopping and removing: 172.30.190.103 for node/ip-10-0-1-228.ec2.internal
              waiting for consumer to finish {Disruption map[backend-disruption-name:pod-to-service-new-connections connection:new disruption:pod-to-service-to-service-from-node-ip-10-0-1-228.ec2.internal-to-clusterIP-172.30.190.103]}...
              {"level":"Info","locator":"backend-disruption-name/pod-to-service-new-connections connection/new disruption/pod-to-service-to-service-from-node-ip-10-0-1-228.ec2.internal-to-clusterIP-172.30.190.103","message":"backend-disruption-name/pod-to-service-new-connections connection/new disruption/pod-to-service-to-service-from-node-ip-10-0-1-228.ec2.internal-to-clusterIP-172.30.190.103 started responding to GET requests over new connections","tempStructuredLocator":{"type":"","keys":null},"tempStructuredMessage":{"reason":"","cause":"","humanMessage":"","annotations":null},"from":"2024-06-27T15:35:34Z","to":"2024-06-27T15:41:25Z"}
              consumer finished {Disruption map[backend-disruption-name:pod-to-service-new-connections connection:new disruption:pod-to-service-to-service-from-node-ip-10-0-1-228.ec2.internal-to-clusterIP-172.30.190.103]}
              waiting for consumer to finish {Disruption map[backend-disruption-name:pod-to-service-reused-connections connection:reused disruption:pod-to-service-to-service-from-node-ip-10-0-1-228.ec2.internal-to-clusterIP-172.30.190.103]}...
              {"level":"Info","locator":"backend-disruption-name/pod-to-service-reused-connections connection/reused disruption/pod-to-service-to-service-from-node-ip-10-0-1-228.ec2.internal-to-clusterIP-172.30.190.103","message":"backend-disruption-name/pod-to-service-reused-connections connection/reused disruption/pod-to-service-to-service-from-node-ip-10-0-1-228.ec2.internal-to-clusterIP-172.30.190.103 started responding to GET requests over reused connections","tempStructuredLocator":{"type":"","keys":null},"tempStructuredMessage":{"reason":"","cause":"","humanMessage":"","annotations":null},"from":"2024-06-27T15:35:34Z","to":"2024-06-27T15:41:25Z"}
              consumer finished {Disruption map[backend-disruption-name:pod-to-service-reused-connections connection:reused disruption:pod-to-service-to-service-from-node-ip-10-0-1-228.ec2.internal-to-clusterIP-172.30.190.103]}
              Stopped all watchers
              E0627 15:41:25.629887 1 disruption_backend_sampler.go:496] not finished writing all samples (1 remaining), but we're told to close
              E0627 15:41:25.629981 1 disruption_backend_sampler.go:496] not finished writing all samples (1 remaining), but we're told to close

              Logs for -n e2e-pod-network-disruption-test-zvv8h pod/pod-network-to-service-disruption-poller-7dfc77c96d-dl5np
              Initializing to watch clusterIP 172.30.190.103:80
              Initializing to watch clusterIP 172.30.190.103:80
              Watching configmaps...
              I0627 15:35:34.166675 1 service_controller.go:168] "Starting PollService controller"
              I0627 15:35:34.166788 1 shared_informer.go:311] Waiting for caches to sync for ServicePoller
              Adding and starting: http://172.30.190.103:80 on node/ip-10-0-1-176.ec2.internal
              Successfully started: http://172.30.190.103:80 on node/ip-10-0-1-176.ec2.internal
              I0627 15:35:34.267207 1 shared_informer.go:318] Caches are synced for ServicePoller
              Stopping and removing: 172.30.190.103 for node/ip-10-0-1-176.ec2.internal
              waiting for consumer to finish {Disruption map[backend-disruption-name:pod-to-service-new-connections connection:new disruption:pod-to-service-to-service-from-node-ip-10-0-1-176.ec2.internal-to-clusterIP-172.30.190.103]}...
              {"level":"Info","locator":"backend-disruption-name/pod-to-service-new-connections connection/new disruption/pod-to-service-to-service-from-node-ip-10-0-1-176.ec2.internal-to-clusterIP-172.30.190.103","message":"backend-disruption-name/pod-to-service-new-connections connection/new disruption/pod-to-service-to-service-from-node-ip-10-0-1-176.ec2.internal-to-clusterIP-172.30.190.103 started responding to GET requests over new connections","tempStructuredLocator":{"type":"","keys":null},"tempStructuredMessage":{"reason":"","cause":"","humanMessage":"","annotations":null},"from":"2024-06-27T15:35:34Z","to":"2024-06-27T15:41:26Z"}
              consumer finished {Disruption map[backend-disruption-name:pod-to-service-new-connections connection:new disruption:pod-to-service-to-service-from-node-ip-10-0-1-176.ec2.internal-to-clusterIP-172.30.190.103]}
              waiting for consumer to finish {Disruption map[backend-disruption-name:pod-to-service-reused-connections connection:reused disruption:pod-to-service-to-service-from-node-ip-10-0-1-176.ec2.internal-to-clusterIP-172.30.190.103]}...
              {"level":"Info","locator":"backend-disruption-name/pod-to-service-reused-connections connection/reused disruption/pod-to-service-to-service-from-node-ip-10-0-1-176.ec2.internal-to-clusterIP-172.30.190.103","message":"backend-disruption-name/pod-to-service-reused-connections connection/reused disruption/pod-to-service-to-service-from-node-ip-10-0-1-176.ec2.internal-to-clusterIP-172.30.190.103 started responding to GET requests over reused connections","tempStructuredLocator":{"type":"","keys":null},"tempStructuredMessage":{"reason":"","cause":"","humanMessage":"","annotations":null},"from":"2024-06-27T15:35:34Z","to":"2024-06-27T15:41:26Z"}
              consumer finished {Disruption map[backend-disruption-name:pod-to-service-reused-connections connection:reused disruption:pod-to-service-to-service-from-node-ip-10-0-1-176.ec2.internal-to-clusterIP-172.30.190.103]}
              Stopped all watchers
          </system-out>
      </testcase>
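
      The failure above only says that two pods "lacked sampler output". Per the bug summary, the collection test could additionally note when a version mismatch is the likely cause, for example by comparing a version line logged by the poller (as in the sketch above) against the test binary's own version. A rough sketch under that assumption, with hypothetical names:

      package main

      import (
              "fmt"
              "strings"
      )

      // noteVersionMismatch scans collected poller logs for the hypothetical
      // "poller version:" line and reports when it differs from the version of
      // the openshift-tests binary doing the collection.
      func noteVersionMismatch(pollerLog, testVersion string) string {
              for _, line := range strings.Split(pollerLog, "\n") {
                      if strings.HasPrefix(line, "poller version:") {
                              pollerVersion := strings.TrimSpace(strings.TrimPrefix(line, "poller version:"))
                              if pollerVersion != testVersion {
                                      return fmt.Sprintf("openshift-tests version %q does not match poller version %q; sampler output may not be readable", testVersion, pollerVersion)
                              }
                              return ""
                      }
              }
              return "poller did not log a version; it may be older than the openshift-tests binary"
      }

      func main() {
              // Example: a 4.15 poller driven by a 4.16 openshift-tests binary.
              fmt.Println(noteVersionMismatch("poller version: module=v4.15 revision=abc1234", "module=v4.16 revision=def5678"))
      }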

              Assignee: Stephen Benjamin (stbenjam)
              Reporter: Stephen Benjamin (stbenjam)