-
Bug
-
Resolution: Duplicate
-
Critical
Description of problem:
The CNCF tests are failing consistently, but the cause is not yet known. The sonobuoy process executes and its sub-tasks (one per host) register completion. However, the global task remains in a "running" state until Prow eventually kills the job.
Actual results:
Sonobuoy output:
{
  "plugins": [
    {
      "plugin": "e2e",
      "node": "global",
      "status": "running",
      "result-status": "",
      "result-counts": null,
      "progress": {
        "name": "e2e",
        "node": "global",
        "timestamp": "2025-03-18T03:25:49.305582478Z",
        "msg": "",
        "total": 404,
        "completed": 0
      }
    },
    {
      "plugin": "systemd-logs",
      "node": "el94-src-cncf-conformance-host1",
      "status": "complete",
      "result-status": "",
      "result-counts": null
    },
    {
      "plugin": "systemd-logs",
      "node": "el94-src-cncf-conformance-host2",
      "status": "complete",
      "result-status": "",
      "result-counts": null
    }
  ],
  "status": "running",
  "tar-info": {
    "name": "",
    "created": "0001-01-01T00:00:00Z",
    "sha256": "",
    "size": 0
  }
}
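The status document above can also be checked programmatically, which may help when triaging many failed runs. The following is a minimal sketch, assuming the JSON has the shape shown in this ticket. The function name `unfinished_plugins` and the abbreviated `STATUS_JSON` sample are illustrative and not part of sonobuoy.

```python
import json

# Abbreviated copy of the status document from this ticket (only the fields used below).
STATUS_JSON = """
{
  "plugins": [
    {"plugin": "e2e", "node": "global", "status": "running",
     "progress": {"name": "e2e", "node": "global", "total": 404, "completed": 0}},
    {"plugin": "systemd-logs", "node": "el94-src-cncf-conformance-host1", "status": "complete"},
    {"plugin": "systemd-logs", "node": "el94-src-cncf-conformance-host2", "status": "complete"}
  ],
  "status": "running"
}
"""


def unfinished_plugins(status):
    """Return (plugin, node, completed, total) for each plugin whose status is not 'complete'."""
    pending = []
    for entry in status.get("plugins", []):
        if entry.get("status") != "complete":
            # The 'progress' object is only present for some plugins, so default to empty.
            progress = entry.get("progress") or {}
            pending.append(
                (entry["plugin"], entry["node"],
                 progress.get("completed"), progress.get("total"))
            )
    return pending


status = json.loads(STATUS_JSON)
for plugin, node, completed, total in unfinished_plugins(status):
    print(f"{plugin} on {node}: {completed}/{total} reported complete")
```

For the output in this ticket, the only unfinished entry is the global e2e plugin, with 0 of 404 tests reported complete.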
Sonobuoy is run inside a pod. The logs don't indicate any errors. Below is a log line that recurs throughout the CNCF failures:
2025-03-18T01:25:45.087568043-04:00 stdout F Plugin is complete. Sleeping indefinitely to avoid container exit and automatic restarts from Kubernetes
Here is the same log line in a bit more context:
2025-03-17T23:25:47.776819804-04:00 stderr F time="2025-03-18T03:25:47Z" level=trace msg="Invoked command single-node with args [] and flags [level=trace logtostderr=true sleep=-1 v=6]"
2025-03-17T23:25:47.777429876-04:00 stderr F time="2025-03-18T03:25:47Z" level=info msg="Waiting for waitfile" waitfile=/tmp/sonobuoy/results/done
2025-03-17T23:25:47.777529041-04:00 stderr F time="2025-03-18T03:25:47Z" level=info msg="Starting to listen on port 8099 for progress updates and will relay them to https://[10.42.1.6]:8080/api/v1/progress/by-node/el94-src-cncf-conformance-host1/systemd-logs"
2025-03-17T23:25:48.777653263-04:00 stderr F time="2025-03-18T03:25:48Z" level=trace msg="Detected done file but sleeping for 5s then checking again for file. This allows other containers to intervene if desired."
2025-03-17T23:25:53.778970966-04:00 stderr F time="2025-03-18T03:25:53Z" level=info msg="Detected done file, transmitting result file" resultFile=/tmp/sonobuoy/results/systemd_logs
2025-03-17T23:25:53.810797001-04:00 stderr F time="2025-03-18T03:25:53Z" level=info msg="Results transmitted to aggregator. Sleeping forever."
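Each log line above carries two timestamps: a leading one with a -04:00 offset and nanosecond precision, and an embedded `time=` value in UTC. When correlating these logs with the `timestamp` field in the Sonobuoy status output, it can help to normalize everything to UTC first. The following is a minimal sketch; the helper name `to_utc` is illustrative, and it truncates fractional seconds to microseconds because Python's `datetime` does not store nanoseconds.

```python
import re
from datetime import datetime, timezone


def to_utc(ts):
    """Parse an RFC 3339 timestamp and return it as a timezone-aware UTC datetime."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat.
    ts = ts.replace("Z", "+00:00")
    # Keep at most 6 fractional digits (microseconds); extra digits are dropped.
    ts = re.sub(r"(\.\d{6})\d+", r"\1", ts)
    return datetime.fromisoformat(ts).astimezone(timezone.utc)


# Leading timestamp and embedded time= value from the first log line in this ticket.
outer = to_utc("2025-03-17T23:25:47.776819804-04:00")
inner = to_utc("2025-03-18T03:25:47Z")

print(outer.isoformat())
print(inner.isoformat())
print((outer - inner).total_seconds())
```

Both values refer to the same second; the embedded `time=` field simply has one-second resolution.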
Is related to:
USHIFT-5588 CNCF tests are timing out (sometimes with 'Test Suite starting' msg) - Closed