- Bug
- Resolution: Done
- Major
- None
- None
- 3
- False
- False
Description of problem:
We are observing intermittent failures of Tekton PipelineRuns where the PipelineRun transitions to Failed with the message:
Tasks Completed: 4 (Failed: 1, Cancelled 0), Skipped: 14
even though all completed TaskRuns show Succeeded, and no task is explicitly cancelled.
This issue has been reproduced multiple times in the openstack-tenant namespace under high concurrency and appears to be a controller-side race condition rather than a user configuration or task failure.
Prerequisites (if any, like setup, operators/versions):
Actual results:
- PipelineRun transitions to Failed while a TaskRun is still Running
- Remaining tasks are skipped automatically
- Failure reason does not correspond to any real TaskRun failure
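The mismatch described above can be checked mechanically. The following is a minimal sketch, assuming the summary message has the form quoted in this report and that the TaskRun states have already been collected as plain strings such as "Succeeded", "Running", or "Failed". The helper names `parse_summary` and `is_phantom_failure` are hypothetical and not part of Tekton.

```python
import re

# Matches the summary message quoted in this report, e.g.
# "Tasks Completed: 4 (Failed: 1, Cancelled 0), Skipped: 14".
# Hypothetical helper for triage; not part of Tekton.
_SUMMARY = re.compile(
    r"Tasks Completed: (\d+) \(Failed: (\d+), Cancelled:? ?(\d+)\), Skipped: (\d+)"
)


def parse_summary(message):
    """Return the counters from the summary message, or None if it does not match."""
    m = _SUMMARY.search(message)
    if m is None:
        return None
    completed, failed, cancelled, skipped = (int(g) for g in m.groups())
    return {
        "completed": completed,
        "failed": failed,
        "cancelled": cancelled,
        "skipped": skipped,
    }


def is_phantom_failure(message, taskrun_states):
    """True when the message reports more failures than the observed TaskRun states show."""
    summary = parse_summary(message)
    if summary is None:
        return False
    observed_failed = sum(1 for state in taskrun_states if state == "Failed")
    return summary["failed"] > observed_failed
```

Running this against the message and the TaskRun states captured at the time of failure would flag affected PipelineRuns without having to inspect each one by hand.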
Expected results:
- PipelineRun should remain in Running state while any TaskRun is still running
- Downstream tasks should not be skipped unless:
  - A TaskRun has explicitly failed
  - The PipelineRun is explicitly cancelled or timed out
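The expected behaviour can be written down as a small predicate. This is a sketch of the rule as stated in this report only, not of the Tekton controller's actual logic. The function names and the string states are assumptions made for illustration.

```python
def may_skip_downstream(taskrun_states, cancelled=False, timed_out=False):
    """Downstream tasks may be skipped only on an explicit failure,
    an explicit cancellation, or a timeout (per the expected results)."""
    return cancelled or timed_out or any(s == "Failed" for s in taskrun_states)


def violates_expectation(pipelinerun_state, taskrun_states,
                         cancelled=False, timed_out=False):
    """True for the case reported here: the PipelineRun is Failed while a
    TaskRun is still Running and nothing justifies stopping the run."""
    still_running = any(s == "Running" for s in taskrun_states)
    return (
        pipelinerun_state == "Failed"
        and still_running
        and not may_skip_downstream(taskrun_states, cancelled, timed_out)
    )
```

For the observed case, `violates_expectation("Failed", ["Succeeded", "Succeeded", "Succeeded", "Running"])` evaluates to true, which is the condition an automated regression check could assert never happens.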
Reproducibility (Always/Intermittent/Only Once):
Intermittent. Reproduced multiple times under high concurrency.
Suspected root cause:
A race condition in the Tekton Pipelines controller during reconciliation under high concurrency, where a transient or stale TaskRun state is incorrectly interpreted as a failure, triggering PipelineRun stopping logic prematurely.
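To illustrate the kind of misinterpretation suspected here, the sketch below classifies a TaskRun from its `Succeeded` condition. It assumes the usual condition convention, where status "True" means succeeded, "False" means not succeeded, and "Unknown" means still in progress. A missing or unknown condition is treated as still running rather than failed. The function `classify_taskrun` is hypothetical and is not the controller's code. It also simplifies by mapping every "False" status to "Failed" without inspecting the reason.

```python
def classify_taskrun(conditions):
    """Classify a TaskRun from its list of condition dicts.

    Only an explicit status of "False" on the Succeeded condition counts as
    a failure; an absent, empty, or "Unknown" condition is treated as still
    running so that a transient or stale read is never taken as a failure.
    """
    for condition in conditions or []:
        if condition.get("type") != "Succeeded":
            continue
        status = condition.get("status")
        if status == "True":
            return "Succeeded"
        if status == "False":
            return "Failed"
        return "Running"
    # No Succeeded condition observed yet (for example, status not populated).
    return "Running"
```

If the root cause is confirmed, a regression test along these lines could feed the reconciler an empty or partially populated TaskRun status and assert that the PipelineRun does not enter its stopping path.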
Acceptance criteria:
Definition of Done:
Build Details:
Additional info (Such as Logs, Screenshots, etc):
Slack Thread - https://redhat-internal.slack.com/archives/C04PZ7H0VA8/p1768478363451539