- Bug
- Resolution: Unresolved
- Normal
- Pipelines 1.18.0
- 3
- False
- False
- None, upstream only
- Release Note Not Required
Description of problem:
The E2E test TestGithubPushRequestGitOpsCommentCancel in the GitHub push/cancel flow was experiencing intermittent failures due to timing and race conditions.
Root Cause:
- The Repository status was checked after the Repository CR had already been deleted during test teardown
- An overly broad regex pattern could match log lines for the wrong PipelineRun
- A race condition occurred when the PipelineRun completed before the cancel comment was processed
Workaround: Already fixed in commit cfdd3428e
Prerequisites (if any, like setup, operators/versions):
- E2E test environment with GitHub integration
- Pipelines-as-Code controller running
- GitHub App configured for E2E tests
Steps to Reproduce
- Run make test-e2e TEST_ARGS="-run TestGithubPushRequestGitOpsCommentCancel"
- Observe intermittent failures
- Error message: "neither a cancelled pipelinerun in repo status or a request to skip the cancellation in the controller log was found"
Actual results:
Test fails intermittently with error about not finding cancelled PipelineRun status or skip message in logs. Failure rate depends on timing of PipelineRun completion vs cancellation comment processing.
Expected results:
Test should pass consistently regardless of race condition timing, properly handling both scenarios:
- PipelineRun gets cancelled successfully
- PipelineRun completes before cancel is processed (skip message logged)
Reproducibility (Always/Intermittent/Only Once):
Intermittent - Depends on race condition timing
Acceptance criteria:
- Test waits for specific PipelineRun (not generic match)
- Repository status verified before test teardown
- Regex pattern specific to actual PipelineRun name/namespace
- Test handles both fast and normal cancellation scenarios
- Enhanced logging for debugging
Definition of Done:
✓ Fix committed in cfdd3428e
✓ Test now tracks specific PipelineRun via annotations
✓ Uses UntilPipelineRunHasReason for precise waiting
✓ Regex pattern includes actual namespace/PR name
✓ Repository status checked before cleanup
✓ Logging enhanced with original PipelineRun names
Build Details:
- Branch: investigaste-e2e-failure
- Commit: cfdd3428e
- Files modified:
- test/github_push_retest_test.go (+56/-26 lines)
- pkg/pipelineascode/cancel_pipelineruns.go (+3/-1 lines)
Additional info (Such as Logs, Screenshots, etc):
Investigation Summary:
The test had two validation paths: check Repository status for Cancelled condition OR find skip message in logs. However:
- The Repository CR was deleted during NSTearDown before the status check ran
- The regex pattern .pipelinerun.*skipping cancelling pipelinerun.*on-push.*already done. matched ANY PipelineRun whose log line contained "on-push"
- Nothing verified that the PipelineRun being checked was the one created by this test
Fix Details:
- Identify specific PipelineRun using keys.OriginalPRName and keys.EventType annotations
- Wait for cancellation using existing UntilPipelineRunHasReason helper
- Use precise regex: .skipping cancelling pipelinerun %s/%s.*already done. with actual namespace/name
- Verify status BEFORE teardown
- Increase log capture from 20 to 100 lines
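The first step above, picking out one PipelineRun by its annotations, might look roughly like the following. This is a minimal sketch under stated assumptions: the `pipelineRun` struct is a hypothetical stand-in for the Tekton type, `findTarget` is an invented helper, and the annotation key strings are assumed values for keys.OriginalPRName and keys.EventType that should be checked against the repository.

```go
package main

import "fmt"

// Assumed annotation keys; verify against keys.OriginalPRName and
// keys.EventType in the Pipelines-as-Code source.
const (
	annOriginalPRName = "pipelinesascode.tekton.dev/original-prname"
	annEventType      = "pipelinesascode.tekton.dev/event-type"
)

// pipelineRun is a hypothetical, trimmed-down stand-in for the real
// Tekton PipelineRun object.
type pipelineRun struct {
	Namespace   string
	Name        string
	Annotations map[string]string
}

// findTarget returns the first PipelineRun whose annotations match both
// the original PipelineRun name and the event type.
func findTarget(prs []pipelineRun, originalName, eventType string) (pipelineRun, bool) {
	for _, pr := range prs {
		if pr.Annotations[annOriginalPRName] == originalName &&
			pr.Annotations[annEventType] == eventType {
			return pr, true
		}
	}
	return pipelineRun{}, false
}

func main() {
	// Hypothetical data for illustration only.
	prs := []pipelineRun{
		{Namespace: "test-ns", Name: "other-on-push-xyz", Annotations: map[string]string{
			annOriginalPRName: "other-on-push", annEventType: "push"}},
		{Namespace: "test-ns", Name: "pr-on-push-abc", Annotations: map[string]string{
			annOriginalPRName: "pr-on-push", annEventType: "push"}},
	}

	if pr, ok := findTarget(prs, "pr-on-push", "push"); ok {
		fmt.Printf("%s/%s\n", pr.Namespace, pr.Name)
	}
}
```

The namespace and name of the selected PipelineRun can then be used both for the wait on its cancellation reason and for building the precise log regex.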
Related commit: cfdd3428e - "test: Fix flaky TestGithubPushRequestGitOpsCommentCancel E2E test"