Type: Bug
Resolution: Done
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: Pipelines as Code
Labels:
- konflux

Story Points:
2
Blocked:
False
Blocked Reason:
None
Ready:
False
Release Note Text:
Before this update, if Pipelines as Code successfully created a PipelineRun but then failed during the subsequent patch operation due to a transient Kubernetes API error, it would incorrectly report this as a fatal CI failure and create a "failed" check run. This behavior has been fixed. The controller will now log the patch error and retry the operation.
Release Note Type:
Bug Fix
Release Note Status:
Done
Git Pull Request:
https://github.com/openshift-pipelines/pipelines-as-code/pull/2338
Intelligence Requested:
Market:

Sprint:
Pipelines Sprint CrookShank 43

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Problem Statement: After the PaC controller successfully creates a PipelineRun, it immediately attempts to patch that PipelineRun (e.g., to add labels or annotations). If this patch operation fails due to a transient Kubernetes API server issue (e.g., a temporary network error, API server unavailability, or an admission webhook failure), the PaC controller incorrectly treats the patch failure as a fatal CI failure.

This causes PaC to immediately post a failed check run (e.g., on GitHub) or a failed commit status (e.g., on GitLab, as seen in the "Konflux Production Internal" check).

This behavior is incorrect. A failure in the controller's ability to patch metadata should not be reported to the user as a failure of their CI job, especially since the PipelineRun object was successfully created and is likely executing (or about to execute) in the cluster.

Steps to Reproduce

A user triggers a PipelineRun via a pull request or push event.

The PaC controller successfully creates the PipelineRun resource in the cluster.

The PaC controller immediately attempts to patch the newly created PipelineRun to add metadata (e.g., pipelinesascode.tekton.dev/state: "started").

This patch call fails for any transient reason (e.g., a momentary API server disconnection, a webhook timeout, or any other Kubernetes server-side issue).

Observe the commit status on the Git provider.

Actual Result

A failed check run is created on the Git provider for the commit.

This gives the user a false failure signal, making them believe their PipelineRun or code is broken.

Expected Result

The controller should log the patch failure as an error (e.g., level=error msg="failed to patch pipelinerun XYZ: ...").

The controller should re-enqueue the PipelineRun and retry the patch operation according to its standard reconciliation loop.

No failed check run should be created. The PipelineRun should be allowed to run, and its actual outcome (success or failure) should be the only thing reported as a check run.
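
The expected behavior above can be sketched as a small reconcile loop: a patch error is logged and returned so the item is re-enqueued, and nothing on that path posts a failed check run. This is a simplified model under stated assumptions, not the actual PaC implementation. The `reconciler` type, the slice-based queue, and `maxAttempts` (a stand-in for whatever retry policy the controller already has) are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errTransient stands in for a transient API server error (hypothetical).
var errTransient = errors.New("transient API error")

// reconciler is a hypothetical, simplified stand-in for the controller.
type reconciler struct {
	patch func(name string) error // patches metadata on the PipelineRun
	logs  []string                // collected error log lines
}

// reconcile attempts the patch once. On failure it logs the error and
// returns it so the caller can re-enqueue the item. It never reports a
// failed check run.
func (r *reconciler) reconcile(name string) error {
	if err := r.patch(name); err != nil {
		msg := "failed to patch pipelinerun " + name + ": " + err.Error()
		r.logs = append(r.logs, fmt.Sprintf("level=error msg=%q", msg))
		return err
	}
	return nil
}

// run drains the queue and re-enqueues items whose reconcile failed,
// up to maxAttempts per item. It returns the attempts made per item.
func (r *reconciler) run(queue []string, maxAttempts int) map[string]int {
	attempts := map[string]int{}
	for len(queue) > 0 {
		name := queue[0]
		queue = queue[1:]
		attempts[name]++
		if err := r.reconcile(name); err != nil && attempts[name] < maxAttempts {
			queue = append(queue, name)
		}
	}
	return attempts
}

func main() {
	failures := 2 // the patch fails twice, then succeeds
	r := &reconciler{patch: func(string) error {
		if failures > 0 {
			failures--
			return errTransient
		}
		return nil
	}}
	attempts := r.run([]string{"my-pipelinerun"}, 5)
	fmt.Println("attempts:", attempts["my-pipelinerun"], "logged errors:", len(r.logs))
}
```

With two injected failures, the run takes three attempts and logs two errors. A real controller would rely on its work queue and backoff rather than the in-memory slice used here.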

Out of Scope

This ticket is not to fix the underlying Kubernetes API server issues that may be causing the patch to fail. The fix is to make the PaC controller resilient to those failures.

This ticket does not involve changing the controller's retry or backoff logic, only ensuring that a patch failure correctly uses the existing retry mechanism instead of being treated as a fatal error.

This ticket does not change the logic for how PaC reports PipelineRuns that run and then genuinely fail (e.g., a test step fails). That remains unchanged.

Acceptance Criteria

When a PipelineRun is successfully created, a subsequent failure to patch it must not result in a "failed" check run being sent to the Git provider.

The final check run status (e.g., "success" or "failure") reported to the Git provider must reflect the actual terminal state of the PipelineRun itself, not the transient patch error.

Creating a "failed" check run is still the correct behavior if the PipelineRun fails to be created in the first place (e.g., a validation error). This functionality must not be broken.
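
The three criteria above amount to a small decision table, which the following sketch encodes so it can be checked in a table-driven test. The `event` type and the `conclusion` function are hypothetical names chosen for illustration, and the string values "success" and "failure" mirror the wording of this ticket rather than any specific Git provider API.

```go
package main

import "fmt"

// event is a hypothetical model of what happened to a PipelineRun.
type event struct {
	createErr bool   // creation was rejected, e.g. by a validation error
	patchErr  bool   // the metadata patch failed after a successful creation
	terminal  string // "", "success", or "failure": terminal state of the run
}

// conclusion returns the check run conclusion to report and whether
// anything should be reported at this point. patchErr is deliberately
// ignored: a patch failure alone never produces a failed check run.
func conclusion(e event) (string, bool) {
	switch {
	case e.createErr:
		return "failure", true // the run never existed: a genuine failure
	case e.terminal != "":
		return e.terminal, true // report the run's actual outcome
	default:
		return "", false // still running or patch pending: nothing to report
	}
}

func main() {
	cases := []event{
		{createErr: true},
		{patchErr: true},
		{patchErr: true, terminal: "success"},
		{terminal: "failure"},
	}
	for _, c := range cases {
		got, report := conclusion(c)
		fmt.Printf("%+v -> report=%v conclusion=%q\n", c, report, got)
	}
}
```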

Assignee: Zaki Shaikh

Reporter: Zaki Shaikh

Votes: 0

Watchers: 2

Created: 2025/11/18 9:59 AM

Updated: 2025/12/01 1:49 PM

Resolved: 2025/12/01 1:49 PM
