Loading...

XML

Word

Printable

Type: Bug
Resolution: Unresolved
Priority: Minor
Fix Version/s: None
Affects Version/s: 4.19
Component/s: HyperShift
Labels:
- issue-for-agent
- triaged

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

Hide

None

Show
None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

The control-plane-pki-operator incorrectly reports certificate revoked condition.  it does not wait for each individual kube-apiserver pod to reload their CA bundles, therefore a certificate may be used for a short period of time, even after the pki-operator reports it unusable.  

Typically this is on the order of ~minutes.

Version-Release number of selected component (if applicable):

How reproducible:

    Race condition, so intermittently

Steps to Reproduce:

    1. Create an HA control plane
    2. Request a breakglass certificate,
    3. Create a certificate revocation request, wait until the CRR condition for PreviousCertificateRevoked is true
    4. Attempt to use the certificate to authenticate to the KAS, it will intermittently succeed until the CA bundles reload on the KAS, depending on the routing.

Actual results:

    Certificate is valid intermittently until the CA bundles are reloaded on the KAS

Expected results:

    When the status is reported, the certificate is no longer valid across all KAS instances.

Additional info:

    Logs for this scenario: https://drive.google.com/file/d/1_9U5lQh8Ejb36VqFhUbTSIYqusOfueqZ/view?usp=drive_link

Acceptance criteria:
Certificate revoked condition is true when:
a) enumerate the API servers
b) create kubeconfig for each
c) if the connection matches the existing logic, move on to the next one
d) if not, reuse existing logic for requeue

links to

openshift/hypershift#6926: OCPBUGS-62177: Fix certificate revocation validation across all KAS pods

Assignee:: Salvatore Dario Minonne

Reporter:: Ben Vesel

Need Info From:: None

Contributors:: None

QA Contact:: Jie Zhao

Doc Contact:: None

Votes:: 0 Vote for this issue

Watchers:: 4 Start watching this issue

Created:: 2025/09/24 5:55 PM

Updated:: 2025/10/06 2:21 PM

Details

Description

Attachments

Issue Links

Easy Agile Planning Poker

Activity

People

Dates