Bug
Resolution: Unresolved
Critical
Important
Description of problem:
We have observed multiple occurrences where the ACM policy responsible for syncing ingress certificates to data clusters has failed to update the certificate.
This has resulted in expired certificates, causing login issues on OpenShift console instances.
- OHSS-42049
- OHSS-42076
Version-Release number of selected component (if applicable):
One cluster is on 4.14 and another is on 4.17, but we think this is related to ACM for HCP.
How reproducible:
Steps to Reproduce:
1. Confirm that the default ingress controller secret has expired on the Hosted Cluster:
$ ocm backplane elevate "OHSS-42076" -- get secret -n openshift-ingress 2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret -ojson | jq -r '.data."tls.crt"' | base64 --decode | openssl x509 -enddate -noout
notAfter=Mar 12 13:49:20 2025 GMT
2. Verify that the Service Cluster holds an updated certificate:
$ ocm backplane elevate "OHSS-42076" -- get secret 2875r5ve1gnl0r7fckl3po6i32rbeuq8 -n openshift-acm-policies -ojson | jq -r '.data."tls.crt"' | base64 --decode | openssl x509 -enddate -noout
notAfter=May 11 12:52:22 2025 GMT
3. We expected the ACM policy to copy the updated certificate secret from the Service Cluster to the Management Cluster, but this does not appear to have happened for the last 3 months.
Checking the related ACM policy status returns the following:
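The comparison made in steps 1 and 2 can be sketched as a small helper. This is only an illustration, not part of any existing tooling: it assumes GNU `date` (for `date -d`), and the function names `enddate_to_epoch` and `compare_enddates` are hypothetical. It takes the two `notAfter=` lines printed by `openssl x509 -enddate -noout` and reports whether the downstream copy is older than the upstream one.

```shell
# Sketch only: compare two "notAfter=..." lines from `openssl x509 -enddate -noout`.
# Assumes GNU date; the function names are hypothetical.

# Convert a "notAfter=<date>" line to seconds since the epoch.
enddate_to_epoch() {
  date -u -d "${1#notAfter=}" +%s
}

# $1: notAfter line of the downstream (synced) secret
# $2: notAfter line of the upstream (source) secret
# Prints "stale" if the downstream cert expires before the upstream one,
# otherwise "in-sync".
compare_enddates() {
  local downstream upstream
  downstream=$(enddate_to_epoch "$1") || return 1
  upstream=$(enddate_to_epoch "$2") || return 1
  if [ "$downstream" -lt "$upstream" ]; then
    echo "stale"
  else
    echo "in-sync"
  fi
}

# Values observed in this report:
compare_enddates "notAfter=Mar 12 13:49:20 2025 GMT" \
                 "notAfter=May 11 12:52:22 2025 GMT"   # prints: stale
```

A "stale" result only says the two copies differ in expiry; it does not by itself show which component failed to propagate the secret.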
$ oc get policies -n 2875r5ve1gnl0r7fckl3po6i32rbeuq8 openshift-acm-policies.rosa-ingress-certificate-policies -o yaml
...omitted...
  - compliant: Compliant
    history:
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.1810882dbb60ccbf
      lastTimestamp: "2024-12-12T20:33:06Z"
      message: Compliant; notification - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] found as specified in namespace openshift-ingress
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.18107be499fd9bf0
      lastTimestamp: "2024-12-12T16:47:58Z"
      message: Compliant; notification - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] found as specified in namespace openshift-ingress
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.181075570e0fdba2
      lastTimestamp: "2024-12-12T14:47:53Z"
      message: Compliant; notification - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] was updated successfully in namespace openshift-ingress
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.181075570d0963de
      lastTimestamp: "2024-12-12T14:47:53Z"
      message: NonCompliant; violation - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] found but not as specified in namespace openshift-ingress
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.180752c49877b0ed
      lastTimestamp: "2024-11-12T20:33:06Z"
      message: Compliant; notification - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] found as specified in namespace openshift-ingress
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.17fe1d5c187e73c7
      lastTimestamp: "2024-10-13T20:33:08Z"
      message: Compliant; notification - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] found as specified in namespace openshift-ingress
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.17fe143013808924
      lastTimestamp: "2024-10-13T17:45:03Z"
      message: Compliant; notification - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] found as specified in namespace openshift-ingress
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.17fe0da27b55f555
      lastTimestamp: "2024-10-13T15:44:58Z"
      message: Compliant; notification - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] was updated successfully in namespace openshift-ingress
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.17fe0da27a423840
      lastTimestamp: "2024-10-13T15:44:58Z"
      message: NonCompliant; violation - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] found but not as specified in namespace openshift-ingress
    - eventName: openshift-acm-policies.rosa-ingress-certificate-policies.17f4e7f37b7c7b48
      lastTimestamp: "2024-09-13T20:33:10Z"
      message: Compliant; notification - secrets [2875r5ve1gnl0r7fckl3po6i32rbeuq8-primary-cert-bundle-secret] found as specified in namespace openshift-ingress
    templateMeta:
      creationTimestamp: null
      name: rosa-ingress-certificate-policies
From the event history, the policy appears to sync roughly every month, but we see no events after "2024-12-12T20:33:06Z".
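To check other clusters for the same gap, the newest `lastTimestamp` in the policy status can be extracted and its age computed. The following is a rough sketch under stated assumptions: GNU `date` and `grep -o` are available, timestamps appear in the quoted `lastTimestamp: "..."` form shown in this report, and the function names `latest_event` and `days_between` are hypothetical.

```shell
# Sketch only: find the newest lastTimestamp in policy YAML and measure a gap.
# Assumes GNU date and grep; the function names are hypothetical.

# stdin: policy YAML; stdout: the newest lastTimestamp value.
# ISO-8601 UTC timestamps sort correctly as plain strings.
latest_event() {
  grep -o 'lastTimestamp: *"[^"]*"' | cut -d'"' -f2 | sort | tail -n 1
}

# $1: earlier timestamp, $2: later timestamp; prints whole days between them.
days_between() {
  local from to
  from=$(date -u -d "$1" +%s) || return 1
  to=$(date -u -d "$2" +%s) || return 1
  echo $(( (to - from) / 86400 ))
}

# Possible usage, reusing the command from this report:
#   last=$(oc get policies -n 2875r5ve1gnl0r7fckl3po6i32rbeuq8 \
#     openshift-acm-policies.rosa-ingress-certificate-policies -o yaml | latest_event)
#   days_between "$last" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
```

Given the roughly monthly cadence seen in the history, a result well above 30 days would suggest the policy has stopped producing events for that cluster.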
Expected results:
- ACM should automatically push the updated certificate from the Service Cluster to the Management Cluster, then sync it to the Hosted Cluster every month as expected.
- Ingress certificates should not expire
We are worried that many other clusters might be affected as well, with ingress certificates that were not renewed. We need help from the HCP ACM team to identify the potential bug and provide a solution or temporary workaround that we can apply to avoid severe incidents.
Thank you in advance.
Additional info:
Slack: https://redhat-internal.slack.com/archives/C04EUL1DRHC/p1741708441933359
Depends on: ACM-17811 Investigate a potential bug in MCE when a hosted cluster cert is regenerated (Closed)