Issue Type: Bug
Resolution: Done-Errata
Priority: Normal
Target Version: 4.17
Impact: Quality / Stability / Reliability
Severity: Moderate
Sprint: MCO Sprint 259
Status: Done
Release Note Type: Release Note Not Required
This is a clone of issue OCPBUGS-38632. The following is the description of the original issue:
—
Description of problem:
When we add a userCA bundle to a cluster that has MCPs with yum-based RHEL nodes, the MCPs containing RHEL nodes become degraded.
Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.17.0-0.nightly-2024-08-18-131731   True        False         101m    Cluster version is 4.17.0-0.nightly-2024-08-18-131731
How reproducible:
Always
In CI we found this issue while running test case "[sig-mco] MCO security Author:sregidor-NonHyperShiftHOST-High-67660-MCS generates ignition configs with certs [Disruptive] [Serial]" in the prow job periodic-ci-openshift-openshift-tests-private-release-4.17-amd64-nightly-gcp-ipi-workers-rhel8-fips-f28-destructive.
Steps to Reproduce:
1. Create a certificate
$ openssl genrsa -out privateKey.pem 4096
$ openssl req -new -x509 -nodes -days 3600 -key privateKey.pem -out ca-bundle.crt -subj "/OU=MCO qe/CN=example.com"
2. Add the certificate to the cluster
# Create the configmap with the certificate
$ oc create cm cm-test-cert -n openshift-config --from-file=ca-bundle.crt
configmap/cm-test-cert created
# Configure the proxy with the new test certificate
$ oc patch proxy/cluster --type merge -p '{"spec": {"trustedCA": {"name": "cm-test-cert"}}}'
proxy.config.openshift.io/cluster patched
3. Check the MCP status and the MCD logs
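The checks in step 3 can be done roughly as follows (a sketch; the MCD pod name is a placeholder, and the container name assumes the default machine-config-daemon DaemonSet layout):
# Check the MCP status
$ oc get mcp
# Find the MCD pod running on the affected node and read its logs
$ oc get pods -n openshift-machine-config-operator -o wide | grep machine-config-daemon
$ oc logs -n openshift-machine-config-operator <machine-config-daemon-pod> -c machine-config-daemon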
Actual results:
The worker MCP is degraded:
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-3251b00997d5f49171e70f7cf9b64776   True      False      False      3              3                   3                     0                      130m
worker   rendered-worker-05e7664fa4758a39f13a2b57708807f7   False     True       True       3              0                   0                     1                      130m
We can see this message in the MCP status conditions:
- lastTransitionTime: "2024-08-19T11:00:34Z"
message: 'Node ci-op-jr7hwqkk-48b44-6mcjk-rhel-1 is reporting: "could not apply
update: restarting coreos-update-ca-trust.service service failed. Error: error
running systemctl restart coreos-update-ca-trust.service: Failed to restart
coreos-update-ca-trust.service: Unit coreos-update-ca-trust.service not found.\n:
exit status 5"'
reason: 1 nodes are reporting degraded status on sync
status: "True"
type: NodeDegraded
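For reference, the NodeDegraded condition above can be pulled directly with a jsonpath query along these lines (a sketch, assuming the degraded pool is the worker pool shown above):
$ oc get mcp worker -o jsonpath='{.status.conditions[?(@.type=="NodeDegraded")].message}'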
In the MCD logs we can see:
I0819 11:38:55.089991 7239 update.go:2665] Removing SIGTERM protection
E0819 11:38:55.090067 7239 writer.go:226] Marking Degraded due to: could not apply update: restarting coreos-update-ca-trust.service service failed. Error: error running systemctl restart coreos-update-ca-trust.service: Failed to restart coreos-update-ca-trust.service: Unit coreos-update-ca-trust.service not found.
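The missing unit can be confirmed on the affected RHEL node with something like the following (a sketch; the node name is the one reported in the MCP condition, and oc debug access to the node is assumed):
$ oc debug node/ci-op-jr7hwqkk-48b44-6mcjk-rhel-1 -- chroot /host systemctl list-unit-files 'coreos-update-ca-trust*'
# Expected on a RHEL (yum-based) worker: the unit is not listed, which matches the "Unit not found" error above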
Expected results:
No degradation should happen. The certificate should be added without problems.
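When the update applies cleanly, the user CA bundle should end up in the node's trust store. A way to spot-check this is sketched below; the anchor file path is the one the MCO typically writes for the user CA bundle and the node name is a placeholder, so treat both as assumptions:
$ oc debug node/<node-name> -- chroot /host cat /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt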
Additional info:
- clones: OCPBUGS-38632 MCPs with RHEL nodes are degraded when a userCA bundle is added to the cluster (Closed)
- is blocked by: OCPBUGS-38632 MCPs with RHEL nodes are degraded when a userCA bundle is added to the cluster (Closed)
- links to: RHBA-2024:7922 OpenShift Container Platform 4.17.z bug fix update