- Bug
- Resolution: Unresolved
- Normal
- None
- 4.17
- Moderate
- None
- 5
- MCO Sprint 259
- 1
- False
- Release Note Not Required
- In Progress
Description of problem:
When we add a userCA bundle to a cluster that has MCPs with yum-based RHEL nodes, the MCPs that contain RHEL nodes become degraded.
Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.17.0-0.nightly-2024-08-18-131731   True        False         101m    Cluster version is 4.17.0-0.nightly-2024-08-18-131731
How reproducible:
Always
In CI, we found this issue while running the test case "[sig-mco] MCO security Author:sregidor-NonHyperShiftHOST-High-67660-MCS generates ignition configs with certs [Disruptive] [Serial]" on the prow job periodic-ci-openshift-openshift-tests-private-release-4.17-amd64-nightly-gcp-ipi-workers-rhel8-fips-f28-destructive
Steps to Reproduce:
1. Create a certificate
$ openssl genrsa -out privateKey.pem 4096
$ openssl req -new -x509 -nodes -days 3600 -key privateKey.pem -out ca-bundle.crt -subj "/OU=MCO qe/CN=example.com"
2. Add the certificate to the cluster
# Create the configmap with the certificate
$ oc create cm cm-test-cert -n openshift-config --from-file=ca-bundle.crt
configmap/cm-test-cert created
# Configure the proxy with the new test certificate
$ oc patch proxy/cluster --type merge -p '{"spec": {"trustedCA": {"name": "cm-test-cert"}}}'
proxy.config.openshift.io/cluster patched
3. Check the MCP status and the MCD logs
Actual results:
The MCP is degraded
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-3251b00997d5f49171e70f7cf9b64776   True      False      False      3              3                   3                     0                      130m
worker   rendered-worker-05e7664fa4758a39f13a2b57708807f7   False     True       True       3              0                   0                     1                      130m
We can see this message in the MCP
- lastTransitionTime: "2024-08-19T11:00:34Z"
  message: 'Node ci-op-jr7hwqkk-48b44-6mcjk-rhel-1 is reporting: "could not apply update: restarting coreos-update-ca-trust.service service failed. Error: error running systemctl restart coreos-update-ca-trust.service: Failed to restart coreos-update-ca-trust.service: Unit coreos-update-ca-trust.service not found.\n: exit status 5"'
  reason: 1 nodes are reporting degraded status on sync
  status: "True"
  type: NodeDegraded
In the MCD logs we can see:
I0819 11:38:55.089991 7239 update.go:2665] Removing SIGTERM protection
E0819 11:38:55.090067 7239 writer.go:226] Marking Degraded due to: could not apply update: restarting coreos-update-ca-trust.service service failed. Error: error running systemctl restart coreos-update-ca-trust.service: Failed to restart coreos-update-ca-trust.service: Unit coreos-update-ca-trust.service not found.
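When triaging this on other nodes, it can help to filter the MCD log for units that systemd reports as missing. The following is a minimal sketch and not part of the original report: `missing_units` is a hypothetical helper name, and the pod name in the usage comment is a placeholder that must be replaced with the MCD pod running on the affected RHEL node.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MCO): read MCD log text on stdin and
# print each systemd unit that is reported as "not found", once per unit.
missing_units() {
  grep -oE 'Unit [^ ]+\.service not found' | awk '{print $2}' | sort -u
}

# Assumed usage against a live cluster (pod name is a placeholder):
#   oc logs -n openshift-machine-config-operator <machine-config-daemon-pod> \
#     -c machine-config-daemon | missing_units
```

For the log lines shown above, the helper would print only coreos-update-ca-trust.service, which matches the unit named in the NodeDegraded condition.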
Expected results:
No degradation should happen. The certificate should be added without problems.
Additional info:
- blocks
  - OCPBUGS-41686 MCPs with RHEL nodes are degraded when a userCA bundle is added to the cluster (Closed)
- is cloned by
  - OCPBUGS-41686 MCPs with RHEL nodes are degraded when a userCA bundle is added to the cluster (Closed)
- links to
  - RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update