Type: Bug
Resolution: Done-Errata
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.17
Component/s: Machine Config Operator
Labels:
- mco-triaged
- pre-merge-tested

Severity:
Moderate
Regression:
None
Story Points:
5
Sprint:
MCO Sprint 259
sprint_count:
1
Blocked:
False
Blocked Reason:

None
Release Note Type:
Release Note Not Required
Release Note Status:
In Progress
Target Version:

4.18.0

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:

When a userCA bundle is added to a cluster that has MCPs with yum-based RHEL nodes, the MCPs containing RHEL nodes become degraded.

Version-Release number of selected component (if applicable):

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.17.0-0.nightly-2024-08-18-131731   True        False         101m    Cluster version is 4.17.0-0.nightly-2024-08-18-131731

How reproducible:

Always

In CI, we hit this issue while running the test case "[sig-mco] MCO security Author:sregidor-NonHyperShiftHOST-High-67660-MCS generates ignition configs with certs [Disruptive] [Serial]" on the Prow job periodic-ci-openshift-openshift-tests-private-release-4.17-amd64-nightly-gcp-ipi-workers-rhel8-fips-f28-destructive.

Steps to Reproduce:

    1. Create a certificate

        $ openssl genrsa -out privateKey.pem 4096
        $ openssl req -new -x509 -nodes -days 3600 -key privateKey.pem -out ca-bundle.crt -subj "/OU=MCO qe/CN=example.com"

    2. Add the certificate to the cluster

        # Create the configmap with the certificate
        $ oc create cm cm-test-cert -n openshift-config --from-file=ca-bundle.crt
        configmap/cm-test-cert created

        # Configure the proxy with the new test certificate
        $ oc patch proxy/cluster --type merge -p '{"spec": {"trustedCA": {"name": "cm-test-cert"}}}'
        proxy.config.openshift.io/cluster patched

    3. Check the MCP status and the MCD logs
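
Step 3 can be scripted. The sketch below is an illustration and not part of the original report: `degraded_pools` is a hypothetical helper that reads the table printed by `oc get mcp` on stdin and prints the name of every pool whose DEGRADED column is True. It assumes the default column layout shown in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc or the MCO): list degraded pools.
# Reads `oc get mcp` table output on stdin.
degraded_pools() {
    awk '
        # Header row: find which column is labelled DEGRADED.
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "DEGRADED") col = i
            next
        }
        # Data rows: print the pool name when that column reads True.
        col && $col == "True" { print $1 }
    '
}
```

Piping the pool listing through it, as in `oc get mcp | degraded_pools`, should print only the pools that need attention. The MCD logs can then be read with `oc logs -n openshift-machine-config-operator <machine-config-daemon-pod> -c machine-config-daemon`, where the pod name is a placeholder for the daemon pod running on the affected node.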

Actual results:


The MCP is degraded:

$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-3251b00997d5f49171e70f7cf9b64776   True      False      False      3              3                   3                     0                      130m
worker   rendered-worker-05e7664fa4758a39f13a2b57708807f7   False     True       True       3              0                   0                     1                      130m

We can see this message in the MCP:

  - lastTransitionTime: "2024-08-19T11:00:34Z"
    message: 'Node ci-op-jr7hwqkk-48b44-6mcjk-rhel-1 is reporting: "could not apply
      update: restarting coreos-update-ca-trust.service service failed. Error: error
      running systemctl restart coreos-update-ca-trust.service: Failed to restart
      coreos-update-ca-trust.service: Unit coreos-update-ca-trust.service not found.\n:
      exit status 5"'
    reason: 1 nodes are reporting degraded status on sync
    status: "True"
    type: NodeDegraded

In the MCD logs we can see:

I0819 11:38:55.089991    7239 update.go:2665] Removing SIGTERM protection
E0819 11:38:55.090067    7239 writer.go:226] Marking Degraded due to: could not apply update: restarting coreos-update-ca-trust.service service failed. Error: error running systemctl restart coreos-update-ca-trust.service: Failed to restart coreos-update-ca-trust.service: Unit coreos-update-ca-trust.service not found.
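
The log shows an unconditional `systemctl restart` failing because the unit does not exist on the RHEL node. As an illustration only, and not the fix that was merged for this bug, the sketch below shows one way to guard such a restart in shell. `restart_if_present` is a hypothetical helper; it relies on `systemctl cat UNIT` exiting non-zero when systemd cannot find the unit.

```shell
#!/bin/sh
# Hypothetical helper: restart a systemd unit only when systemd knows about it.
restart_if_present() {
    unit="$1"
    # `systemctl cat` fails when no unit file is found for the given name.
    if systemctl cat "$unit" >/dev/null 2>&1; then
        systemctl restart "$unit"
    else
        # Report the skip on stderr and treat it as success.
        echo "skipping restart: $unit not found" >&2
        return 0
    fi
}
```

Running `restart_if_present coreos-update-ca-trust.service` by hand on an affected node is a quick way to confirm that the unit is missing there, assuming shell access to the node.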

Expected results:

No degradation should happen. The certificate should be added without problems.

Additional info:

blocks: OCPBUGS-41686 MCPs with RHEL nodes are degraded when a userCA bundle is added to the cluster (Closed)

is cloned by: OCPBUGS-41686 MCPs with RHEL nodes are degraded when a userCA bundle is added to the cluster (Closed)

links to: openshift/machine-config-operator#4552: OCPBUGS-38632: MCPs with RHEL nodes are degraded when a userCA bundle is added to the cluster

links to: RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update

Assignee: David Joshy

Reporter: Sergio Regidor de la Rosa

QA Contact: Sergio Regidor de la Rosa

Votes: 0

Watchers: 5

Created: 2024/08/19 11:44 AM

Updated: 2025/02/25 4:42 AM

Resolved: 2025/02/25 4:42 AM
