Type: Bug
Resolution: Done-Errata
Priority: Major
Fix Version/s: None
Affects Version/s: 4.13
Component/s: Machine Config Operator
Labels:
- good-first-issue
- mco-triaged

Severity:
Moderate
Regression:
No
Release Blocker:
Rejected
Blocked:
False
Blocked Reason:

None
Target Version:

4.13.z

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

This is a clone of issue ~~OCPBUGS-11832~~. The following is the description of the original issue:
—
Description of problem:

While trying to update build01 from 4.13.rc2 to 4.13.rc3, the MCO degraded when it tried to upgrade the first master node. The error was:

E0414 15:42:29.597388 2323546 writer.go:200] Marking Degraded due to: exit status 1

Which I mapped to this line:
https://github.com/openshift/machine-config-operator/blob/release-4.13/pkg/daemon/update.go#L1551

I think this error can be improved since it is a bit confusing, but that's not the main problem.

We noticed that the actual issue was an existing "/home/core/.ssh" directory, which seemed to have been created by 4.13.rc2 (but could have been earlier) and belonged to the root user. As a result, when we attempted to create the folder by hand via runuser core, it failed with permission denied (and since we return the exec status, I think it just returned status 1 and not this error message).
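To illustrate why only "exit status 1" showed up, here is a minimal Python sketch of the difference between returning the bare exit status and including the command's stderr in the error. It is not the MCO's Go code: `run_with_context` is a hypothetical helper, and the child process only simulates a command failing with a permission error.

```python
import subprocess
import sys


def run_with_context(cmd):
    """Run cmd; on failure, raise an error that includes its stderr."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        stderr = result.stderr.strip()
        raise RuntimeError(
            f"{cmd[0]} failed with exit status {result.returncode}: {stderr}"
        )
    return result.stdout


# Simulated failing command (assumption: stands in for the real mkdir call).
failing = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('Permission denied'); sys.exit(1)",
]

try:
    run_with_context(failing)
except RuntimeError as err:
    # The message now carries both the exit status and the reason.
    message = str(err)
    print(message)
```

With the stderr text attached, the degraded message would point at the permission problem instead of only reporting the status code.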

I am currently not sure whether we introduced something that caused this issue. An SSH key (only on the master pool) had been present in the build01 cluster for 600 days already, so it must have worked in the past?

Workaround: delete the .ssh folder and let the MCD recreate it.
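Before applying the workaround, it may help to confirm that ownership is actually the problem. The following is a rough Python sketch of such a check, assuming a Unix host; `owner_mismatch` is a hypothetical helper, and the demonstration uses a throwaway directory rather than /home/core/.ssh.

```python
import os
import tempfile


def owner_mismatch(path, expected_uid):
    """Return True if path exists and is not owned by expected_uid."""
    try:
        return os.stat(path).st_uid != expected_uid
    except FileNotFoundError:
        return False  # nothing to fix; the directory can be created fresh


# Demonstration on a throwaway directory owned by the current user.
demo_dir = tempfile.mkdtemp()
current_uid = os.geteuid()
print(owner_mismatch(demo_dir, current_uid))      # owner matches
print(owner_mismatch(demo_dir, current_uid + 1))  # pretend another user is expected
```

On an affected node, the equivalent check would compare the owner of /home/core/.ssh against the uid of the core user.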

Version-Release number of selected component (if applicable):

4.13.rc3

How reproducible:

Uncertain, but it shouldn't be very high; otherwise I think we would have run into this in CI much more often.

Steps to Reproduce:

1. create some 4.12 cluster with sshkey
2. upgrade to 4.13.rc2
3. upgrade to 4.13.rc3

Actual results:

Expected results:

Additional info:

clones

OCPBUGS-11832 SSHkeys fails to write on upgrade to 4.13.rc3

Closed

is blocked by

OCPBUGS-11832 SSHkeys fails to write on upgrade to 4.13.rc3

Closed

links to

openshift/machine-config-operator#3879: [release-4.13] OCPBUGS-17997: SSHkeys fails to write on upgrade to 4.13.rc3

RHBA-2023:5011 OpenShift Container Platform 4.13.z bug fix update

Assignee: Ines Qian (Inactive)

Reporter: OpenShift Prow Bot

QA Contact: Sergio Regidor de la Rosa

Votes: 0

Watchers: 6

Created: 2023/08/22 8:57 PM

Updated: 2024/01/10 5:45 PM

Resolved: 2023/09/12 6:02 AM
