Bug
Resolution: Done
4.13
Quality / Stability / Reliability
MCO Sprint 231
Bug Fix
Description of problem:
The Machine Config Operator (MCO) reads the /etc/os-release and /usr/lib/os-release files to determine the underlying node OS so that it can branch on the OS version. The files are read using github.com/ashcrow/osrelease, and the ID, VARIANT_ID, and VERSION_ID fields are then thinly wrapped with helper functions.
Judging by their names, the helper functions infer the RHEL version from the VERSION_ID field. For example, IsEL9() checks whether the VERSION_ID field is equal to 9. Furthermore, the unit tests for the helper functions assume that VERSION_ID holds the RHEL version (the value that RHEL_VERSION carries), not the OpenShift version. In practice, however, the VERSION_ID field appears to contain the OpenShift version, which breaks that assumption.
For example, the /etc/os-release and /usr/lib/os-release files contain the following information for an OpenShift 4.12 CI build:
NAME="Red Hat Enterprise Linux CoreOS"
ID="rhcos"
ID_LIKE="rhel fedora"
VERSION="412.86.202301311551-0"
VERSION_ID="4.12"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 412.86.202301311551-0 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::coreos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.12/"
BUG_REPORT_URL="https://access.redhat.com/labs/rhir/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.12"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.12"
OPENSHIFT_VERSION="4.12"
RHEL_VERSION="8.6"
OSTREE_VERSION="412.86.202301311551-0"
Notice that VERSION_ID contains the OCP version, not the RHEL version.
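To make the mismatch concrete, here is a minimal Go sketch that parses an excerpt of the os-release content above and prints both fields. It uses only the standard library; parseOSRelease is a hypothetical helper written for this illustration and is not part of the MCO or of github.com/ashcrow/osrelease, and it handles only simple KEY=VALUE lines.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseOSRelease is a hypothetical helper (not MCO code). It reads
// KEY=VALUE lines, skips blanks and comments, and strips optional
// surrounding double quotes from each value.
func parseOSRelease(content string) map[string]string {
	fields := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue
		}
		fields[key] = strings.Trim(value, `"`)
	}
	return fields
}

// sample is an excerpt of the OpenShift 4.12 os-release content quoted above.
const sample = `ID="rhcos"
VERSION_ID="4.12"
PLATFORM_ID="platform:el8"
OPENSHIFT_VERSION="4.12"
RHEL_VERSION="8.6"
`

func main() {
	fields := parseOSRelease(sample)
	fmt.Println("VERSION_ID:  ", fields["VERSION_ID"])   // the OCP version
	fmt.Println("RHEL_VERSION:", fields["RHEL_VERSION"]) // the RHEL version
}
```

Any helper that keys off VERSION_ID alone sees 4.12 here and never the 8.6 carried by RHEL_VERSION.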
How reproducible:
Always
Steps to Reproduce:
- Launch a new cluster running on RHCOS 9 (Run clusterbot launch against PR: https://github.com/openshift/machine-config-operator/pull/3485)
- Get the /etc/os-release file content from a random node:
$ oc debug "node/$(oc get nodes -o=jsonpath='{.items[0].metadata.name}')" -- cat /host/etc/os-release
- Use the Go code at https://gist.github.com/cheesesashimi/89184074cd2fe066232c512db4969015 to read the contents, modifying it to include the RHCOS 9 /etc/os-release content retrieved in Step 2.
Actual results:
NAME="Red Hat Enterprise Linux CoreOS"
ID="rhcos"
ID_LIKE="rhel fedora"
VERSION="413.90.202212151724-0"
VERSION_ID="4.13"
VARIANT="CoreOS"
VARIANT_ID=coreos
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 413.90.202212151724-0 (Plow)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::coreos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.13/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.13"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.13"
OPENSHIFT_VERSION="4.13"
RHEL_VERSION="9.0"
OSTREE_VERSION="413.90.202212151724-0"
(daemon.OperatingSystem) {
ID: (string) (len=5) "rhcos",
VariantID: (string) (len=6) "coreos",
VersionID: (string) (len=4) "4.13"
}
IsEL(): true
IsEL9(): false
IsFCOS(): false
IsSCOS(): false
IsCoreOSVariant(): true
IsLikeTraditionalRHEL7(): false
ToPrometheusLabel(): RHCOS
Expected results:
Given the above input, I would expect the code in the Gist to produce output similar to this:
(daemon.OperatingSystem) {
ID: (string) (len=5) "rhcos",
VariantID: (string) (len=6) "coreos",
VersionID: (string) (len=4) "9.0"
}
IsEL(): true
IsEL9(): true
IsFCOS(): false
IsSCOS(): false
IsCoreOSVariant(): true
IsLikeTraditionalRHEL7(): false
ToPrometheusLabel(): RHCOS
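The gap between the actual and expected results comes down to which string feeds the major-version check. The sketch below illustrates this with a hypothetical isEL9 function; it is a stand-in written for this report and the MCO's real IsEL9() may be implemented differently.

```go
package main

import (
	"fmt"
	"strings"
)

// isEL9 is a hypothetical stand-in for a major-version check.
// It accepts "9" or any "9.x" version string.
func isEL9(version string) bool {
	return version == "9" || strings.HasPrefix(version, "9.")
}

func main() {
	fmt.Println(isEL9("4.13")) // VERSION_ID from the RHCOS 9 node above
	fmt.Println(isEL9("9.0"))  // RHEL_VERSION from the same node
}
```

With VERSION_ID as input the check reports false, matching the actual results; with RHEL_VERSION it reports true, matching the expected results.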
Additional info:
- We most likely need to adjust the OperatingSystem code to read the RHEL_VERSION field where available. However, I would like someone from the CoreOS team to review the assumptions this makes.
- We should write an MCO e2e test that verifies this against a live node so that a failing test alerts us if this ever changes.
- We'll also need to account for the FCOS and SCOS cases.
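One possible shape for the first point is sketched below, assuming the os-release fields are already available as a map. baseOSVersion is a hypothetical name, not existing MCO code, and whether falling back to VERSION_ID is correct for FCOS and SCOS is exactly the assumption that needs review.

```go
package main

import "fmt"

// baseOSVersion is a hypothetical helper: it prefers RHEL_VERSION when the
// os-release data provides a non-empty value and otherwise falls back to
// VERSION_ID.
func baseOSVersion(fields map[string]string) string {
	if v, ok := fields["RHEL_VERSION"]; ok && v != "" {
		return v
	}
	return fields["VERSION_ID"]
}

func main() {
	// Values taken from the RHCOS 9 os-release content in this report.
	rhcos9 := map[string]string{
		"ID":           "rhcos",
		"VERSION_ID":   "4.13",
		"RHEL_VERSION": "9.0",
	}
	// Hypothetical input without RHEL_VERSION, to exercise the fallback.
	noRHELVersion := map[string]string{
		"ID":         "example",
		"VERSION_ID": "37",
	}

	fmt.Println(baseOSVersion(rhcos9))
	fmt.Println(baseOSVersion(noRHELVersion))
}
```

Keeping the fallback in one function would give a unit test or e2e test a single place to assert on, whatever the final field selection turns out to be.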
- blocks: MCO-116 Support ssh-key-dir on RHCOS 9 (Closed)