Bug
Resolution: Done
4.13
MCO Sprint 231
Bug Fix
Done
Description of problem:
The Machine Config Operator (MCO) reads the /etc/os-release and /usr/lib/os-release files to determine the underlying node OS so that it can branch on the OS version. The files are read using github.com/ashcrow/osrelease, and the ID, VARIANT_ID, and VERSION_ID fields are then thinly wrapped with helper functions.
Judging by their names, the helper functions infer the RHEL version from the VERSION_ID field. For example, there is a function called IsEL9(), which checks whether the VERSION_ID field is equal to 9. Furthermore, the unit tests for the helper functions assume that the VERSION_ID field holds the RHEL version (the value found in the RHEL_VERSION field). In practice, however, the VERSION_ID field appears to contain the OpenShift version, which breaks that assumption.
For example, the /etc/os-release and /usr/lib/os-release files contain the following information for an OpenShift 4.12 CI build:
NAME="Red Hat Enterprise Linux CoreOS"
ID="rhcos"
ID_LIKE="rhel fedora"
VERSION="412.86.202301311551-0"
VERSION_ID="4.12"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 412.86.202301311551-0 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::coreos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.12/"
BUG_REPORT_URL="https://access.redhat.com/labs/rhir/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.12"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.12"
OPENSHIFT_VERSION="4.12"
RHEL_VERSION="8.6"
OSTREE_VERSION="412.86.202301311551-0"
Notice that VERSION_ID contains the OCP version, not the RHEL version.
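To make the mismatch concrete, here is a minimal sketch that parses an abbreviated copy of the 4.12 content above using only the Go standard library. parseOSRelease is a hypothetical helper written for this illustration; it is not the github.com/ashcrow/osrelease API and not the MCO's actual code.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseOSRelease is a hypothetical helper (an assumption for illustration,
// not the ashcrow/osrelease API). It reads KEY=VALUE lines, skips blanks and
// comments, and strips optional surrounding quotes from the value.
func parseOSRelease(content string) map[string]string {
	fields := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue
		}
		fields[key] = strings.Trim(value, `"'`)
	}
	return fields
}

// rhcos412 is an abbreviated copy of the OpenShift 4.12 os-release content
// quoted in this report.
const rhcos412 = `ID="rhcos"
VERSION_ID="4.12"
PLATFORM_ID="platform:el8"
OPENSHIFT_VERSION="4.12"
RHEL_VERSION="8.6"
`

func main() {
	fields := parseOSRelease(rhcos412)
	// VERSION_ID carries the OCP version; RHEL_VERSION carries the RHEL version.
	fmt.Println("VERSION_ID:  ", fields["VERSION_ID"])
	fmt.Println("RHEL_VERSION:", fields["RHEL_VERSION"])
}
```

Any helper that compares VERSION_ID against a RHEL major version will therefore never match on these nodes.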
How reproducible:
Always
Steps to Reproduce:
- Launch a new cluster running on RHCOS 9 (run a clusterbot launch against PR https://github.com/openshift/machine-config-operator/pull/3485)
- Get the /etc/os-release file content from a random node:
$ oc debug "node/$(oc get nodes -o=jsonpath='{.items[0].metadata.name}')" -- cat /host/etc/os-release
- Use the Go code at https://gist.github.com/cheesesashimi/89184074cd2fe066232c512db4969015 to read the contents, modifying it to include the contents of the RHCOS 9 /etc/os-release file retrieved in Step 2.
Actual results:
NAME="Red Hat Enterprise Linux CoreOS"
ID="rhcos"
ID_LIKE="rhel fedora"
VERSION="413.90.202212151724-0"
VERSION_ID="4.13"
VARIANT="CoreOS"
VARIANT_ID=coreos
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 413.90.202212151724-0 (Plow)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::coreos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.13/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.13"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.13"
OPENSHIFT_VERSION="4.13"
RHEL_VERSION="9.0"
OSTREE_VERSION="413.90.202212151724-0"

(daemon.OperatingSystem) {
 ID: (string) (len=5) "rhcos",
 VariantID: (string) (len=6) "coreos",
 VersionID: (string) (len=4) "4.13"
}
IsEL(): true
IsEL9(): false
IsFCOS(): false
IsSCOS(): false
IsCoreOSVariant(): true
IsLikeTraditionalRHEL7(): false
ToPrometheusLabel(): RHCOS
Expected results:
Given the above input, I would have expected the code provided in the Gist above to produce output similar to this:
(daemon.OperatingSystem) {
 ID: (string) (len=5) "rhcos",
 VariantID: (string) (len=6) "coreos",
 VersionID: (string) (len=3) "9.0"
}
IsEL(): true
IsEL9(): true
IsFCOS(): false
IsSCOS(): false
IsCoreOSVariant(): true
IsLikeTraditionalRHEL7(): false
ToPrometheusLabel(): RHCOS
Additional info:
- We most likely need to adjust the OperatingSystem code to look for the RHEL_VERSION, where available. However, I would like someone from the CoreOS team to review the assumptions this makes.
- We should write an MCO e2e test that verifies this against a live node so that a failing test informs us if anything changes.
- We'll also need to account for the FCOS and SCOS cases.
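One possible shape for the adjustment suggested above is sketched below, assuming the os-release fields are already available as a map. The osInfo type, newOSInfo, and isEL9 are hypothetical names for illustration only; they are not the MCO's OperatingSystem code, and the sketch makes no attempt to cover FCOS or SCOS, which still need review by the CoreOS team.

```go
package main

import (
	"fmt"
	"strings"
)

// osInfo is a hypothetical stand-in for the MCO's OperatingSystem type.
type osInfo struct {
	id        string
	variantID string
	version   string // RHEL version when available, otherwise VERSION_ID
}

// newOSInfo prefers RHEL_VERSION and falls back to VERSION_ID when the
// RHEL_VERSION field is absent (assumed to be the case on traditional RHEL).
func newOSInfo(fields map[string]string) osInfo {
	version := fields["RHEL_VERSION"]
	if version == "" {
		version = fields["VERSION_ID"]
	}
	return osInfo{
		id:        fields["ID"],
		variantID: fields["VARIANT_ID"],
		version:   version,
	}
}

// isEL9 reports whether the ID is rhel or rhcos and the major component of
// the resolved version is 9. Restricting to these two IDs is an assumption.
func (o osInfo) isEL9() bool {
	if o.id != "rhel" && o.id != "rhcos" {
		return false
	}
	major, _, _ := strings.Cut(o.version, ".")
	return major == "9"
}

func main() {
	// Fields taken from the RHCOS 9 os-release content in "Actual results".
	rhcos9 := newOSInfo(map[string]string{
		"ID":           "rhcos",
		"VARIANT_ID":   "coreos",
		"VERSION_ID":   "4.13",
		"RHEL_VERSION": "9.0",
	})
	fmt.Println("version:", rhcos9.version)
	fmt.Println("isEL9():", rhcos9.isEL9())
}
```

Comparing only the major component avoids tying the check to a specific minor release such as 9.0.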
- blocks MCO-116 Support ssh-key-dir on RHCOS 9 (Closed)