Type: Bug
Resolution: Done-Errata
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.16
Component/s: Machine Config Operator
Labels:

Severity:
Moderate
Regression:
None
Sprint:
MCO Sprint 259
sprint_count:
1
Blocked:
False
Blocked Reason:

None

Release Note Text:

* Previously, if you enabled on-cluster layering (OCL) for your cluster and attempted to configure kernel arguments in the machine configuration, machine config pools (MCPs) and nodes entered a degraded state. This happened because of a configuration mismatch. With this release, a check for kernel arguments on OCL-enabled clusters ensures that the arguments are configured and applied to nodes in the cluster. This update prevents the mismatch that previously occurred between the machine configuration and the node configuration. (link:https://issues.redhat.com/browse/OCPBUGS-42081[*OCPBUGS-42081*])

Release Note Type:
Bug Fix
Release Note Status:
Done
Target Version:

4.17.z
Target Backport Versions:

4.17.z, 4.16.z

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

This is a clone of issue OCPBUGS-34647. The following is the description of the original issue:
—
Description of problem:

When we enable OCB functionality and create an MC that configures the enforcing=0 kernel argument, the MCP becomes degraded and reports this message:

              {
                  "lastTransitionTime": "2024-05-30T09:37:06Z",
                  "message": "Node ip-10-0-29-166.us-east-2.compute.internal is reporting: \"unexpected on-disk state validating against quay.io/mcoqe/layering@sha256:654149c7e25a1ada80acb8eedc3ecf9966a8d29e9738b39fcbedad44ddd15ed5: missing expected kernel arguments: [enforcing=0]\"",
                  "reason": "1 nodes are reporting degraded status on sync",
                  "status": "True",
                  "type": "NodeDegraded"
              },

Version-Release number of selected component (if applicable):

IPI on AWS

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.16.0-0.nightly-2024-05-30-021120   True        False         97m     Error while reconciling 4.16.0-0.nightly-2024-05-30-021120: the cluster operator olm is not available

How reproducible:

Always

Steps to Reproduce:

    1. Enable techpreview
$ oc patch featuregate cluster --type=merge -p '{"spec":{"featureSet": "TechPreviewNoUpgrade"}}'

    2. Configure a MachineOSConfig (MOSC) resource to enable OCB functionality in the worker pool

When we hit this problem, we were using the mcoqe quay repository.
We used a copy of the pull-secret for baseImagePullSecret and renderedImagePushSecret, with no currentImagePullSecret configured.

apiVersion: machineconfiguration.openshift.io/v1alpha1
kind: MachineOSConfig
metadata:
  name: worker
spec:
  machineConfigPool:
    name: worker
#  buildOutputs:
#    currentImagePullSecret:
#      name: ""
  buildInputs:
    imageBuilder:
      imageBuilderType: PodImageBuilder
    baseImagePullSecret:
      name: pull-copy 
    renderedImagePushSecret:
      name: pull-copy 
    renderedImagePushspec: "quay.io/mcoqe/layering:latest"

    3. Create an MC that sets the enforcing=0 kernel argument

{
    "kind": "List",
    "apiVersion": "v1",
    "metadata": {},
    "items": [
        {
            "apiVersion": "machineconfiguration.openshift.io/v1",
            "kind": "MachineConfig",
            "metadata": {
                "labels": {
                    "machineconfiguration.openshift.io/role": "worker"
                },
                "name": "change-worker-kernel-selinux-gvr393x2"
            },
            "spec": {
                "config": {
                    "ignition": {
                        "version": "3.2.0"
                    }
                },
                "kernelArguments": [
                    "enforcing=0"
                ]
            }
        }
    ]
}
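
To make the effect of this MachineConfig concrete, the sketch below pulls `kernelArguments` out of a `List` manifest shaped like the one in step 3 and computes which arguments would have to be added or removed relative to a node's current set. It is only an illustration, in Python, of the kind of kernel-argument comparison that the linked PR title ("Check for kernel arg diff") suggests; it is not the Machine Config Operator's code, and the helper names `collect_kernel_arguments` and `diff_kernel_arguments` are hypothetical.

```python
import json

# Trimmed copy of the MachineConfig List from step 3.
MANIFEST = """
{
    "kind": "List",
    "apiVersion": "v1",
    "items": [
        {
            "apiVersion": "machineconfiguration.openshift.io/v1",
            "kind": "MachineConfig",
            "metadata": {
                "labels": {"machineconfiguration.openshift.io/role": "worker"},
                "name": "change-worker-kernel-selinux-gvr393x2"
            },
            "spec": {"kernelArguments": ["enforcing=0"]}
        }
    ]
}
"""


def collect_kernel_arguments(manifest: dict) -> list[str]:
    """Return the kernelArguments of every MachineConfig in a List, in order."""
    kargs = []
    for item in manifest.get("items", []):
        if item.get("kind") == "MachineConfig":
            kargs.extend(item.get("spec", {}).get("kernelArguments", []))
    return kargs


def diff_kernel_arguments(current: list[str], desired: list[str]) -> dict:
    """Hypothetical helper: arguments to add and to remove, order preserved."""
    return {
        "add": [k for k in desired if k not in current],
        "remove": [k for k in current if k not in desired],
    }


desired = collect_kernel_arguments(json.loads(MANIFEST))
print(diff_kernel_arguments(current=[], desired=desired))
```

For a node with no extra kernel arguments, this prints `{'add': ['enforcing=0'], 'remove': []}`. A non-empty diff is the situation in which the node would need its kernel arguments updated for validation to pass.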

Actual results:

The worker MCP is degraded reporting this message:

$ oc get mcp worker -oyaml
....

              {
                  "lastTransitionTime": "2024-05-30T09:37:06Z",
                  "message": "Node ip-10-0-29-166.us-east-2.compute.internal is reporting: \"unexpected on-disk state validating against quay.io/mcoqe/layering@sha256:654149c7e25a1ada80acb8eedc3ecf9966a8d29e9738b39fcbedad44ddd15ed5: missing expected kernel arguments: [enforcing=0]\"",
                  "reason": "1 nodes are reporting degraded status on sync",
                  "status": "True",
                  "type": "NodeDegraded"
              },
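
When triaging a pool in this state, it can help to pull the missing arguments out of the `NodeDegraded` message programmatically. The Python sketch below assumes the conditions have already been loaded as a list of dicts shaped like the entry above, and that the message keeps the `missing expected kernel arguments: [...]` wording seen in this report. The function name is hypothetical, the message text is not a stable API, and the sketch handles only one node per message.

```python
import re

# Condition copied from the report above.
CONDITIONS = [
    {
        "lastTransitionTime": "2024-05-30T09:37:06Z",
        "message": (
            "Node ip-10-0-29-166.us-east-2.compute.internal is reporting: "
            '"unexpected on-disk state validating against '
            "quay.io/mcoqe/layering@sha256:"
            "654149c7e25a1ada80acb8eedc3ecf9966a8d29e9738b39fcbedad44ddd15ed5: "
            'missing expected kernel arguments: [enforcing=0]"'
        ),
        "reason": "1 nodes are reporting degraded status on sync",
        "status": "True",
        "type": "NodeDegraded",
    }
]

NODE_NAME = re.compile(r"Node (\S+) is reporting")
MISSING_KARGS = re.compile(r"missing expected kernel arguments: \[([^\]]*)\]")


def missing_kernel_arguments(conditions):
    """Map a degraded node's name to the kernel arguments its message lists as missing."""
    result = {}
    for cond in conditions:
        # Only active NodeDegraded conditions are of interest.
        if cond.get("type") != "NodeDegraded" or cond.get("status") != "True":
            continue
        message = cond.get("message", "")
        node = NODE_NAME.search(message)
        kargs = MISSING_KARGS.search(message)
        if node and kargs:
            # Assumption: multiple arguments are space-separated inside the brackets.
            result[node.group(1)] = kargs.group(1).split()
    return result


print(missing_kernel_arguments(CONDITIONS))
```

With the condition from this report, the result maps `ip-10-0-29-166.us-east-2.compute.internal` to `['enforcing=0']`.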

Expected results:

The MC should be applied without problems, and the node should boot with the enforcing=0 kernel argument so that SELinux honors it.
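
One way to confirm the expected result on a node is to compare the configured arguments against the kernel command line in `/proc/cmdline`. The Python sketch below performs that comparison on a command-line string. The sample strings are made up for illustration, and the function names are hypothetical. It mirrors the idea behind the "missing expected kernel arguments" message rather than the Machine Config Daemon's actual validation, and it splits on whitespace only, so quoted arguments that contain spaces are not handled.

```python
from pathlib import Path


def missing_from_cmdline(expected, cmdline):
    """Return the expected kernel arguments that do not appear as tokens in cmdline."""
    present = set(cmdline.split())
    return [arg for arg in expected if arg not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    return Path(path).read_text().strip()


# Illustrative command lines, not taken from the affected cluster.
sample = "BOOT_IMAGE=(hd0,gpt3)/vmlinuz root=UUID=0000 rw enforcing=0"
print(missing_from_cmdline(["enforcing=0"], sample))               # []
print(missing_from_cmdline(["enforcing=0"], "root=UUID=0000 rw"))  # ['enforcing=0']
```

On a node, `missing_from_cmdline(["enforcing=0"], read_cmdline())` returning an empty list would indicate that the argument reached the running kernel.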

Additional info:

blocks

OCPBUGS-42744 In OCB, "enforcing=0" kernel argument is degrading the MachineConfigPool

Closed

clones

OCPBUGS-34647 In OCB, "enforcing=0" kernel argument is degrading the MachineConfigPool

Closed

is blocked by

OCPBUGS-34647 In OCB, "enforcing=0" kernel argument is degrading the MachineConfigPool

Closed

is cloned by

OCPBUGS-42744 In OCB, "enforcing=0" kernel argument is degrading the MachineConfigPool

Closed

links to

openshift/machine-config-operator#4595: [release-4.17] OCPBUGS-42081: Check for kernel arg diff in updateOnClusterBuild

RHBA-2024:7922 OpenShift Container Platform 4.17.z bug fix update


Assignee: Urvashi Mohnani

Reporter: OpenShift Prow Bot

QA Contact: Sergio Regidor de la Rosa

Votes: 0

Watchers: 4

Created: 2024/09/17 1:03 PM

Updated: 2024/10/16 2:41 AM

Resolved: 2024/10/16 2:41 AM
