Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: None
Affects Version/s: 4.12.0
Component/s: Machine Config Operator
Labels:
- triaged

Test Coverage: +
Severity: Moderate
Regression: None
Release Blocker: Rejected
Blocked: False
Blocked Reason: None
Target Version: 4.13.0

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:

When we create a MachineConfig (MC) that deploys a systemd unit with both contents and mask: true, the node becomes degraded because of a config drift error like this one:

E1118 16:41:42.485314    1900 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-e701d8c471184e3a66756b26b4b7dd33: mode mismatch for file: "/etc/systemd/system/maks-and-contents.service"; expected: -rw-r--r--/420/0644; received: Lrwxrwxrwx/134218239/01000000777

Version-Release number of selected component (if applicable):

4.12.0-0.nightly-2022-11-19-191518

How reproducible:

Always

Steps to Reproduce:

1. Create this MachineConfig resource:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: mask-and-content
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
      - name: maks-and-contents.service
        mask: true
        contents: |
          [Unit]
          Description=Just random content

Actual results:

The worker MachineConfigPool (MCP) becomes degraded, and the machine-config-daemon (MCD) reports this error:

E1118 16:41:42.485314    1900 writer.go:200] Marking Degraded due to: unexpected on-disk state validating against rendered-worker-e701d8c471184e3a66756b26b4b7dd33: mode mismatch for file: "/etc/systemd/system/maks-and-contents.service"; expected: -rw-r--r--/420/0644; received: Lrwxrwxrwx/134218239/01000000777

Expected results:

Before the config drift functionality was added, the contents of a masked unit were simply ignored.

If this configuration is not allowed, the error message should be more descriptive.

Additional info:

Restoring the desiredConfig value on the degraded nodes is not enough. These are the steps to recover a node:

1. Edit the node's annotations so that desiredConfig equals currentConfig.
2. Remove the file /etc/machine-config-daemon/currentconfig on the node.
3. Flush the journal on the node:
$ journalctl --rotate; journalctl --vacuum-time=1s

4. Create the force file on the node:
$ touch /run/machine-config-daemon-force

links to

OCP-56614 - [MCO][OCPBUGS-3909] Create unit with content and mask=true

openshift/machine-config-operator#3437: OCPBUGS-3909: Don't validate contents and mode for masked units

Assignee: Zack Zlotnik

Reporter: Sergio Regidor de la Rosa

QA Contact: Sergio Regidor de la Rosa

Votes: 0

Watchers: 5

Created: 2022/11/21 8:46 AM

Updated: 2023/05/23 2:28 PM

Resolved: 2023/05/17 10:41 PM
