OCPBUGS-3864 (OpenShift Bugs)

CO should warn if non-default roles are used but the kubelet roles variables have not been tailored


Details

    • 2
    • CMP Sprint 56, CMP Sprint 57
    • 2
    • Rejected
    • False
    • None
    • Cause: Some Compliance Operator remediations require variables as input.

      Consequence: Remediations whose variables are not set can be applied cluster-wide and leave nodes stuck, even though the remediation appears to have been applied.

      Fix: Update the Compliance Operator, rerun the scan, and reapply the remediations.

      Result: The Compliance Operator now validates whether a variable must be supplied through a TailoredProfile for a particular remediation.
    • Bug Fix
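
To illustrate the TailoredProfile mentioned in the release note above, the following sketch writes a manifest that points the kubelet role variables at a custom pool. It is a minimal example under stated assumptions, not the fix shipped in the operator. The profile name `pci-dss-node-wrscan` and the file name are hypothetical, the pool name `wrscan` is taken from this report, and the variable names `ocp4-var-role-master` and `ocp4-var-role-worker` are assumed from the Compliance Operator content and should be confirmed on the cluster before anything is applied.

```shell
# Write a TailoredProfile manifest (hypothetical name: pci-dss-node-wrscan).
# Assumption: the role variables are named ocp4-var-role-master and
# ocp4-var-role-worker. List the variables available on the cluster first:
#   oc get variables.compliance.openshift.io -n openshift-compliance
cat > tailored-profile-wrscan.yaml <<'EOF'
apiVersion: compliance.openshift.io/v1alpha1
kind: TailoredProfile
metadata:
  name: pci-dss-node-wrscan
  namespace: openshift-compliance
spec:
  extends: ocp4-pci-dss-node
  title: PCI-DSS node profile tailored for the wrscan pool
  description: Sets the kubelet role variables to the custom wrscan role
  setValues:
  - name: ocp4-var-role-master
    value: wrscan
    rationale: scan targets the custom wrscan pool
  - name: ocp4-var-role-worker
    value: wrscan
    rationale: scan targets the custom wrscan pool
EOF

# Apply it once the variable names have been confirmed:
#   oc apply -f tailored-profile-wrscan.yaml
```

A ScanSettingBinding would then reference this TailoredProfile instead of the stock profile, so that the generated kubelet remediations target the custom pool rather than the default worker pool.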

    Description

      Description of problem:

      A node ends up in the NotReady,SchedulingDisabled state after the ComplianceRemediation for ocp4-pci-dss-node-wrscan-kubelet-enable-protect-kernel-defaults is applied.
      I ran the test case on an OCP 4.11.13 cluster on the Power architecture.

      Version-Release number of selected component (if applicable):

      compliance-operator.v0.1.59

      How reproducible:

      Every time

      Steps to Reproduce:

      I have attached the detailed steps, with logs.

      1. Install the Compliance Operator.
      2. Label one RHCOS worker node out of all the workers.
      3. Create a custom MachineConfigPool to bring that node into the wrscan pool.
      4. Create an auto-apply ScanSetting with remediations enabled.
      5. Create a ScanSettingBinding that uses the auto-apply ScanSetting.
      6. Check all failed rules through the ComplianceCheckResult objects.
      7. Check that remediations are applied for all rules except ocp4-pci-dss-node-kubelet-enable-protect-kernel-defaults.
      8. Verify that KubeletConfigs are created for the Compliance Operator and that MachineConfigs are created for the ocp4-kubelet-enable-protect-kernel-sysctl rule to apply the remediation.
      9. Rerun the scan.
      10. Check the status of rule ocp4-kubelet-enable-protect-kernel-defaults and confirm that the MachineConfigPool has been updated after the rescan.
      
      After these steps, one of the worker nodes is stuck in the NotReady,SchedulingDisabled state.
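
Steps 2 and 3 can be sketched as follows. This is an assumption about what those steps involve, not a copy of the attached steps: `<node-name>` is a placeholder, the file name is hypothetical, and the manifest follows the usual pattern for a custom pool that inherits the worker MachineConfigs.

```shell
# Step 3 (sketch): manifest for a custom pool named wrscan that also
# picks up the MachineConfigs of the worker role.
cat > mcp-wrscan.yaml <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: wrscan
  labels:
    pools.operator.machineconfiguration.openshift.io/wrscan: ""
spec:
  machineConfigSelector:
    matchExpressions:
    - key: machineconfiguration.openshift.io/role
      operator: In
      values: [worker, wrscan]
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/wrscan: ""
EOF

# Step 2 (sketch): label one worker so the pool's nodeSelector matches it,
# then create the pool. <node-name> is a placeholder.
#   oc label node <node-name> node-role.kubernetes.io/wrscan=
#   oc apply -f mcp-wrscan.yaml
```

The node keeps its worker role and gains the wrscan role, which matches the `worker,wrscan` entry in the node listing under "Actual results".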

      Actual results:

      # oc get mcp
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      master   rendered-master-7db0bc2d0d2023b1ea18079c86220edc   True      False      False      3              3                   3                     0                      128m
      worker   rendered-worker-0a178a44beebe95c2f912f3868bc1bad   False     True       False      2              0                   0                     0                      128m
      wrscan   rendered-wrscan-f8e480d30438d642bd4c6792b4e48f1b   True      False      False      1              1                   1                     0                      56m
      
      # oc get node
      NAME                                           STATUS                        ROLES           AGE    VERSION
      lon06-master-0.rdr-vard-ocp-411l-upi.ibm.com   Ready                         master          136m   v1.24.6+5157800
      lon06-master-1.rdr-vard-ocp-411l-upi.ibm.com   Ready                         master          136m   v1.24.6+5157800
      lon06-master-2.rdr-vard-ocp-411l-upi.ibm.com   Ready                         master          130m   v1.24.6+5157800
      lon06-worker-0.rdr-vard-ocp-411l-upi.ibm.com   NotReady,SchedulingDisabled   worker          113m   v1.24.6+5157800
      lon06-worker-1.rdr-vard-ocp-411l-upi.ibm.com   Ready                         worker,wrscan   111m   v1.24.6+5157800
      lon06-worker-2.rdr-vard-ocp-411l-upi.ibm.com   Ready                         worker          110m   v1.24.6+5157800
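
When rechecking after a rescan, a small filter can pick the stuck nodes out of a listing like the one above. This is a convenience sketch only; the function name `not_ready_nodes` is hypothetical, and it assumes the default `oc get nodes` column layout with the node name in the first column and STATUS in the second.

```shell
# Print the names of nodes whose STATUS column contains NotReady.
# Reads `oc get nodes` output on stdin; NR > 1 skips the header row.
not_ready_nodes() {
  awk 'NR > 1 && $2 ~ /NotReady/ { print $1 }'
}

# Usage on a live cluster:
#   oc get nodes | not_ready_nodes
```

Run against the listing in this report, the only row whose STATUS contains NotReady is lon06-worker-0, a plain worker node outside the wrscan pool.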

      Expected results:

      All nodes should be updated and healthy.

      Additional info:

       


          People

            jhrozek@redhat.com Jakub Hrozek
            vahirwad Varad Ahirwadkar
            Xiaojie Yuan Xiaojie Yuan
            Votes: 0
            Watchers: 6
