OpenShift Bugs / OCPBUGS-42415

OCP Node Fails SSH Authentication After Modifying sshd_config and Triggers MachineConfigPool Content Mismatch Error

    • Important
    • None
    • 1
    • False

      Description of problem:

      
      After `/etc/ssh/sshd_config` is edited directly on a node to allow password-based SSH access (`PasswordAuthentication yes`), the OCP node reports an "unexpected on-disk state" error because the file on disk no longer matches the content the MachineConfig expects. Public key SSH authentication then fails, and the MachineConfigPool reports errors.
      
          

      Version-Release number of selected component (if applicable):

      OCP 4.14
          

      How reproducible:

      This issue is consistently reproducible after making changes to the sshd_config file on a worker node and attempting to SSH into the node.
          

      Steps to Reproduce:

      1. Edit `/etc/ssh/sshd_config` on the node and change `PasswordAuthentication` from `no` to `yes`.
      2. Restart the sshd service on the node.
      3. Attempt to SSH into the node as the core user using a public key.
      4. Observe the MachineConfigPool error about unexpected on-disk state.
      
          

      Actual results:

      1. SSH connection fails with the error: `Permission denied (publickey).`
      2. MCP reports a mismatch between the expected and actual contents of /etc/ssh/sshd_config.
      3. Attempts to use password-based authentication result in an error.
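
The "unexpected on-disk state" condition in result 2 amounts to the file on disk differing from the content the MachineConfig expects. The following Python sketch illustrates that kind of comparison. It is not the Machine Config Daemon's actual implementation; the function name `on_disk_drift` and the sample file contents are hypothetical and chosen only for illustration.

```python
import difflib


def on_disk_drift(expected: str, actual: str,
                  path: str = "/etc/ssh/sshd_config") -> list[str]:
    """Return unified-diff lines between expected and on-disk content.

    The comparison is line by line; an empty list means no line differs.
    """
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile=f"{path} (expected)",
            tofile=f"{path} (on disk)",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical contents: what the MachineConfig expects vs. the manual edit.
    expected = "PermitRootLogin no\nPasswordAuthentication no\n"
    actual = "PermitRootLogin no\nPasswordAuthentication yes\n"
    for line in on_disk_drift(expected, actual):
        print(line)
```

A non-empty result for a managed file corresponds to the situation described in this report: the manual edit made the node's file diverge from the expected content.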
      
          

      Expected results:

      1. SSH connection using public key authentication should work as expected.
      2. SSH configuration changes applied through the supported mechanism (MachineConfig objects) should roll out without causing MachineConfigPool errors.
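
Expected result 2 refers to managing SSH configuration through MachineConfig rather than by editing files on the node. As a rough sketch of what that could look like, the Python below builds a MachineConfig manifest that writes an sshd configuration file through Ignition. Several details are assumptions and not taken from this report: the object name `99-worker-sshd-password-auth`, the drop-in path `/etc/ssh/sshd_config.d/10-password-auth.conf`, the Ignition version `3.2.0`, and the file mode. Whether sshd honors such a drop-in depends on the `Include` ordering in the node image, so treat this as illustrative only.

```python
import json
from urllib.parse import quote


def build_sshd_machineconfig(name: str, role: str, path: str, content: str) -> dict:
    """Return a MachineConfig manifest (as a dict) that writes `content` to `path`."""
    # Ignition takes file contents as a data URL; percent-encode the text.
    source = "data:," + quote(content, safe="")
    return {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "kind": "MachineConfig",
        "metadata": {
            "name": name,
            # This label selects which MachineConfigPool picks up the object.
            "labels": {"machineconfiguration.openshift.io/role": role},
        },
        "spec": {
            "config": {
                "ignition": {"version": "3.2.0"},  # assumed spec version
                "storage": {
                    "files": [
                        {
                            "path": path,
                            "mode": 0o600,  # serialized as decimal 384
                            "overwrite": True,
                            "contents": {"source": source},
                        }
                    ]
                },
            }
        },
    }


if __name__ == "__main__":
    mc = build_sshd_machineconfig(
        name="99-worker-sshd-password-auth",                  # hypothetical name
        role="worker",
        path="/etc/ssh/sshd_config.d/10-password-auth.conf",  # hypothetical path
        content="PasswordAuthentication yes\n",
    )
    print(json.dumps(mc, indent=2))
```

The printed JSON could be saved to a file and applied with `oc apply -f`. Because the change is then part of the rendered configuration, the expected on-disk content and the actual file should stay in agreement.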
      
          

      Additional info:

      
      1. This issue blocks the ability to modify SSH settings for troubleshooting or administrative purposes, particularly when working with OCP worker nodes.
      2. The customer is facing this issue while preparing to migrate their applications to production, and it is becoming a critical blocker.
      
          

              jerzhang@redhat.com Yu Qi Zhang
              rhn-support-vyoganan Vivek Yoganand A
              Sergio Regidor de la Rosa Sergio Regidor de la Rosa
              Anand Paladugu