CMP-3618 (Cloud Infrastructure Security & Compliance)

"rhcos4-moderate" remediation leads to "chrony-wait.service" timeouts

    • Quality / Stability / Reliability
    • False
    • False
    • Moderate

      Description of problem:

      After upgrading to OpenShift Container Platform 4.18.15, the customer noticed that "chrony-wait.service" is in the failed state on all Nodes:

      $ sudo systemctl status chrony-wait.service 
      × chrony-wait.service - Wait for chrony to synchronize system clock
           Loaded: loaded (/usr/lib/systemd/system/chrony-wait.service; disabled; preset: disabled)
           Active: failed (Result: timeout) since Wed 2025-07-09 12:27:39 UTC; 42min ago
             Docs: man:chronyc(1)
         Main PID: 1434 (code=exited, status=1/FAILURE)
              CPU: 113ms
      
      Jul 09 12:24:39 xxx-01-worker-az1-dp6ck systemd[1]: Starting Wait for chrony to synchronize system clock...
      Jul 09 12:27:39 xxx-01-worker-az1-dp6ck systemd[1]: chrony-wait.service: start operation timed out. Terminating.
      Jul 09 12:27:39 xxx-01-worker-az1-dp6ck systemd[1]: chrony-wait.service: Main process exited, code=exited, status=1/FAILURE
      Jul 09 12:27:39 xxx-01-worker-az1-dp6ck systemd[1]: chrony-wait.service: Failed with result 'timeout'.
      Jul 09 12:27:39 xxx-01-worker-az1-dp6ck systemd[1]: Failed to start Wait for chrony to synchronize system clock.
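
      As a quick check of the log above, the time between "Starting" and "start operation timed out" can be computed as follows. This is a minimal sketch that assumes GNU date (for the -d option) and takes the year from the "Active:" line. The result would match a 180-second start timeout, but the unit's actual TimeoutStartSec value is not shown in this report.

```shell
# Timestamps copied from the journal lines above (UTC, year from "Active:").
start=$(date -u -d '2025-07-09 12:24:39' +%s)
stop=$(date -u -d '2025-07-09 12:27:39' +%s)

# Seconds the unit spent in "activating" before systemd terminated it.
elapsed=$((stop - start))
echo "chrony-wait.service waited ${elapsed}s before the start operation timed out"
```

      In other words, the service did not fail immediately: it kept retrying for the whole window and never reached the daemon.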

      The "chronyd.service" is working as expected. In OCPBUGS-59281 we then discovered that the "rhcos4-moderate" policy recommends setting the following in "/etc/chrony.conf":

      # Set chronyd as client-only.
      port 0
      
      # Disable chronyc from the network
      cmdport 0

      This is consistent with the chrony rules in the "rhcos4-moderate" profile.

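      However, with "cmdport 0" chronyd does not open its UDP command port, so the "chronyc -h 127.0.0.1,::1" call made by "chrony-wait.service" cannot reach the daemon. The service then times out with the following error messages: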

      # /usr/bin/chronyc -h 127.0.0.1,::1 waitsync 0 0.1 0.0 1
      506 Cannot talk to daemon
      506 Cannot talk to daemon
      [..]
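
      To tell whether a Node is affected, one could check the chrony configuration for the "cmdport 0" directive. The following is only a sketch: the helper name chrony_cmdport_disabled is hypothetical, the demo runs against a temporary copy of the settings quoted above rather than the real "/etc/chrony.conf", and it assumes that the last "cmdport" line in the file is the one that applies.

```shell
# Hypothetical helper (not part of chrony): succeeds when the given config
# sets "cmdport 0", i.e. chronyd will not open its UDP command port.
chrony_cmdport_disabled() {
    conf="$1"
    # Matching on $1 skips comment lines such as "# Disable chronyc ...".
    # Assumption: if the directive appears more than once, the last one applies.
    port=$(awk '$1 == "cmdport" { p = $2 } END { print p }' "$conf")
    [ "$port" = "0" ]
}

# Demo against a temporary copy of the settings quoted in this report.
demo_conf=$(mktemp)
printf '%s\n' '# Set chronyd as client-only.' 'port 0' '' \
    '# Disable chronyc from the network' 'cmdport 0' > "$demo_conf"

if chrony_cmdport_disabled "$demo_conf"; then
    result="cmdport disabled: chronyc -h 127.0.0.1,::1 has no UDP port to reach"
else
    result="cmdport open"
fi
echo "$result"
rm -f "$demo_conf"
```

      Note that "cmdport 0" does not disable chronyd's Unix domain command socket, which chronyc tries first when it is run without "-h". Whether "chrony-wait.service" should rely on that socket instead is not established by this report.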

      We observe this issue on all clusters that were upgraded to OpenShift Container Platform 4.18.15.

      Version-Release number of selected component (if applicable):

      OpenShift Container Platform 4.18.15

      How reproducible:

      Always

      Steps to Reproduce:

      1. Install a cluster with OpenShift Container Platform 4.18.15
      2. Install the Compliance Operator, apply the "rhcos4-moderate" profile, and remediate the "chrony" findings mentioned above
      3. Restart the OpenShift Nodes
      4. Log into an OpenShift Node using SSH
      5. Observe that the login message already shows there is a failed service ("chrony-wait.service")
      6. Execute "sudo systemctl status chrony-wait.service"

      Actual results:

      With the remediation applied, the service reports: "chrony-wait.service: Failed with result 'timeout'."

      Expected results:

      • With the profile "rhcos4-moderate" applied, there are no failed services.
      • The chrony-wait service finishes as expected.

      Additional info:

      • Findings in OCPBUGS-59281
      • sosreport available in attached Support Case
      • must-gather available in attached Support Case

              Unassigned
              rhn-support-skrenger Simon Krenger
              Xiaojie Yuan
              Maria Simon Marcos