OpenShift Bugs / OCPBUGS-23255

Baremetal clusters installed with the agent installer are not skipping the first boot if they use FIPS

Details

    Description

      Description of problem:

      When a cluster is using FIPS in an installation with the agent installer, the reboot in the machine-config-daemon-firstboot.service is not skipped.
      
      Since MCO-706 (https://issues.redhat.com/browse/MCO-706), the agent installer should be able to skip the firstboot service reboot.
      
       

      Version-Release number of selected component (if applicable):

      4.15
       

      How reproducible:

      Always
       

      Steps to Reproduce:

      1. Run these Prow jobs to install a cluster:
      
      without fips (HA): periodic-ci-openshift-openshift-tests-private-release-4.15-amd64-nightly-baremetal-pxe-ha-agent-ipv4-static-connected-f14
      
      with fips (SNO):  periodic-ci-openshift-openshift-tests-private-release-4.15-amd64-nightly-baremetal-sno-agent-ipv4-static-connected-f7
      
      
      We can find the firstboot service's logs in the must-gather.tar file.
      
      

      Actual results:

      In the machine-config-daemon-firstboot.service logs we can see that the reboot is not skipped when the installation is using fips=true.
      
      You can find the logs in the "additional info" section below.
      
       

      Expected results:

      The firstboot service should skip the reboot in the installation.
       

      Additional info:

      These are the machine-config-daemon-firstboot logs for a baremetal HA cluster with FIPS, installed using the agent installer (FIRST REBOOT NOT SKIPPED):
      
      
      Nov 14 11:26:59 worker-00 systemd[1]: Starting Machine Config Daemon Firstboot...
      Nov 14 11:26:59 worker-00 sh[4182]: sed: can't read /etc/yum.repos.d/*.repo: No such file or directory
      Nov 14 11:26:59 worker-00 podman[4183]: W1114 11:26:59.393738       1 daemon.go:1673] Failed to persist NIC names: open /rootfs/etc/systemd/network: no such file or directory
      Nov 14 11:26:59 worker-00 podman[4296]: I1114 11:26:59.866300    4348 daemon.go:457] container is rhel8, target is rhel9
      Nov 14 11:26:59 worker-00 podman[4296]: I1114 11:26:59.896550    4348 daemon.go:525] Invoking re-exec /run/bin/machine-config-daemon
      Nov 14 11:26:59 worker-00 podman[4296]: I1114 11:26:59.955660    4348 update.go:2120] Running: systemctl daemon-reload
      Nov 14 11:27:00 worker-00 podman[4296]: I1114 11:27:00.537582    4348 rpm-ostree.go:88] Enabled workaround for bug 2111817
      Nov 14 11:27:00 worker-00 podman[4296]: I1114 11:27:00.537944    4348 rpm-ostree.go:263] Linking ostree authfile to /etc/mco/internal-registry-pull-secret.json
      Nov 14 11:27:00 worker-00 podman[4296]: I1114 11:27:00.833062    4348 daemon.go:270] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a9bdfdf95023b7aebbbc9d5d335c973832fceb795ed943f365fefea7db646b66 (415.92.202311130854-0) 67df227c04e9306ddcb78331654ecf0ebb2cb1433498f9c12e832c7d5e74c1d9
      Nov 14 11:27:00 worker-00 podman[4296]: I1114 11:27:00.833303    4348 rpm-ostree.go:308] Running captured: rpm-ostree --version
      Nov 14 11:27:00 worker-00 podman[4296]: I1114 11:27:00.893156    4348 daemon.go:1076] rpm-ostree has container feature
      Nov 14 11:27:00 worker-00 podman[4296]: I1114 11:27:00.893582    4348 rpm-ostree.go:308] Running captured: rpm-ostree kargs
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.008588    4348 update.go:2157] Adding SIGTERM protection
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.008821    4348 update.go:599] Checking Reconcilable for config mco-empty-mc to rendered-worker-ef30fce69107b4fc38dc1020038ebd6a
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.009121    4348 update.go:1064] FIPS is configured and enabled
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.009345    4348 update.go:2135] Starting update from mco-empty-mc to rendered-worker-ef30fce69107b4fc38dc1020038ebd6a: &{osUpdate:true kargs:true fips:false passwd:false files:false units:false kernelType:false extensions:false}
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.055403    4348 update.go:1349] Updating files
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.055415    4348 update.go:1412] Deleting stale data
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.055419    4348 update.go:1818] updating the permission of the kubeconfig to: 0o600
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.055484    4348 update.go:1784] Checking if absent users need to be disconfigured
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.055610    4348 update.go:2210] Already in desired image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a9bdfdf95023b7aebbbc9d5d335c973832fceb795ed943f365fefea7db646b66
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.055616    4348 update.go:2120] Running: rpm-ostree cleanup -p
      Nov 14 11:27:01 worker-00 podman[4296]: Deployments unchanged.
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.224788    4348 update.go:2135] Running rpm-ostree [kargs --append=systemd.unified_cgroup_hierarchy=1 --append=cgroup_no_v1="all" --append=psi=1]
      Nov 14 11:27:01 worker-00 podman[4296]: I1114 11:27:01.271647    4348 update.go:2120] Running: rpm-ostree kargs --append=systemd.unified_cgroup_hierarchy=1 --append=cgroup_no_v1="all" --append=psi=1
      Nov 14 11:27:03 worker-00 podman[4296]: Staging deployment...done
      Nov 14 11:27:05 worker-00 podman[4296]: Changes queued for next boot. Run "systemctl reboot" to start a reboot
      Nov 14 11:27:05 worker-00 podman[4296]: I1114 11:27:05.081854    4348 update.go:2135] Rebooting node
      Nov 14 11:27:05 worker-00 podman[4296]: I1114 11:27:05.127794    4348 update.go:2165] Removing SIGTERM protection
      Nov 14 11:27:05 worker-00 podman[4296]: I1114 11:27:05.127853    4348 update.go:2135] initiating reboot: Completing firstboot provisioning to rendered-worker-ef30fce69107b4fc38dc1020038ebd6a
      Nov 14 11:27:05 worker-00 podman[4296]: I1114 11:27:05.235062    4348 update.go:2135] reboot successful
      Nov 14 11:27:05 worker-00 systemd[1]: machine-config-daemon-firstboot.service: Main process exited, code=killed, status=15/TERM
      Nov 14 11:27:05 worker-00 systemd[1]: machine-config-daemon-firstboot.service: Failed with result 'signal'.
      Nov 14 11:27:05 worker-00 systemd[1]: Stopped Machine Config Daemon Firstboot.
      -- Boot 2f510f83bdb047bb921fc429d67b8e6a --
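
      The "Starting update from ..." line in the log above carries the diff that the daemon computed between mco-empty-mc and the rendered config. As a minimal sketch for triage, the snippet below extracts that diff and lists the fields set to true. The function names and the regular expression are illustrative assumptions based only on the log format shown in this report; they are not part of the machine-config-operator code.

```python
import re

# Assumption: the diff is logged as "Starting update from A to B: &{k:v k:v ...}",
# exactly as in the excerpt in this report. Other MCO versions may log it differently.
DIFF_RE = re.compile(r"Starting update from (\S+) to (\S+): &\{([^}]*)\}")


def parse_update_diff(line):
    """Return (old_config, new_config, {field: bool}), or None if no diff is present."""
    match = DIFF_RE.search(line)
    if match is None:
        return None
    old_config, new_config, body = match.groups()
    fields = {}
    for pair in body.split():
        name, _, value = pair.partition(":")
        fields[name] = (value == "true")
    return old_config, new_config, fields


def changed_fields(line):
    """List the diff fields that are true, in the order they were logged."""
    parsed = parse_update_diff(line)
    if parsed is None:
        return []
    return [name for name, flag in parsed[2].items() if flag]
```

      Applied to the line from the FIPS log, this reports osUpdate and kargs as changed while fips is false, which matches the "rpm-ostree kargs --append=..." call and the reboot that follow it.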
      
      
      
      
      These are the logs for a baremetal HA cluster without FIPS, installed using the agent installer (FIRST REBOOT SKIPPED):
      
      
      Nov 08 14:27:30 worker-00 systemd[1]: Starting Machine Config Daemon Firstboot...
      Nov 08 14:27:30 worker-00 sh[4171]: sed: can't read /etc/yum.repos.d/*.repo: No such file or directory
      Nov 08 14:27:30 worker-00 podman[4172]: W1108 14:27:30.970986       1 daemon.go:1673] Failed to persist NIC names: open /rootfs/etc/systemd/network: no such file or directory
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.172975    4320 daemon.go:457] container is rhel8, target is rhel9
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.202238    4320 daemon.go:525] Invoking re-exec /run/bin/machine-config-daemon
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.237492    4320 update.go:2120] Running: systemctl daemon-reload
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.436217    4320 rpm-ostree.go:88] Enabled workaround for bug 2111817
      Nov 08 14:27:31 worker-00 podman[4273]: E1108 14:27:31.436346    4320 rpm-ostree.go:285] Merged secret file could not be validated; defaulting to cluster pull secret <nil>
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.436375    4320 rpm-ostree.go:263] Linking ostree authfile to /var/lib/kubelet/config.json
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.555415    4320 daemon.go:270] Booted osImageURL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e03c9248f78a107efb8b12430d46304e8d93981d23fd932e159d518ed675bc92 (415.92.202311061558-0) b8e1dca18619a2e497edf5346d5018615a226da380989ef6720a1a8cdc27adeb
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.555920    4320 rpm-ostree.go:308] Running captured: rpm-ostree --version
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.571985    4320 daemon.go:1076] rpm-ostree has container feature
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.572484    4320 rpm-ostree.go:308] Running captured: rpm-ostree kargs
      Nov 08 14:27:31 worker-00 podman[4273]: I1108 14:27:31.600313    4320 update.go:186] No changes from mco-empty-mc to rendered-worker-30da1eef7a5d361fc395f2726c8210d5
      Nov 08 14:27:31 worker-00 systemd[1]: Finished Machine Config Daemon Firstboot.
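
      The two excerpts above differ in a way that can be checked mechanically: the FIPS run logs "Rebooting node" and "initiating reboot", while the non-FIPS run logs "No changes from ...". The sketch below classifies a firstboot journal excerpt on that basis. The marker strings are taken from the logs in this report, and treating them as stable across MCO versions is an assumption; the function name is hypothetical.

```python
# Assumption: these substrings, copied from the excerpts in this report, are
# sufficient to tell the two outcomes apart.
REBOOT_MARKERS = ("Rebooting node", "initiating reboot")
SKIP_MARKER = "No changes from"


def firstboot_outcome(log_text):
    """Classify a firstboot journal excerpt as 'rebooted', 'skipped', or 'unknown'."""
    lines = log_text.splitlines()
    # A reboot marker wins over a skip marker if both somehow appear.
    if any(marker in line for line in lines for marker in REBOOT_MARKERS):
        return "rebooted"
    if any(SKIP_MARKER in line for line in lines):
        return "skipped"
    return "unknown"
```

      To use it, pass in the machine-config-daemon-firstboot.service journal text extracted from the must-gather archive, one node at a time.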
       


            People

              oamizur Ori Amizur
              sregidor@redhat.com Sergio Regidor de la Rosa
              Sergio Regidor de la Rosa Sergio Regidor de la Rosa