Bug
Resolution: Not a Bug
Quality / Stability / Reliability
Description of problem:
On an EC2 machine running RHEL 9.2, I installed MicroShift by mostly following https://access.redhat.com/documentation/en-us/red_hat_build_of_microshift/4.14/html-single/installing/index (skipping only the LVM part), enabled and started it, and rebooted. A few minutes after the reboot, greenboot starts sending wall messages, and the same messages appear in the journal. However, those messages give no information about what the problem is or how to fix it.
Version-Release number of selected component (if applicable):
microshift-4.14.2-202311091609.p0.gd80d6de.assembly.4.14.2.el9.x86_64 greenboot-0.15.4-1.el9.x86_64
How reproducible:
Seems deterministic.
Steps to Reproduce:
1. Have a t2.medium EC2 instance with 20 GB of disk and RHEL 9.2 installed, and log in to it as root.
2. subscription-manager register --org ... --activationkey ...
3. subscription-manager config --rhsm.manage_repos=1
4. subscription-manager repos --enable rhocp-4.14-for-rhel-9-$(uname -m)-rpms --enable fast-datapath-for-rhel-9-$(uname -m)-rpms
5. dnf install -y microshift openshift-clients
6. Get the pull secret from https://console.redhat.com/openshift/install/pull-secret and paste it into cat > /etc/crio/openshift-pull-secret
7. chmod 600 /etc/crio/openshift-pull-secret
8. systemctl enable microshift
9. systemctl start microshift
10. export KUBECONFIG=/var/lib/microshift/resources/kubeadmin/kubeconfig
11. Wait for oc get all -A to report all pods and deployments as running, ready, and available.
12. Check that journalctl -l | grep greenboot does not report anything.
13. Reboot the machine and log back in to it.
14. Run journalctl -l | grep greenboot | sed 's/ip-.*\.internal //'
15. Wait ten seconds.
16. See what is on the terminal.
17. Run journalctl -l | grep greenboot | sed 's/ip-.*\.internal //' again.
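Steps 14 and 17 run the same pipeline twice. As a convenience, it could be wrapped in a small shell function. This is only a sketch: the function name greenboot_filter is hypothetical, and it reads journal text on stdin so that it also works on a saved log file.

```shell
# Hypothetical helper (not part of greenboot or MicroShift): keep only the
# journal lines that mention greenboot and strip the EC2 internal hostname,
# exactly as the pipeline in steps 14 and 17 does.
greenboot_filter() {
    grep greenboot | sed 's/ip-.*\.internal //'
}

# Usage on the live journal (as root):
#   journalctl -l | greenboot_filter
# Usage on a saved log (file name is an example):
#   greenboot_filter < journal.txt
```

Note that the grep keeps only lines containing the string greenboot, so any output that a health check script logs without that string is dropped by this filter.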
Actual results:
The first journalctl -l | grep greenboot | sed 's/ip-.*\.internal //' after reboot:
Nov 17 12:33:38 systemd[1]: Starting greenboot Health Checks Runner...
Nov 17 12:33:38 greenboot[638]: Running Required Health Check Scripts...
Nov 17 12:33:38 00_required_scripts_start.sh[649]: Running greenboot Required Health Check Scripts
Nov 17 12:33:38 greenboot[638]: Script '00_required_scripts_start.sh' SUCCESS
Nov 17 12:33:38 greenboot[638]: Running Wanted Health Check Scripts...
Nov 17 12:33:38 00_wanted_scripts_start.sh[657]: Running greenboot Wanted Health Check Scripts
Nov 17 12:33:38 greenboot[638]: Script '00_wanted_scripts_start.sh' SUCCESS
Nov 17 12:33:38 greenboot[638]: Running Required Health Check Scripts...

The messages on the terminal:
Broadcast message from systemd-journald@ip-172-31-84-217.ec2.internal (Fri 2023-11-17 12:39:45 UTC):
greenboot[638]: Script '40_microshift_running_check.sh' FAILURE (exit code '1'). Continuing...
Broadcast message from systemd-journald@ip-172-31-84-217.ec2.internal (Fri 2023-11-17 12:39:45 UTC):
greenboot[9294]: Boot Status is RED - Health Check FAILURE!
Broadcast message from systemd-journald@ip-172-31-84-217.ec2.internal (Fri 2023-11-17 12:39:45 UTC):
redboot-auto-reboot[9315]: SYSTEM is UNHEALTHY, but boot_counter is unset in grubenv. Manual intervention necessary.
Message from syslogd@ip-172-31-84-217 at Nov 17 12:39:45 ...
greenboot[638]:Script '40_microshift_running_check.sh' FAILURE (exit code '1'). Continuing...
Message from syslogd@ip-172-31-84-217 at Nov 17 12:39:45 ...
greenboot[9294]:Boot Status is RED - Health Check FAILURE!
Message from syslogd@ip-172-31-84-217 at Nov 17 12:39:45 ...
redboot-auto-reboot[9315]:SYSTEM is UNHEALTHY, but boot_counter is unset in grubenv. Manual intervention necessary.

The second journalctl -l | grep greenboot | sed 's/ip-.*\.internal //' adds:
Nov 17 12:39:45 greenboot[638]: Script '40_microshift_running_check.sh' FAILURE (exit code '1'). Continuing...
Nov 17 12:39:45 systemd[1]: greenboot-healthcheck.service: Main process exited, code=exited, status=1/FAILURE
Nov 17 12:39:45 systemd[1]: greenboot-healthcheck.service: Failed with result 'exit-code'.
Nov 17 12:39:45 systemd[1]: Failed to start greenboot Health Checks Runner.
Nov 17 12:39:45 systemd[1]: Dependency failed for greenboot Success Scripts Runner.
Nov 17 12:39:45 systemd[1]: greenboot-task-runner.service: Job greenboot-task-runner.service/start failed with result 'dependency'.
Nov 17 12:39:45 systemd[1]: greenboot-grub2-set-success.service: Job greenboot-grub2-set-success.service/start failed with result 'dependency'.
Nov 17 12:39:45 systemd[1]: greenboot-healthcheck.service: Triggering OnFailure= dependencies.
Nov 17 12:39:45 systemd[1]: greenboot-healthcheck.service: Consumed 48.700s CPU time.
Nov 17 12:39:45 systemd[1]: Starting greenboot Failure Scripts Runner...
Nov 17 12:39:45 greenboot[9294]: Boot Status is RED - Health Check FAILURE!
Nov 17 12:39:45 greenboot[9294]: Running Red Scripts...
Nov 17 12:39:45 greenboot[9294]: Script '40_microshift_pre_rollback.sh' SUCCESS
Nov 17 12:39:45 systemd[1]: Finished greenboot Failure Scripts Runner.
Nov 17 12:39:45 systemd[1]: Starting greenboot MotD Generator...
Nov 17 12:39:45 greenboot-status[9325]: Script '40_microshift_running_check.sh' FAILURE (exit code '1'). Continuing...
Nov 17 12:39:45 greenboot-status[9325]: Boot Status is RED - Health Check FAILURE!
Nov 17 12:39:45 greenboot-status[9325]: SYSTEM is UNHEALTHY, but boot_counter is unset in grubenv. Manual intervention necessary.
Nov 17 12:39:45 systemd[1]: Finished greenboot MotD Generator.
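To summarize which health check scripts failed in output like the above, the FAILURE lines can be reduced to script name and exit code. The following is a minimal sketch, assuming the message format stays exactly as shown in this report; the function name greenboot_failed_scripts is hypothetical.

```shell
# Hypothetical helper: read journal text on stdin and print one
# "script-name exit-code" pair per failed greenboot script, de-duplicated.
# It relies on the message format seen in this report:
#   Script '<name>' FAILURE (exit code '<code>')
greenboot_failed_scripts() {
    sed -n "s/.*Script '\([^']*\)' FAILURE (exit code '\([^']*\)').*/\1 \2/p" | sort -u
}

# Usage on the live journal (as root):
#   journalctl -l | greenboot_failed_scripts
```

This only names the failing script. The reason for the failure would have to come from that script's own output; assuming it is logged under the same systemd unit, journalctl -u greenboot-healthcheck.service may show it.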
Expected results:
No error, or clear information about what "boot_counter is unset in grubenv" means and what kind of manual intervention is necessary.
Additional info:
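Regarding the "boot_counter is unset in grubenv" message: the GRUB environment can be listed with grub2-editenv list, which prints name=value pairs. A small sketch for checking whether boot_counter is present in that output follows; the function name boot_counter_status is hypothetical, and the sketch only reports the state, it does not explain or change it.

```shell
# Hypothetical helper: read the output of `grub2-editenv list` on stdin
# (one name=value pair per line) and report whether boot_counter is set.
boot_counter_status() {
    if grep -q '^boot_counter='; then
        echo "boot_counter is set"
    else
        echo "boot_counter is unset"
    fi
}

# Usage (as root):
#   grub2-editenv list | boot_counter_status
```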
- is related to: USHIFT-1667 Port MicroShift health check to Go (Closed)