- Bug
- Resolution: Done
Description of problem:
Restarting MicroShift about 10 seconds after starting it can cause MicroShift to fail, because etcd is not shut down before the new instance tries to start it again.
Version-Release number of selected component (if applicable):
current main
How reproducible:
50%
Steps to Reproduce:
1. Stop MicroShift
2. systemctl start microshift --no-block
3. Wait ~10s
4. systemctl restart microshift

sudo systemctl stop microshift ; sleep 3 ; sudo systemctl start microshift --no-block ; sleep 10 ; sudo systemctl restart microshift
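Because the failure is intermittent, it can help to repeat the steps above and count how often the final restart fails. The following is a minimal sketch, not part of the original report: the names `ATTEMPTS`, `START_WAIT`, `STOP_WAIT`, `SYSTEMCTL`, `restart_cycle`, and `count_failures` are hypothetical, and the wait times simply mirror the command above.

```shell
#!/bin/sh
# Hypothetical reproducer loop. ATTEMPTS, START_WAIT, STOP_WAIT and SYSTEMCTL
# are illustrative knobs, not something defined by MicroShift.
ATTEMPTS="${ATTEMPTS:-10}"        # number of stop/start/restart cycles
START_WAIT="${START_WAIT:-10}"    # seconds between start and restart
STOP_WAIT="${STOP_WAIT:-3}"       # seconds between stop and start
SYSTEMCTL="${SYSTEMCTL:-sudo systemctl}"

# Run one cycle; the return status is that of the final restart.
restart_cycle() {
    $SYSTEMCTL stop microshift
    sleep "$STOP_WAIT"
    $SYSTEMCTL start microshift --no-block
    sleep "$START_WAIT"
    $SYSTEMCTL restart microshift
}

# Repeat the cycle ATTEMPTS times and print how many restarts failed.
count_failures() {
    failures=0
    i=1
    while [ "$i" -le "$ATTEMPTS" ]; do
        restart_cycle || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "failed restarts: ${failures}/${ATTEMPTS}"
}
```

Calling `count_failures` after defining these functions runs the loop. With the defaults, each cycle takes at least 13 seconds plus the time systemd needs to stop and restart the service.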
Actual results:
MicroShift fails to restart. systemd prints the error:
Job for microshift.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status microshift.service" and "journalctl -xeu microshift.service" for details.
The following error appears in the MicroShift journal:
etcd I1030 12:13:09.198188 20361 manager.go:116] Starting etcd
etcd I1030 12:13:09.198208 20361 etcd.go:93] starting etcd via systemd-run with args [--uid=root --scope --unit microshift-etcd --property Before=microshift.service --property BindsTo=microshift.service /usr/bin/microshift-etcd run]
Failed to start transient scope unit: Unit microshift-etcd.scope was already loaded or has a fragment file.
etcd W1030 12:13:09.204999 20361 etcd.go:110] etcd failed waiting on process to finish: exit status 1
etcd I1030 12:13:09.205012 20361 etcd.go:112] etcd process quit: exit status 1
etcd W1030 12:13:09.205017 20361 etcd.go:115] microshift-etcd process terminated prematurely, restarting MicroShift
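The journal shows that `microshift-etcd.scope` still exists when the new MicroShift instance tries to create it. As a diagnostic aid only (this is not the fix applied for this bug), the sketch below polls `systemctl is-active` until the scope is no longer active. It assumes that "active" is a usable proxy for "still loaded", which may not hold in every case. The names `SCOPE_UNIT`, `MAX_POLLS`, `POLL_INTERVAL`, `SYSTEMCTL`, and `wait_for_scope_inactive` are hypothetical.

```shell
#!/bin/sh
# Hypothetical diagnostic helper; only the unit name comes from the journal.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"
SCOPE_UNIT="${SCOPE_UNIT:-microshift-etcd.scope}"
MAX_POLLS="${MAX_POLLS:-30}"          # give up after this many checks
POLL_INTERVAL="${POLL_INTERVAL:-1}"   # seconds between checks

# Poll until the scope unit is no longer active, or give up after MAX_POLLS.
wait_for_scope_inactive() {
    polls=0
    # "is-active --quiet" exits 0 only while the unit is active.
    while $SYSTEMCTL is-active --quiet "$SCOPE_UNIT"; do
        if [ "$polls" -ge "$MAX_POLLS" ]; then
            echo "timeout: $SCOPE_UNIT still active after $polls polls"
            return 1
        fi
        sleep "$POLL_INTERVAL"
        polls=$((polls + 1))
    done
    echo "$SCOPE_UNIT is not active"
}
```

Running `wait_for_scope_inactive` between the stop and the next start shows whether the etcd scope outlives the MicroShift service in a given run.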
Expected results:
Restarting MicroShift soon after starting it should not fail.
Additional info:
Found when inspecting logs of a soon-to-be-reenabled metal periodic on ARM: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/44918/rehearse-44918-periodic-ci-openshift-microshift-main-ocp-metal-nightly-arm/1717904044741103616
- is duplicated by
- USHIFT-1884 CI may fail when attempting to restart MicroShift
- Closed