Type: Bug
Resolution: Done
Priority: Undefined
Fix Version/s: openshift-4.15
Affects Version/s: None
Component/s: None
Labels:
- microshift-no-backport

Story Points:
5
Blocked:
False
Blocked Reason:

None
Ready:
False
Intelligence Requested:
Market:

Sprint:
uShift Sprint 244, uShift Sprint 245, uShift Sprint 246

Target Version:

openshift-4.15

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

Description of problem:

Restarting MicroShift about 10s after a previous start or restart can cause MicroShift to fail because etcd is not shut down.

Version-Release number of selected component (if applicable):

current main

How reproducible:

50%

Steps to Reproduce:

1. Stop MicroShift
2. systemctl start microshift --no-block
3. Wait ~10s
4. systemctl restart microshift

sudo systemctl stop microshift ; sleep 3 ; sudo systemctl start microshift --no-block ; sleep 10; sudo systemctl restart microshift
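
Because the failure reproduces only about half the time, it can help to repeat the cycle and count failed restarts. The following is a minimal sketch of that idea, not a script from the MicroShift repository: the function names `repro_once` and `repro_loop` and the `STOP_WAIT` and `START_WAIT` variables are hypothetical, and the wait times are taken from the one-liner above.

```shell
#!/bin/sh
# Hypothetical helper: repeat the reproduction steps and count failed restarts.
# STOP_WAIT and START_WAIT are illustrative knobs; defaults match the one-liner.
STOP_WAIT="${STOP_WAIT:-3}"
START_WAIT="${START_WAIT:-10}"

# Run one stop/start/restart cycle.
# The return status is that of the final "systemctl restart".
repro_once() {
    sudo systemctl stop microshift
    sleep "$STOP_WAIT"
    sudo systemctl start microshift --no-block
    sleep "$START_WAIT"
    sudo systemctl restart microshift
}

# Run the cycle $1 times (default 10) and print "failures/attempts".
repro_loop() {
    rl_attempts="${1:-10}"
    rl_failures=0
    rl_i=1
    while [ "$rl_i" -le "$rl_attempts" ]; do
        if ! repro_once; then
            rl_failures=$((rl_failures + 1))
        fi
        rl_i=$((rl_i + 1))
    done
    echo "${rl_failures}/${rl_attempts}"
}
```

After sourcing the file, `repro_loop 10` runs ten cycles and prints the number of failed restarts out of ten.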

Actual results:

MicroShift fails to restart. systemd prints the following error:

Job for microshift.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status microshift.service" and "journalctl -xeu microshift.service" for details.

The MicroShift journal contains the following error:

etcd I1030 12:13:09.198188   20361 manager.go:116] Starting etcd
etcd I1030 12:13:09.198208   20361 etcd.go:93] starting etcd via systemd-run with args [--uid=root --scope --unit microshift-etcd --property Before=microshift.service --property BindsTo=microshift.service /usr/bin/microshift-etcd run]
Failed to start transient scope unit: Unit microshift-etcd.scope was already loaded or has a fragment file.
etcd W1030 12:13:09.204999   20361 etcd.go:110] etcd failed waiting on process to finish: exit status 1
etcd I1030 12:13:09.205012   20361 etcd.go:112] etcd process quit: exit status 1
etcd W1030 12:13:09.205017   20361 etcd.go:115] microshift-etcd process terminated prematurely, restarting MicroShift
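
The "already loaded" message suggests that the `microshift-etcd.scope` unit from the previous run still exists when the new one is created. As a diagnostic sketch only (this is not the fix from the linked pull requests), the unit's `LoadState` can be polled with `systemctl show` until it is no longer loaded. The function names `unit_load_state` and `wait_unit_unloaded` and the `POLL_INTERVAL` variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical diagnostic helper, not part of MicroShift.

# Print the LoadState of unit $1 (for example "loaded" or "not-found").
unit_load_state() {
    systemctl show --property=LoadState --value "$1"
}

# Wait for unit $1 to leave the "loaded" state, polling up to $2 times
# (default 30) with POLL_INTERVAL seconds (default 1) between polls.
# Returns 0 once the unit is no longer loaded, 1 on timeout.
wait_unit_unloaded() {
    wu_unit="$1"
    wu_left="${2:-30}"
    while [ "$(unit_load_state "$wu_unit")" = "loaded" ]; do
        [ "$wu_left" -gt 0 ] || return 1
        sleep "${POLL_INTERVAL:-1}"
        wu_left=$((wu_left - 1))
    done
    return 0
}
```

For example, `wait_unit_unloaded microshift-etcd.scope 30` could be run between stopping and starting MicroShift to see whether the scope lingers. A unit left in a failed state stays loaded until `systemctl reset-failed` is run, so a timeout here does not by itself show that etcd is still running.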

Expected results:

Restarting MicroShift soon after starting it should not fail.

Additional info:

Found when inspecting logs of a soon-to-be-reenabled metal periodic on ARM:
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/44918/rehearse-44918-periodic-ci-openshift-microshift-main-ocp-metal-nightly-arm/1717904044741103616

is duplicated by

USHIFT-1884 CI may fail when attempting to restart MicroShift

Closed

links to

openshift/microshift#2575: USHIFT-1814: Restart during startup robustness

openshift/microshift#2688: USHIFT-1814: Run cmd to stop etcd in its own process group

Assignee: Patryk Matuszak

Reporter: Patryk Matuszak

Votes: 0

Watchers: 2

Created: 2023/10/30 4:14 PM

Updated: 2024/03/21 10:02 AM

Resolved: 2023/12/18 2:20 PM
