-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.16
-
Low
-
None
-
False
-
Description of problem:
When we scale up a new machineset to create a new node using a 4.1 boot image, the machine-config-daemon-firstboot service intermittently reports a panic. The panic does not prevent the node from joining the cluster.
Version-Release number of selected component (if applicable):
IPI on AWS
Version:
$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.16.0-0.nightly-2024-07-10-022831   True        False         5h50m   Cluster version is 4.16.0-0.nightly-2024-07-10-022831
How reproducible:
Intermittent
Steps to Reproduce:
1. Create a machineset using a 4.1 cloud image
2. Scale the machineset to create a new worker node
3. When the worker node is added, check the machine-config-daemon-firstboot service
Actual results:
Intermittently, machine-config-daemon-firstboot service will report a panic like this one:
Jul 10 10:26:44 ip-10-0-15-193 podman[1435]: I0710 10:26:44.601164 1472 update.go:2618] Running: systemd-run --unit machine-config-daemon-update-rpmostree-via-container -p EnvironmentFile=-/etc/mco/proxy.env --collect --wait -- podman run --env-file /etc/mco/proxy.env --privileged --pid=host --net=host --rm -v /:/run/host quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:77d490c385a99006dfa39460a2266b88a897572177fed886cdab1c3a1447f3ef rpm-ostree ex deploy-from-self /run/host
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: I0710 10:27:14.765820 1472 update.go:2618] Running: setenforce 1
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: time="2024-07-10T10:27:14Z" level=error msg="Error forwarding signal 15 to container f7493cc548e12405d132c677905a0bce68e1cc2377f2e28734e635d637479816: container has already been removed"
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: time="2024-07-10T10:27:14Z" level=error msg="Error forwarding signal 18 to container f7493cc548e12405d132c677905a0bce68e1cc2377f2e28734e635d637479816: container has already been removed"
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: panic: close of closed channel
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: goroutine 61 [running]:
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: panic(0x55d72326c8c0, 0x55d7233d96c0)
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /usr/lib/golang/src/runtime/panic.go:556 +0x2cf fp=0xc000236ec0 sp=0xc000236e30 pc=0x55d721ebcfdf
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: runtime.closechan(0xc0002e80c0)
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /usr/lib/golang/src/runtime/chan.go:335 +0x260 fp=0xc000236f10 sp=0xc000236ec0 pc=0x55d721e976f0
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: github.com/containers/libpod/vendor/github.com/docker/docker/pkg/signal.StopCatch(0xc0002e80c0)
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /builddir/build/BUILD/libpod-96ccc2edf597a191fe03eff98b2905788a26553f/_build/src/github.com/containers/libpod/vendor/github.com/docker/docker/pkg/signal/signal.go:26 +0x3b fp=0xc000236f28 sp=0xc000236f10 pc=0x55d7227310fb
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: main.ProxySignals.func1(0xc0002e80c0, 0xc0001f1bc0)
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /builddir/build/BUILD/libpod-96ccc2edf597a191fe03eff98b2905788a26553f/_build/src/github.com/containers/libpod/cmd/podman/sigproxy.go:28 +0x1fa fp=0xc000236fd0 sp=0xc000236f28 pc=0x55d722b4affa
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: runtime.goexit()
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /usr/lib/golang/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc000236fd8 sp=0xc000236fd0 pc=0x55d721eeb871
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: created by main.ProxySignals
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /builddir/build/BUILD/libpod-96ccc2edf597a191fe03eff98b2905788a26553f/_build/src/github.com/containers/libpod/cmd/podman/sigproxy.go:18 +0xa5
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: goroutine 1 [syscall]:
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: syscall.Syscall(0xa6, 0xc000205a40, 0x0, 0x0, 0x0, 0x55d7220024ad, 0xc0002ee200)
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /usr/lib/golang/src/syscall/asm_linux_amd64.s:18 +0x5 fp=0xc000174d38 sp=0xc000174d30 pc=0x55d721f05985
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: github.com/containers/libpod/vendor/golang.org/x/sys/unix.Unmount(0xc0003144b0, 0x23, 0x0, 0x55d72200197c, 0xc000205980)
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /builddir/build/BUILD/libpod-96ccc2edf597a191fe03eff98b2905788a26553f/_build/src/github.com/containers/libpod/vendor/golang.org/x/sys/unix/zsyscall_linux_amd64.go:1299 +0x8c fp=0xc000174d98 sp=0xc000174d38 pc=0x55d721ffbd9c
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: github.com/containers/libpod/vendor/github.com/containers/storage/pkg/mount.unmount(0xc0003144b0, 0x23, 0x0, 0x22, 0x21)
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /builddir/build/BUILD/libpod-96ccc2edf597a191fe03eff98b2905788a26553f/_build/src/github.com/containers/libpod/vendor/github.com/containers/storage/pkg/mount/mounter_linux.go:56 +0x41 fp=0xc000174dd0 sp=0xc000174d98 pc=0x55d722002261
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: github.com/containers/libpod/vendor/github.com/containers/storage/pkg/mount.ForceUnmount(0xc0003144b0, 0x23, 0x1, 0x0)
Jul 10 10:27:14 ip-10-0-15-193 podman[1435]: /builddir/build/BUILD/libpod-96ccc2edf597a191fe03eff98b2905788a26553f/_build/src/github.com/containers/libpod/vendor/github.com/containers/storage/pkg/mount/mount.go:100 +0x64 fp=0xc000174e10 sp=0xc000174dd0 pc=0x55d722001fb4
Expected results:
No panic should occur.
Additional info:
The panic does not break functionality: the node is rebooted and joins the cluster without problems.
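For reference, the trace shows runtime.closechan being reached through signal.StopCatch from a goroutine created by main.ProxySignals, and Go panics with "close of closed channel" whenever close is called twice on the same channel. The sketch below only illustrates that failure class and one common guard (sync.Once). It is not the podman code and not a proposed patch; closeTwice, closeTwiceGuarded, and onceCloser are hypothetical names introduced here for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// closeTwice closes the same channel twice and returns the recovered
// panic message, reproducing the failure class seen in the trace.
func closeTwice() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	ch := make(chan struct{})
	close(ch)
	close(ch) // the second close panics
	return "no panic"
}

// onceCloser is a hypothetical helper (not part of podman or libpod)
// that makes closing a channel idempotent.
type onceCloser struct {
	once sync.Once
	ch   chan struct{}
}

// Close closes the wrapped channel at most once, however often it is called.
func (c *onceCloser) Close() {
	c.once.Do(func() { close(c.ch) })
}

// closeTwiceGuarded performs the same double close through onceCloser.
func closeTwiceGuarded() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	c := &onceCloser{ch: make(chan struct{})}
	c.Close()
	c.Close() // no-op: the channel is already closed
	return "no panic"
}

func main() {
	fmt.Println(closeTwice())
	fmt.Println(closeTwiceGuarded())
}
```

Whether a guard of this kind applies to the podman version shipped in the 4.1 boot image is an assumption that would need to be checked against the libpod source at the commit named in the trace.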
- is related to
-
OCPBUGS-28974 Machine stuck in Provisioned when the cluster is upgraded from 4.1 to 4.15
- Closed