-
Bug
-
Resolution: Done-Errata
-
Critical
-
rhel-9.3.0
-
None
-
None
-
None
-
rhel-sst-insights
-
None
-
False
-
-
None
-
None
-
Pass
-
None
-
None
What were you trying to do that didn't work?
Ran into a situation where rhc was reporting that it was connected, but cloud-connector was not showing it as connected.
The rhc logs showed that the connection had been dropped:
Sep 15 02:04:23 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: [rhcd] 2023/09/15 02:04:23 connection lost unexpectedly: pingresp not received, disconnecting
Please provide the package NVR for which bug is seen:
rhc 0.2
How reproducible:
Steps to reproduce
Expected results
An alternative approach could be to use token.WaitTimeout().
Actual results
According to the cloud-connector logs, cloud-connector received the offline message but it never received an online message.
I killed the rhc process to take a look at the goroutine stack traces and found the following:
Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: goroutine 840 [chan receive, 722 minutes]:
Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: github.com/eclipse/paho%2emqtt%2egolang.(*baseToken).Wait(0xc000078001)
Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: /builddir/build/BUILD/rhc/yggdrasil-0.2.1/vendor/github.com/eclipse/paho.mqtt.golang/token.go:73 +0x1f
Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: main.publishConnectionStatus({0x55b80674aba8, 0xc00013e900}, 0xc000335410)
Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/mqtt.go:128 +0x4bb
Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: created by main.main.func2.2
Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:250 +0x405
It looks like rhc tried to send the connection status message (online message), but it hung on the token.Wait() call here https://github.com/RedHatInsights/yggdrasil/blob/yggdrasil-0.2/cmd/yggd/mqtt.go#L128 . This call can hang indefinitely. Here it appears to have hung for 722 minutes.
- links to
-
RHBA-2024:127866 rhc update