RHEL-19569

[rhc] Replace token.Wait() with token.WaitTimeout(timeout)

    • Bug
    • Resolution: Done-Errata
    • Critical
    • rhel-9.4
    • rhel-9.3.0
    • rhc
    • rhel-sst-insights

      What were you trying to do that didn't work?

      Ran into a situation where rhc was reporting that it was connected, but cloud-connector was not showing it as connected.

      The rhc logs showed that the connection had been dropped:

       

      Sep 15 02:04:23 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: [rhcd] 2023/09/15 02:04:23 connection lost unexpectedly: pingresp not received, disconnecting  

      Please provide the package NVR for which bug is seen:

      rhc 0.2 

      How reproducible:

      Steps to reproduce

      1. https://issues.redhat.com/browse/CCT-123 

      Expected results

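      rhc should not block indefinitely while publishing its connection status. An alternative approach could be to use token.WaitTimeout(), so that the wait returns after a bounded time.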

      Actual results

      According to the cloud-connector logs, cloud-connector received the offline message but it never received an online message.

      I killed the rhc process to take a look at the goroutine stack traces and found the following:

      Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: goroutine 840 [chan receive, 722 minutes]:
      Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: github.com/eclipse/paho%2emqtt%2egolang.(*baseToken).Wait(0xc000078001)
      Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]:         /builddir/build/BUILD/rhc/yggdrasil-0.2.1/vendor/github.com/eclipse/paho.mqtt.golang/token.go:73 +0x1f
      Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: main.publishConnectionStatus({0x55b80674aba8, 0xc00013e900}, 0xc000335410)
      Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]:         /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/mqtt.go:128 +0x4bb
      Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]: created by main.main.func2.2
      Sep 15 14:05:22 ci-vm-10-0-150-55.hosted.upshift.rdu2.redhat.com rhcd[330778]:         /builddir/build/BUILD/rhc/yggdrasil-0.2.1/cmd/yggd/main.go:250 +0x405
      

      It looks like rhc tried to send the connection status message (the online message), but the publish hung on the token.Wait() call at https://github.com/RedHatInsights/yggdrasil/blob/yggdrasil-0.2/cmd/yggd/mqtt.go#L128. token.Wait() has no timeout, so it can block indefinitely; in this case the goroutine had been blocked for 722 minutes.

              ldupont@redhat.com Link Dupont
              redakkan@redhat.com Rehana Raj Edakandiyil
              Link Dupont
              Qianqian Zhang