OpenShift Bugs: OCPBUGS-8474

Undiagnosed panic in cloud-provider-azure pod

    Description

      Description of problem:

      The Azure cloud controller manager (CCM) panics when it loses its leader election lease. Other components handle a lost lease by exiting intentionally rather than panicking.
      
      See https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-azure-modern/1632791244243472384
      
      

      Version-Release number of selected component (if applicable):

      
      

      How reproducible:

      Force the CCM to lose its leader election lease. This can happen during upgrades.
      

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

      The code panics, e.g.:
      
      E0306 18:09:14.315039       1 runtime.go:77] Observed a panic: leaderelection lost
      goroutine 1 [running]:
      k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1adc660?, 0x219b9c0})
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:75 +0x99
      k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x81e22e?})
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:49 +0x75
      panic({0x1adc660, 0x219b9c0})
      	/usr/lib/golang/src/runtime/panic.go:884 +0x212
      sigs.k8s.io/cloud-provider-azure/cmd/cloud-controller-manager/app.NewCloudControllerManagerCommand.func1.1()
      	/go/src/github.com/openshift/cloud-provider-azure/cmd/cloud-controller-manager/app/controllermanager.go:138 +0x27
      k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run.func1()
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:203 +0x1f
      k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run(0xc0002c0d80, {0x21bce08, 0xc0001ac008})
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:213 +0x14d
      k8s.io/client-go/tools/leaderelection.RunOrDie({0x21bce08, 0xc0001ac008}, {{0x21c0e00, 0xc0002c0c60}, 0x1fe5d61a00, 0x18e9b26e00, 0x60db88400, {0xc000418080, 0x1fc4978, 0x0}, ...})
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:226 +0x94
      sigs.k8s.io/cloud-provider-azure/cmd/cloud-controller-manager/app.NewCloudControllerManagerCommand.func1(0xc000170000?, {0x1ea43e2?, 0xd?, 0xd?})
      	/go/src/github.com/openshift/cloud-provider-azure/cmd/cloud-controller-manager/app/controllermanager.go:130 +0x3a7
      github.com/spf13/cobra.(*Command).execute(0xc000170000, {0xc00019e010, 0xd, 0xd})
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/github.com/spf13/cobra/command.go:876 +0x67b
      github.com/spf13/cobra.(*Command).ExecuteC(0xc000170000)
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/github.com/spf13/cobra/command.go:990 +0x3bd
      github.com/spf13/cobra.(*Command).Execute(...)
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/github.com/spf13/cobra/command.go:918
      main.main()
      	/go/src/github.com/openshift/cloud-provider-azure/cmd/cloud-controller-manager/controller-manager.go:47 +0xc5
      panic: leaderelection lost [recovered]
      	panic: leaderelection lost
      
      goroutine 1 [running]:
      k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x81e22e?})
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:56 +0xd7
      panic({0x1adc660, 0x219b9c0})
      	/usr/lib/golang/src/runtime/panic.go:884 +0x212
      sigs.k8s.io/cloud-provider-azure/cmd/cloud-controller-manager/app.NewCloudControllerManagerCommand.func1.1()
      	/go/src/github.com/openshift/cloud-provider-azure/cmd/cloud-controller-manager/app/controllermanager.go:138 +0x27
      k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run.func1()
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:203 +0x1f
      k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run(0xc0002c0d80, {0x21bce08, 0xc0001ac008})
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:213 +0x14d
      k8s.io/client-go/tools/leaderelection.RunOrDie({0x21bce08, 0xc0001ac008}, {{0x21c0e00, 0xc0002c0c60}, 0x1fe5d61a00, 0x18e9b26e00, 0x60db88400, {0xc000418080, 0x1fc4978, 0x0}, ...})
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:226 +0x94
      sigs.k8s.io/cloud-provider-azure/cmd/cloud-controller-manager/app.NewCloudControllerManagerCommand.func1(0xc000170000?, {0x1ea43e2?, 0xd?, 0xd?})
      	/go/src/github.com/openshift/cloud-provider-azure/cmd/cloud-controller-manager/app/controllermanager.go:130 +0x3a7
      github.com/spf13/cobra.(*Command).execute(0xc000170000, {0xc00019e010, 0xd, 0xd})
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/github.com/spf13/cobra/command.go:876 +0x67b
      github.com/spf13/cobra.(*Command).ExecuteC(0xc000170000)
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/github.com/spf13/cobra/command.go:990 +0x3bd
      github.com/spf13/cobra.(*Command).Execute(...)
      	/go/src/github.com/openshift/cloud-provider-azure/vendor/github.com/spf13/cobra/command.go:918
      main.main()
      	/go/src/github.com/openshift/cloud-provider-azure/cmd/cloud-controller-manager/controller-manager.go:47 +0xc5
      

      Expected results:

      The code should exit without panicking.
      

      Additional info:

      
      

          People

            joelspeed Joel Speed
            joelspeed Joel Speed
            Milind Yadav Milind Yadav