OCPBUGS-20266

[AWS] Unit tests have deadlock condition in termination handler

    • CLOUD Sprint 243
    • Release Note Not Required

      Description of problem:

      Because of the way the termination handler unit tests are configured, the counter of HTTP requests to the mock handler can, in some cases, cause a test to deadlock and time out. The failure appears random because the ordering of the tests affects when the bug occurs.
      

      Version-Release number of selected component (if applicable):

      4.13+
      

      How reproducible:

      It happens randomly when run in CI or when the full suite is run, but it happens every time when the tests are focused.
      Focusing on "poll URL cannot be reached" reliably triggers the deadlock.
      

      Steps to Reproduce:

      1. add `-focus "poll URL cannot be reached"` to unit test ginkgo arguments
      2. run `make unit`
      

      Actual results:

      test suite hangs after this output:
      "Handler Suite when running the handler when polling the termination endpoint and the poll URL cannot be reached should return an error /home/mike/dev/machine-api-provider-aws/pkg/termination/handler_test.go:197"
      

      Expected results:

      Tests pass
      

      Additional info:

      To fix this, we need to isolate the test in its own Context block. This patch should do the trick:
      
      diff --git a/pkg/termination/handler_test.go b/pkg/termination/handler_test.go
      index 2b98b08b..0f85feae 100644
      --- a/pkg/termination/handler_test.go
      +++ b/pkg/termination/handler_test.go
      @@ -187,7 +187,9 @@ var _ = Describe("Handler Suite", func() {
                                              Consistently(nodeMarkedForDeletion(testNode.Name)).Should(BeFalse())
                                      })
                              })
      +               })
       
      +               Context("when the termination endpoint is not valid", func() {
                              Context("and the poll URL cannot be reached", func() {
                                      BeforeEach(func() {
                                              nonReachable := "abc#1://localhost"
      
      

            Michael McCune (mimccune@redhat.com)
            Zhaohua Sun