WFLY-19519: Intermittent failures in JMSQueueManagementTestCase.removeJMSQueueRemovesAllMessages


      https://ci.wildfly.org/project.html?projectId=WF&buildTypeId=&tab=testDetails&testNameId=-8166735170741148360&order=TEST_STATUS_DESC&branch_WF=__all_branches__&itemsCount=50

      java.lang.AssertionError: expected:<1> but was:<2>
      	at org.junit.Assert.fail(Assert.java:89)
      	at org.junit.Assert.failNotEquals(Assert.java:835)
      	at org.junit.Assert.assertEquals(Assert.java:647)
      	at org.junit.Assert.assertEquals(Assert.java:633)
      	at org.jboss.as.test.integration.messaging.mgmt.JMSQueueManagementTestCase.removeJMSQueueRemovesAllMessages(JMSQueueManagementTestCase.java:501)
      

      What's odd is the queue is scoped to a single test method, and removeJMSQueueRemovesAllMessages only sends a single message. So having a value of '2' indicates a flaw of some sort in the count-messages logic.

      I suspect it's a race affecting getMessageCount() in Artemis's QueueImpl:

         public long getMessageCount() {
            if (pageSubscription != null) {
               // messageReferences will have depaged messages which we need to discount from the counter as they are
               // counted on the pageSubscription as well
               long returnValue = (long) pendingMetrics.getNonPagedMessageCount() + scheduledDeliveryHandler.getNonPagedScheduledCount() + deliveringMetrics.getNonPagedMessageCount() + pageSubscription.getMessageCount();
               if (logger.isDebugEnabled()) {
                  logger.debug("Queue={}/{} returning getMessageCount \n\treturning {}. \n\tpendingMetrics.getMessageCount() = {}, \n\tgetScheduledCount() = {}, \n\tpageSubscription.getMessageCount()={}, \n\tpageSubscription.getCounter().getValue()={}",
                              name, id, returnValue, pendingMetrics.getMessageCount(),  scheduledDeliveryHandler.getNonPagedScheduledCount(), pageSubscription.getMessageCount(), pageSubscription.getCounter().getValue());
               }
               return returnValue;
            } else {
               return (long) pendingMetrics.getMessageCount() + getScheduledCount() + getDeliveringCount();
            }
         }
      

      That logic sums counts read from several separate sources, so if it is called while a message is being moved from one source to another (say, from pending to delivering), the same message may be counted twice.
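To illustrate the kind of double count suspected here, below is a minimal, deterministic sketch. It is not Artemis code: the two counters, the `handOff` method, and the order in which they are updated are all assumptions made for the example. It only shows that summing two independently updated counters can report 2 for a single message if the read lands between the two updates.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class DoubleCountSketch {
    // Hypothetical stand-ins for two of the counters summed by getMessageCount().
    static final AtomicInteger pending = new AtomicInteger();
    static final AtomicInteger delivering = new AtomicInteger();

    // Reads each counter separately; nothing makes the pair of reads atomic.
    static long getMessageCount() {
        return (long) pending.get() + delivering.get();
    }

    // Moves one message from pending to delivering in two separate steps.
    // The observer runs between the steps and stands in for a management
    // call arriving from another thread at exactly that moment.
    // The update order (increment first) is an assumption for illustration.
    static void handOff(Runnable observerBetweenSteps) {
        delivering.incrementAndGet();
        observerBetweenSteps.run();
        pending.decrementAndGet();
    }

    // Returns the counts seen before, during, and after the hand-off.
    static long[] demo() {
        pending.set(1);
        delivering.set(0);
        long[] seen = new long[3];
        seen[0] = getMessageCount();
        handOff(() -> seen[1] = getMessageCount());
        seen[2] = getMessageCount();
        return seen;
    }

    public static void main(String[] args) {
        long[] seen = demo();
        // prints "before=1 during=2 after=1"
        System.out.println("before=" + seen[0] + " during=" + seen[1] + " after=" + seen[2]);
    }
}
```

In a real broker the window would be tiny and hit only occasionally, which would fit an intermittent failure.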

      Testing the initial message count isn't really the point of removeJMSQueueRemovesAllMessages, so I see a couple of possible workarounds:

      1) Remove the consumer from the test method. It's never used but its presence may be a factor in the double-bookkeeping.

      2) Change the assert that the initial count is 1 to an assert that it's > 0. The assert is really a sanity check that there was something there that later got removed.

      I think 2) is better, as the presence of the consumer might impact the cleanup processing that the test is checking. So removing it may affect the conditions being tested.
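As a rough sketch of what option 2) changes, the plain-Java example below contrasts a strict "exactly 1" check with a relaxed "> 0" check. The method names are hypothetical and the actual test code is not reproduced here. In the JUnit test itself the relaxed form would presumably be an `assertTrue` on the count being greater than zero.

```java
public class InitialCountCheckSketch {
    // Strict form: fails whenever the reported count is not exactly 1,
    // including a transient double count.
    static void assertExactlyOne(long messageCount) {
        if (messageCount != 1) {
            throw new AssertionError("expected:<1> but was:<" + messageCount + ">");
        }
    }

    // Relaxed form: only a sanity check that the queue held something
    // before it was removed.
    static void assertNotEmpty(long messageCount) {
        if (messageCount <= 0) {
            throw new AssertionError("expected at least one message but was:<" + messageCount + ">");
        }
    }

    // Runs a check and reports whether it passed instead of propagating the error.
    static boolean passes(Runnable check) {
        try {
            check.run();
            return true;
        } catch (AssertionError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        long transientCount = 2; // the value seen in the failing runs
        System.out.println("strict, count=2:  " + passes(() -> assertExactlyOne(transientCount)));
        System.out.println("relaxed, count=2: " + passes(() -> assertNotEmpty(transientCount)));
        System.out.println("relaxed, count=0: " + passes(() -> assertNotEmpty(0)));
    }
}
```

The relaxed check still fails on an empty queue, so it keeps the sanity check while tolerating the transient over-count.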

            bstansbe@redhat.com Brian Stansberry