AMQ Broker / ENTMQBR-4446

[LTS] Inconsistencies between Replication Catchup and PagingStore.stopPaging();


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • AMQ 7.4.6.GA
    • AMQ 7.8.0.GA
    • broker-core
    • None

      I have identified a few inconsistencies between replication (network) catch-up and the PagingStore.stopPaging() call made from page cleanup.

      When the two run at the same time, they can deadlock, or leave the paging state inconsistent so that cleanup never runs and the disk fills up completely.

      Also, as part of this issue, I'm removing a blocking operation that is not required.

      It appears that this wait can cause problems and provoke paging for the affected address:

      protected void storeBookmark(ArrayList<PageSubscription> cursorList, Page currentPage) throws Exception {
         try {
            // First step: Move every cursor to the next bookmarked page (that was just created)
            for (PageSubscription cursor : cursorList) {
               cursor.confirmPosition(new PagePositionImpl(currentPage.getPageId(), -1));
            }

            // we just need to make sure the storage is done..
            // if the thread pool is full, we will just log it once instead of looping
            if (!storageManager.waitOnOperations(5000)) {
               ActiveMQServerLogger.LOGGER.problemCompletingOperations(storageManager.getContext());
            }
         } finally {
            for (PageSubscription cursor : cursorList) {
               cursor.enableAutoCleanup();
            }
         }
      }
      
      2020-Dec-17 15:25:12,860 WARN  [org.apache.activemq.artemis.core.server] AMQ222024: Could not complete operations on IO context OperationContextImpl [2053956692] [minimalStore=0, storeLineUp=0, stored=0, minimalReplicated=1, replicationLineUp=1, replicated=0, paged=0, minimalPage=0, pageLineUp=0, errorCode=-1, errorMessage=null, executorsPending=0, executor=OrderedExecutor(tasks=[])]
      Task = TaskHolder [storeLined=0, replicationLined=1, pageLined=0, task=IOCallback(PageSubscriptionImpl) ]
      Task = TaskHolder [storeLined=0, replicationLined=1, pageLined=0, task=IOCallback(PageSubscriptionImpl) ]
      Task = TaskHolder [storeLined=0, replicationLined=1, pageLined=0, task=org.apache.activemq.artemis.core.journal.impl.SimpleWaitIOCallback]
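
      The counters in the warning above (replicationLineUp=1, replicated=0) suggest that one replication operation was lined up and never confirmed, so the 5-second wait in storeBookmark expired. The following is a minimal sketch of that situation using only JDK classes. PendingOps, executeOnCompletion and replicationDone are hypothetical stand-ins invented for illustration; they are not the Artemis OperationContextImpl or StorageManager API, and the callback variant is only one possible way to avoid blocking, not necessarily the fix applied for this issue.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class BlockingWaitSketch {

   // Hypothetical stand-in for an operation context with one pending replication.
   static final class PendingOps {
      private final CompletableFuture<Void> replicated = new CompletableFuture<>();

      // Blocking style: hold the calling thread until the ack arrives or the timeout expires.
      boolean waitOnOperations(long timeoutMillis) {
         try {
            replicated.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
         } catch (TimeoutException e) {
            return false; // the ack did not arrive in time
         } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
         } catch (ExecutionException e) {
            return false;
         }
      }

      // Non-blocking style: register work to run once the ack arrives.
      void executeOnCompletion(Runnable callback) {
         replicated.thenRun(callback);
      }

      // Simulates the replica confirming the pending operation.
      void replicationDone() {
         replicated.complete(null);
      }
   }

   public static void main(String[] args) {
      PendingOps ops = new PendingOps();

      // The ack never arrives during the wait, so the caller is stalled for the full timeout.
      long start = System.nanoTime();
      boolean completed = ops.waitOnOperations(200);
      long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      System.out.println("blocking wait completed=" + completed + ", waited about " + waitedMs + " ms");

      // With a callback, the caller returns immediately and the work runs when the ack arrives.
      AtomicBoolean callbackRan = new AtomicBoolean(false);
      ops.executeOnCompletion(() -> callbackRan.set(true));
      System.out.println("callback registered, ran=" + callbackRan.get());

      ops.replicationDone();
      System.out.println("after replication ack, ran=" + callbackRan.get());
   }
}
```

      In the blocking variant the calling thread is held for the whole timeout whenever the replica is slow to acknowledge, which is the kind of stall that could plausibly back up producers and push an address into paging.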
      

      Please look into whether this wait is needed and whether it could cause any harmful side effects.

              csuconic@redhat.com Clebert Suconic
              rhn-support-dhawkins Duane Hawkins
              Tiago Bueno Tiago Bueno