  1. Infinispan
  2. ISPN-8925

Conflict Resolution can exhaust the ASYNC_TRANSPORT_EXECUTOR


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 9.2.1.Final
    • Affects Version/s: 9.2.0.Final
    • Component/s: Core
    • Labels: None

      During the execution of AvailabilityStrategyContext#updateTopologiesAfterMerge, a topology update must be sent to the cluster via ClusterTopologyManagerImpl#executeOnClusterAsync, which uses the ASYNC_TRANSPORT_EXECUTOR, before ConflictManager#resolveConflicts is called. This topology update is vital because it carries the topologyId on which all of the conflict resolution RPCs depend. If the update is not sent, ConflictManager#resolveConflicts can make no progress and will eventually time out.
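The ordering dependency described above can be illustrated with a minimal sketch. It is not Infinispan code: the class, the method names, and the latch standing in for "the topology update has reached the cluster" are all hypothetical, and the 200 ms timeout is an arbitrary value chosen for the illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model, not Infinispan's API: conflict resolution can only
// proceed once the topology update (carrying the topologyId) has been sent.
final class TopologyDependencySketch {

    // Stand-in for ConflictManager#resolveConflicts: blocks until the topology
    // update is visible, or gives up after timeoutMillis.
    static boolean resolveConflicts(CountDownLatch topologySent, long timeoutMillis) {
        try {
            return topologySent.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Stand-in for the merge sequence: optionally send the update, then resolve.
    static boolean merge(boolean sendTopologyUpdate) {
        CountDownLatch topologySent = new CountDownLatch(1);
        if (sendTopologyUpdate) {
            topologySent.countDown(); // the update went out
        }
        return resolveConflicts(topologySent, 200);
    }

    public static void main(String[] args) {
        System.out.println("update sent:     resolved = " + merge(true));
        System.out.println("update not sent: resolved = " + merge(false));
    }
}
```

In this model, `merge(false)` can only end by timing out, which mirrors the behaviour reported for ConflictManager#resolveConflicts when the update never leaves the node.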

      The problem is that an ASYNC_TRANSPORT_EXECUTOR thread is occupied for the entire execution of AvailabilityStrategyContext#doMergePartitions. Therefore, when AvailabilityStrategyContext#updateTopologiesAfterMerge is called prior to conflict resolution, it is possible that all of the executor's threads are running runnables that are waiting indefinitely on ConflictManager#resolveConflicts, leaving no thread free to send the topology update.

      As the number of caches increases, the number of doMergePartitions runnables on the ASYNC_TRANSPORT_EXECUTOR increases, and so does the likelihood of the executor's threads becoming exhausted.
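The exhaustion scenario can be reproduced in isolation with a plain fixed-size thread pool. The sketch below is a simplified model under stated assumptions, not the Infinispan implementation: `completedMerges`, the latch names, and the pool sizes are hypothetical, each simulated merge waits directly on its own topology update, and a start barrier is used so that all merges begin at about the same time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the reported problem: each per-cache merge holds a
// pool thread while waiting for a task that must run on the same pool.
final class ExecutorExhaustionSketch {

    // Returns how many simulated merges saw their topology update before the timeout.
    static int completedMerges(int poolSize, int cacheCount, long timeoutMillis) {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allMergesSubmitted = new CountDownLatch(1);
        List<Future<Boolean>> merges = new ArrayList<>();
        try {
            for (int i = 0; i < cacheCount; i++) {
                merges.add(executor.submit(() -> {
                    // Hold the thread until every merge has been submitted.
                    allMergesSubmitted.await();
                    CountDownLatch topologySent = new CountDownLatch(1);
                    // The "topology update" needs a thread from the same executor.
                    executor.execute(topologySent::countDown);
                    // Stand-in for waiting on conflict resolution.
                    return topologySent.await(timeoutMillis, TimeUnit.MILLISECONDS);
                }));
            }
            allMergesSubmitted.countDown();
            int completed = 0;
            for (Future<Boolean> merge : merges) {
                if (merge.get()) {
                    completed++;
                }
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("4 threads, 3 caches: " + completedMerges(4, 3, 500) + " merges completed");
        System.out.println("4 threads, 4 caches: " + completedMerges(4, 4, 500) + " merges completed");
    }
}
```

With fewer simulated caches than threads, a spare thread is available to run the updates and every merge completes. Once the number of caches reaches the pool size, every thread is held by a waiting merge, the updates sit in the queue, and each merge can only end by timing out.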

              Assignee:
              remerson@redhat.com Ryan Emerson
              Reporter:
              remerson@redhat.com Ryan Emerson
              Archiver:
              rhn-support-adongare Amol Dongare
