JGroups / JGRP-2406

MERGE3 not working with TCP using ForkJoinPool


    • Type: Bug
    • Resolution: Done
    • Priority: Minor
    • Fix Version/s: 4.2.1
    • Affects Version/s: 4.1.8
    • None
      1. project.zip contains the whole source of the Maven project. Run mvn package to create the application jar.
      2. In the project directory, the Dockerfile builds the Docker image. A plain docker build ... works; no options are required.
      3. The project also ships a Helm chart to deploy the created image to Kubernetes. You will have to adapt the following section to match your own Docker registry:

        image:
          repository: activeviam-ps-docker-dev.jfrog.io # Insert your own repository
          name: performance-distribution # Use the image name you defined
          tag: jtest # Use the image tag you defined
          pullPolicy: Always
        

        Running helm upgrade --install jgroups jgroups will deploy the image to your cluster.

      To create exactly the same cluster as mine, use the following command:
      eksctl create cluster --name eks-ope-dist --version 1.14 --region eu-west-3 --ssh-access --ssh-public-key=eks-ope-dist --nodegroup-name standard-workers --node-type t3.large --nodes 2 --nodes-min 1 --nodes-max 3 --node-ami auto

      I can upload my own Docker image to a public repository if needed.


      With JDK 11, using the TCP protocol with the ForkJoinPool causes MERGE3 to fail constantly.

      I consistently observed the following sequence, from the point of view of a member M:

      • M asks the other coordinators for their views. It contacts A and B.
      • A and B send their views.
      • M waits for the views, times out, and aborts the merge.
      • Immediately after aborting the merge, M processes the messages containing the views of A and B.
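
      The sequence above can be reproduced in miniature with plain JDK classes. The following is a toy model of the symptom only, not a diagnosis of JGroups internals: it assumes, hypothetically, that the task waiting for the views and the task delivering them compete for the same pool worker. The class MergeTimeoutSketch, the method run, and the event strings are all made up for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Toy model of the observed symptom; none of these names come from JGroups. */
class MergeTimeoutSketch {

    /** Runs the scenario on a pool with the given number of workers and returns the events in order. */
    static List<String> run(int workers, long timeoutMs) throws InterruptedException {
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch viewsReceived = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Task 1: M asks for views, then blocks until they arrive or the timeout expires.
            pool.submit(() -> {
                events.add("M requests views");
                try {
                    boolean arrived = viewsReceived.await(timeoutMs, TimeUnit.MILLISECONDS);
                    events.add(arrived ? "merge proceeds" : "merge aborted (timeout)");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Task 2: the views from A and B are delivered by a task on the same pool.
            pool.submit(() -> {
                events.add("views processed");
                viewsReceived.countDown();
            });
        } finally {
            // Let the queued tasks finish, then wait for them.
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return events;
    }

    public static void main(String[] args) throws InterruptedException {
        // With a single worker, the delivery task cannot start until the wait has timed out.
        run(1, 200).forEach(System.out::println);
    }
}
```

      With a single worker, the program prints "M requests views", then "merge aborted (timeout)", then "views processed", which matches the order seen by M. With two workers and a generous timeout, the wait completes and the merge proceeds. Whether the ForkJoinPool in TCP behaves this way is not established here; timeline.txt and logs.tgz are the authoritative evidence.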

      timeline.txt contains log extracts from the various members involved.

      After many experiments, I narrowed the issue down to a single parameter of the TCP protocol:

      <TCP
         ...
         thread_pool.use_fork_join_pool="true" />
      

      Setting thread_pool.use_fork_join_pool to true reliably reproduces the problem, while setting it to false works fine.

      Project details:

      • Because it was tested within Kubernetes, this project uses KUBE_PING as its discovery protocol.
      • To understand why the merges failed, I created the protocol MERGE4, which is MERGE3 with additional logging.
      • logs.tgz contains all logs from the various members involved in the test.

        1. logs.tgz
          153 kB
        2. project.zip
          23 kB
        3. timeline.txt
          6 kB

              Assignee: Bela Ban (rhn-engineering-bban)
              Reporter: Olivier Peyrusse (opeyrusse, Inactive)