- Bug
- Resolution: Done
- Major
- 12.1.7.Final, 13.0.0.Final
When a CacheJoinCommand carries an older view ID, the coordinator is supposed to return null to the joiner, signalling that it should retry after receiving the next view.
ClusterTopologyManagerImpl.handleJoin() has a bug: it tries to compose a null CompletionStage instead of a CompletionStage completed with null, which causes a NullPointerException:
00:51:02,114 WARN (jgroups-9,StateTransferOverwriteTest-NodeA:[]) [CLUSTER] ISPN000071: Caught exception when handling command TopologyJoinCommand{cacheName='dist', origin=StateTransferOverwriteTest-NodeB, joinInfo=CacheJoinInfo{consistentHashFactory=org.infinispan.distribution.ch.impl.SyncConsistentHashFactory@ffffd8e9, numSegments=256, numOwners=2, timeout=240000, cacheMode=DIST_SYNC, persistentUUID=c8f6078e-bf38-4c53-8cd2-67759437c5d3, persistentStateChecksum=Optional.empty}, viewId=1}
java.lang.NullPointerException: null
    at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1106) [?:?]
    at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235) [?:?]
    at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:143) [?:?]
    at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:225) ~[classes/:?]
    at org.infinispan.commands.topology.CacheJoinCommand.invokeAsync(CacheJoinCommand.java:42) ~[classes/:?]
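The difference between a null CompletionStage and a CompletionStage completed with null can be reproduced outside Infinispan. The following is a minimal sketch, not the actual handleJoin() code: the class NullStageSketch, its methods, and the staleView flag are hypothetical names introduced only for illustration. It assumes the stale-view branch runs inside a function passed to thenCompose.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CompletionStage;
import java.util.function.Supplier;

class NullStageSketch {
    // Hypothetical stand-in for the buggy stale-view branch:
    // the composing function returns a null stage.
    static CompletionStage<String> buggyJoin(CompletionStage<Void> ready, boolean staleView) {
        return ready.thenCompose(ignored -> {
            if (staleView) {
                return null; // a null CompletionStage: nothing to compose with
            }
            return CompletableFuture.completedFuture("joined");
        });
    }

    // Hypothetical corrected branch: a real stage whose value is null.
    static CompletionStage<String> fixedJoin(CompletionStage<Void> ready, boolean staleView) {
        return ready.thenCompose(ignored -> {
            if (staleView) {
                return CompletableFuture.completedFuture(null); // "retry later" signal
            }
            return CompletableFuture.completedFuture("joined");
        });
    }

    // Runs the call and reports either the value or the underlying failure type.
    static String describe(Supplier<CompletionStage<String>> call) {
        try {
            String value = call.get().toCompletableFuture().join();
            return "completed with " + value;
        } catch (Throwable t) {
            Throwable cause = (t instanceof CompletionException && t.getCause() != null)
                    ? t.getCause() : t;
            return "failed with " + cause.getClass().getSimpleName();
        }
    }

    static String buggyOutcome() {
        return describe(() -> buggyJoin(CompletableFuture.completedFuture(null), true));
    }

    static String fixedOutcome() {
        return describe(() -> fixedJoin(CompletableFuture.completedFuture(null), true));
    }

    public static void main(String[] args) {
        System.out.println(buggyOutcome()); // failed with NullPointerException
        System.out.println(fixedOutcome()); // completed with null
    }
}
```

In the sketch, the null value reaches the caller only in the second variant, which is what a joiner would need in order to treat the response as "retry after the next view".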
The joiner expects any failure to be a CacheJoinException, so casting the NullPointerException fails with a ClassCastException that hides the original error:
org.infinispan.commons.CacheConfigurationException: Error starting component org.infinispan.statetransfer.StateTransferManager
Caused by: java.lang.ClassCastException: class java.lang.NullPointerException cannot be cast to class org.infinispan.topology.CacheJoinException (java.lang.NullPointerException is in module java.base of loader 'bootstrap'; org.infinispan.topology.CacheJoinException is in unnamed module of loader 'app')
    at org.infinispan.topology.LocalTopologyManagerImpl.lambda$sendJoinRequest$3(LocalTopologyManagerImpl.java:191)
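The masking effect seen in this trace can be illustrated with a small sketch. The code in LocalTopologyManagerImpl is not shown in this report, so everything below is an assumption: the nested CacheJoinException is a hypothetical stand-in for org.infinispan.topology.CacheJoinException, and blindCast and checkedCast are illustrative names, not Infinispan methods.

```java
class JoinFailureCastSketch {
    // Hypothetical stand-in for org.infinispan.topology.CacheJoinException.
    static class CacheJoinException extends Exception {
        CacheJoinException(String message) {
            super(message);
        }
    }

    // Blind cast: any other throwable type surfaces as a ClassCastException,
    // so the original failure no longer appears as the reported error.
    static String blindCast(Throwable failure) {
        try {
            CacheJoinException joinFailure = (CacheJoinException) failure;
            return "join failure: " + joinFailure.getMessage();
        } catch (ClassCastException e) {
            return "masked by " + e.getClass().getSimpleName();
        }
    }

    // Defensive variant: check the type first and report unexpected failures as they are.
    static String checkedCast(Throwable failure) {
        if (failure instanceof CacheJoinException) {
            return "join failure: " + failure.getMessage();
        }
        return "unexpected " + failure.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        Throwable remoteFailure = new NullPointerException();
        System.out.println(blindCast(remoteFailure));   // masked by ClassCastException
        System.out.println(checkedCast(remoteFailure)); // unexpected NullPointerException
    }
}
```

Whether the joiner should handle non-CacheJoinException failures this way is a separate design question; the sketch only shows why the NullPointerException appears as a ClassCastException on the joiner side.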
The bug causes intermittent failures in StateTransferOverwriteTest.createBeforeMethod, for example:
https://ci.infinispan.org/job/Infinispan/job/PR-9462/1/testReport/junit/org.infinispan.distribution.rehash/StateTransferOverwriteTest/createBeforeMethod/