Type: Bug
Resolution: Obsolete
Priority: Major
Version: 9.0.0.Final
In a two-site setup with one server per site, when the server in one site goes down, and the entire site is gone as a result, the surviving site might still try to replicate to the site that is gone. Example:
Sites: EARTH and MOON
Servers: server-earth-one and server-moon-one respectively
server-moon-one stops:
2017-01-04 12:16:38,666 INFO [org.jboss.as] (MSC service thread 1-4) WFLYSRV0050: Infinispan Server 9.0.0.Beta1 (WildFly Core 2.2.0.Final) stopped in 102ms
server-earth-one notices this and installs the correct view, which contains only itself:
2017-01-04 12:16:38,649 TRACE [org.jgroups.protocols.relay.RELAY2] (jgroups-3,_master:server-earth-one:EARTH) [Relayer _master:server-earth-one:EARTH] view: [_master:server-earth-one:EARTH|4] (1) [_master:server-earth-one:EARTH]
server-earth-one gets a put invocation:
2017-01-04 12:16:38,709 TRACE [org.infinispan.interceptors.impl.InvocationContextInterceptor] (HotRodServerHandler-8-1) Invoked with command PutKeyValueCommand{key=org.infinispan.commons.marshall.WrappedByteArray@a3b01a15, value=org.infinispan.commons.marshall.WrappedByteArray@b68d6067, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=4294967297}}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@3d9b13cf]
But for some reason, server-earth-one still tries to send it to the MOON site:
2017-01-04 12:16:38,713 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (HotRodServerHandler-8-1) About to send to backups [MOON (sync, timeout=10000)], command SingleXSiteRpcCommand{command=PutKeyValueCommand{key=org.infinispan.commons.marshall.WrappedByteArray@a3b01a15, value=org.infinispan.commons.marshall.WrappedByteArray@b68d6067, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=4294967297}}, successful=true}}
^ That should not happen.
Moreover, the JGroups layer already detects that there is no route to the site:
2017-01-04 12:16:38,717 ERROR [org.jgroups.protocols.relay.RELAY2] (HotRodServerHandler-8-1) master:server-earth-one: no route to MOON: dropping message
But the put only completes once the 10-second backup timeout has expired:
2017-01-04 12:16:48,721 WARN [org.infinispan.xsite.BackupSenderImpl] (HotRodServerHandler-8-1) ISPN000202: Problems backing up data for cache xsiteCache to site MOON: org.infinispan.util.concurrent.TimeoutException: Timed out after 10 seconds waiting for a response from MOON (sync, timeout=10000)
...
2017-01-04 12:16:48,726 TRACE [org.infinispan.server.hotrod.HotRodEncoder] (HotRodServerWorker-7-1) Encode msg EmptyResponse{version=25, messageId=21, cacheName='xsiteCache', clientIntel=3, operation=PUT, status=Success, topologyId=1}
I'm attaching full TRACE logs.
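To illustrate the behaviour I would expect, here is a minimal sketch of a fail-fast backup call. It is not Infinispan or JGroups code: FailFastBackup, SiteUnreachableException, backup() and the reachableSites set are all hypothetical names made up for this example. The idea is only that a backup to a site with no known route should fail immediately rather than wait for the sync timeout.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names are Infinispan or JGroups APIs.
public class FailFastBackup {

    /** Raised when the target site has no known route (assumed name). */
    static class SiteUnreachableException extends RuntimeException {
        SiteUnreachableException(String site) {
            super("no route to " + site);
        }
    }

    /**
     * Returns a future for the backup response. If the site is not in the
     * set of currently reachable sites, the future fails immediately
     * instead of waiting for the timeout.
     */
    static CompletableFuture<Void> backup(String site, Set<String> reachableSites, long timeoutMillis) {
        CompletableFuture<Void> response = new CompletableFuture<>();
        if (!reachableSites.contains(site)) {
            response.completeExceptionally(new SiteUnreachableException(site));
            return response;
        }
        // A real sender would transmit here and complete the future on reply.
        // In this sketch nothing ever replies, so only the timeout applies.
        return response.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        // EARTH's view no longer contains MOON, as in the logs above.
        CompletableFuture<Void> f = backup("MOON", Set.of("EARTH"), 10_000);
        System.out.println(f.isCompletedExceptionally()); // prints true, without waiting 10 s
    }
}
```

With something along these lines the put in the example would return right after the "no route to MOON" error instead of 10 seconds later. Whether the failure is then reported to the client or handled by the backup failure policy is a separate question.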
Is related to: ISPN-9113 SITE_UNREACHABLE not handled by JGroupsTransport (Closed)