Infinispan / ISPN-7332

XSite trying to replicate to site after site has been shut down

    • Type: Bug
    • Resolution: Obsolete
    • Priority: Major
    • Affects Version: 9.0.0.Beta1

      In a two-site scenario with one server per site, when the server in one site goes down (and the entire site is gone as a result), the surviving site might still try to replicate to the site that no longer exists. Example:

      Sites: EARTH and MOON
      Servers: server-earth-one and server-moon-one respectively

      server-moon-one stops:

      2017-01-04 12:16:38,666 INFO  [org.jboss.as] (MSC service thread 1-4) WFLYSRV0050: 
      Infinispan Server 9.0.0.Beta1 (WildFly Core 2.2.0.Final) stopped in 102ms
      

      server-earth-one realises that and sets the correct view:

      2017-01-04 12:16:38,649 TRACE [org.jgroups.protocols.relay.RELAY2] (jgroups-3,_master:server-earth-one:EARTH) 
      [Relayer _master:server-earth-one:EARTH] view: [_master:server-earth-one:EARTH|4] (1) [_master:server-earth-one:EARTH]
      

      server-earth-one then receives a put invocation:

      2017-01-04 12:16:38,709 TRACE [org.infinispan.interceptors.impl.InvocationContextInterceptor] (HotRodServerHandler-8-1) 
      Invoked with command PutKeyValueCommand{key=org.infinispan.commons.marshall.WrappedByteArray@a3b01a15, 
      value=org.infinispan.commons.marshall.WrappedByteArray@b68d6067, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, 
      valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=4294967297}}, 
      successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@3d9b13cf]
      

      But for some reason, server-earth-one still tries to send it to the MOON site:

      2017-01-04 12:16:38,713 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (HotRodServerHandler-8-1) 
      About to send to backups [MOON (sync, timeout=10000)], command SingleXSiteRpcCommand{command=PutKeyValueCommand{key=org.infinispan.commons.marshall.WrappedByteArray@a3b01a15, 
      value=org.infinispan.commons.marshall.WrappedByteArray@b68d6067, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, 
      valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=4294967297}}, 
      successful=true}}
      

      ^ That should not happen.

      Moreover, the JGroups layer has already detected that there is no route to the MOON site:

      2017-01-04 12:16:38,717 ERROR [org.jgroups.protocols.relay.RELAY2] (HotRodServerHandler-8-1) 
      master:server-earth-one: no route to MOON: dropping message
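
The behaviour the report expects can be illustrated with a small, self-contained sketch. It does not use Infinispan or JGroups APIs: `sendToSite`, `awaitBackup`, and the `checkRouteFirst` flag are hypothetical names introduced only for this illustration, and a `CompletableFuture` stands in for the pending backup response. The sketch contrasts a send whose message is silently dropped (the caller can only time out) with a send that consults the known routes first and fails immediately.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class XSiteFailFastSketch {

    // Hypothetical stand-in for a sync backup send; NOT an Infinispan or JGroups API.
    static CompletableFuture<Void> sendToSite(String site, Set<String> routableSites,
                                              boolean checkRouteFirst) {
        CompletableFuture<Void> response = new CompletableFuture<>();
        if (routableSites.contains(site)) {
            // Route exists: pretend the remote site acknowledged the backup.
            response.complete(null);
        } else if (checkRouteFirst) {
            // Fail fast: the sender already knows there is no route.
            response.completeExceptionally(
                    new IllegalStateException("no route to " + site));
        }
        // Otherwise the message is dropped and the future never completes.
        return response;
    }

    // Waits for the backup response and reports how the wait ended.
    static String awaitBackup(CompletableFuture<Void> response, long timeoutMillis) {
        try {
            response.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return "ok";
        } catch (TimeoutException e) {
            return "timeout";
        } catch (ExecutionException e) {
            return "failed-fast";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        Set<String> routes = Set.of("EARTH"); // MOON has left the relay view
        // Short timeout for the demo; the report uses timeout=10000.
        System.out.println(awaitBackup(sendToSite("MOON", routes, false), 200));
        System.out.println(awaitBackup(sendToSite("MOON", routes, true), 200));
    }
}
```

In the first call the caller blocks for the whole timeout; in the second it returns as soon as the missing route is noticed. Whether such a check belongs in the transport, in the backup sender, or elsewhere is a design decision this sketch does not address.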
      

      Even so, the put completes only after the 10-second backup timeout expires:

      2017-01-04 12:16:48,721 WARN  [org.infinispan.xsite.BackupSenderImpl] (HotRodServerHandler-8-1) 
      ISPN000202: Problems backing up data for cache xsiteCache to site MOON: org.infinispan.util.concurrent.TimeoutException: 
      Timed out after 10 seconds waiting for a response from MOON (sync, timeout=10000)
      ...
      2017-01-04 12:16:48,726 TRACE [org.infinispan.server.hotrod.HotRodEncoder] (HotRodServerWorker-7-1) 
      Encode msg EmptyResponse{version=25, messageId=21, cacheName='xsiteCache', clientIntel=3, operation=PUT, status=Success, topologyId=1}
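
To make the cost concrete, the delay can be computed from the timestamps quoted in the log excerpts above. The following is a minimal sketch; the class and method names (`PutLatency`, `elapsedMillis`) are made up for this illustration, and the timestamp pattern is assumed from the format visible in the logs.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class PutLatency {

    // Timestamp layout as it appears in the quoted server logs (assumed pattern).
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Milliseconds elapsed between two log timestamps.
    static long elapsedMillis(String start, String end) {
        LocalDateTime from = LocalDateTime.parse(start, LOG_TS);
        LocalDateTime to = LocalDateTime.parse(end, LOG_TS);
        return Duration.between(from, to).toMillis();
    }

    public static void main(String[] args) {
        // "About to send to backups" -> ISPN000202 timeout warning
        System.out.println(elapsedMillis("2017-01-04 12:16:38,713",
                                         "2017-01-04 12:16:48,721")); // 10008
        // Put invoked -> Hot Rod response encoded
        System.out.println(elapsedMillis("2017-01-04 12:16:38,709",
                                         "2017-01-04 12:16:48,726")); // 10017
    }
}
```

The client-visible latency of the put is therefore roughly the full backup timeout, even though the missing route was logged about 4 ms after the send was attempted.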
      

      I'm attaching full TRACE logs.

            Assignee: Unassigned
            Reporter: Galder Zamarreño (rh-ee-galder)