Bug
Resolution: Done
Major
None
Two sites: LON (nodes A, B, C) and NYC (nodes D, E, F). Nodes A and D are the site masters of LON and NYC, respectively.
Site LON is about to push state to site NYC. The push is requested on node B while node A is leaving the cluster. To start the push, node B sends a START_RECEIVE command to the NYC site.
About to send to backups [NYC (sync, timeout=2000)], command XSiteStateTransferControlCommand{control=START_RECEIVE, siteName='null', statusOk=false, cacheName='___defaultcache'}
SiteProviderTopologyChangeTest-NodeB-48243: invoking unicast RPC [req-id=98] on SiteMaster(NYC)
SiteProviderTopologyChangeTest-NodeB-48243: forwarding message to final destination SiteMaster(NYC) to the current coordinator
SiteProviderTopologyChangeTest-NodeB-48243: sending msg to SiteProviderTopologyChangeTest-NodeA-41240, src=SiteProviderTopologyChangeTest-NodeB-48243, headers are RequestCorrelator: corr_id=200, type=REQ, req_id=98, rsp_expected=true, RELAY2: DATA [dest=SiteMaster(NYC), sender=SiteProviderTopologyChangeTest-NodeB-48243:LON], UNICAST3: DATA, seqno=32, TP: [cluster_name=ISPN(SITE LON)]
The message is forwarded to node A (site master of LON), which sends it to node D (site master of NYC).
SiteProviderTopologyChangeTest-NodeA-41240: received [dst: SiteProviderTopologyChangeTest-NodeA-41240, src: SiteProviderTopologyChangeTest-NodeB-48243 (4 headers), size=29 bytes, flags=OOB|NO_TOTAL_ORDER], headers are RequestCorrelator: corr_id=200, type=REQ, req_id=98, rsp_expected=true, RELAY2: DATA [dest=SiteMaster(NYC), sender=SiteProviderTopologyChangeTest-NodeB-48243:LON], UNICAST3: DATA, seqno=32, TP: [cluster_name=ISPN(SITE LON)]
_SiteProviderTopologyChangeTest-NodeA-41240:LON: sending msg to _SiteProviderTopologyChangeTest-NodeD-50088:NYC, src=_SiteProviderTopologyChangeTest-NodeA-41240:LON, headers are RequestCorrelator: corr_id=200, type=REQ, req_id=98, rsp_expected=true, RELAY2: DATA [dest=SiteMaster(NYC), sender=SiteProviderTopologyChangeTest-NodeB-48243:LON], UNICAST3: DATA, seqno=2, conn_id=1, TP: [cluster_name=global]
The response is sent back from node D to node A, which forwards it to node B.
_SiteProviderTopologyChangeTest-NodeA-41240:LON: received [dst: _SiteProviderTopologyChangeTest-NodeA-41240:LON, src: _SiteProviderTopologyChangeTest-NodeD-50088:NYC (4 headers), size=4 bytes, flags=OOB|NO_TOTAL_ORDER], headers are RequestCorrelator: corr_id=200, type=RSP, req_id=98, rsp_expected=true, RELAY2: DATA [dest=SiteProviderTopologyChangeTest-NodeB-48243:LON, sender=SiteMaster(NYC)], UNICAST3: DATA, seqno=3, TP: [cluster_name=global]
SiteProviderTopologyChangeTest-NodeA-41240: sending msg to SiteProviderTopologyChangeTest-NodeB-48243, src=SiteProviderTopologyChangeTest-NodeA-41240, headers are RequestCorrelator: corr_id=200, type=RSP, req_id=98, rsp_expected=true, RELAY2: DATA [dest=SiteProviderTopologyChangeTest-NodeB-48243:LON, sender=SiteMaster(NYC)], UNICAST3: DATA, seqno=30, conn_id=1, TP: [cluster_name=ISPN(SITE LON)]
However, since node A is shutting down, its transport discards the response, which never reaches node B. Node B ends up throwing a TimeoutException.
SiteProviderTopologyChangeTest-NodeA-41240: sending 1 msgs (218 bytes (0.70% of max_bundle_size) to 1 dests(s): [ISPN(SITE LON):SiteProviderTopologyChangeTest-NodeB-48243]
127.0.0.1:7900: server is not running, discarding message to 127.0.0.1:7901
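The failure above can be modelled in a few lines of plain Java. This is only an illustrative sketch, not Infinispan or JGroups code: the class `LostResponseSketch`, the method `timesOut`, and the `relayAlive` flag are hypothetical names, and a `CompletableFuture` stands in for the pending synchronous request on node B. When the relay (node A) drops the response, nothing ever completes the future, so the bounded wait ends in a `TimeoutException`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model of the scenario; none of these names exist in Infinispan or JGroups.
final class LostResponseSketch {

    /**
     * Models node B waiting for the reply to a synchronous request.
     *
     * @param relayAlive    whether the relay (node A) still delivers the response
     * @param timeoutMillis how long node B waits (the log shows timeout=2000)
     * @return true if the wait ended in a TimeoutException
     */
    static boolean timesOut(boolean relayAlive, long timeoutMillis) {
        CompletableFuture<String> response = new CompletableFuture<>();
        if (relayAlive) {
            // Normal case: the relay forwards the remote site's reply.
            response.complete("OK");
        }
        // Otherwise the reply was discarded by the stopping transport,
        // so the future is never completed.
        try {
            response.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `LostResponseSketch.timesOut(false, 2000)` would block for the full timeout before reporting the failure, which matches what node B observes in the test.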
The test will be disabled because:
- The state push is triggered manually, so it can be re-triggered if an exception occurs.
- A fix requires some UNICAST/NAKACK between sites (i.e. changes in JGroups).
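Since the push can be re-triggered after an exception, a caller could wrap the manual trigger in a bounded retry. The following is a minimal sketch under stated assumptions: `PushStateRetrySketch`, `PushAttempt`, and `pushWithRetry` are hypothetical names, and `PushAttempt` stands in for whatever call actually starts the push; it is not the Infinispan API.

```java
import java.util.concurrent.TimeoutException;

// Hypothetical helper; the names here are illustrative and not part of Infinispan.
final class PushStateRetrySketch {

    /** Stand-in for the call that triggers the state push and may time out. */
    @FunctionalInterface
    interface PushAttempt {
        String run() throws TimeoutException;
    }

    /**
     * Re-triggers the push until it succeeds or maxAttempts is reached.
     * Only TimeoutException is retried; any other failure propagates at once.
     */
    static String pushWithRetry(PushAttempt attempt, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TimeoutException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.run();
            } catch (TimeoutException e) {
                // The response may have been lost during a topology change; try again.
                last = e;
            }
        }
        throw new IllegalStateException(
                "push state failed after " + maxAttempts + " attempts", last);
    }
}
```

Retrying is only reasonable here if re-sending the request is harmless on the receiving site, which the report implies but does not state explicitly.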