Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: Test Suite
Labels: None

While working on ~~CLOUD-2261~~ I found instabilities in the PostgreSQLXARecoveryWithNFSDisconnectLoadTest. I tried tuning the test and investigating, and I would like to summarize my findings. However, I have not tracked down the root cause or been able to fix it.

My expectation of the test: put a request load on the service, disconnect a pod, restore the network, stop sending requests, give recovery handling time to bring the system to a consistent state, and then check the results.

What I think is currently the issue is that in-flight requests do not finish at the time the clients (`HttpWorker`) are stopped (https://gitlab.cee.redhat.com/xpaas-qe/xpaas-qe/blob/master/test-eap/src/test/java/com/redhat/xpaas/eap/xa/load/AbstractSQLXARecoveryLoadTest.java#L145). Requests are still being processed for a long time afterwards (minutes, 5 or more). I haven't found who is pooling them or why they take so long to process. If the test waits for all requests to be processed (e.g. a 10 min wait) and only then leaves recovery to make the system consistent, the test seems to run fine and stably.
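The "wait for all requests before checking recovery" idea can be sketched as follows. This is only an illustration under assumptions: it is not how `HttpWorker` or `AbstractSQLXARecoveryLoadTest` are actually implemented, the class and method names (`DrainBeforeRecoveryCheck`, `stopAndDrain`) are hypothetical, and it assumes the request senders run on a plain `java.util.concurrent.ExecutorService`. It also only covers requests still pending on the client side; if the requests are held somewhere on the server side, a different signal would be needed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names come from the xpaas-qe test suite.
class DrainBeforeRecoveryCheck {

    /**
     * Stops accepting new request tasks and blocks until the tasks already
     * submitted have finished, or until the timeout elapses.
     * Returns true only if every task completed within the timeout.
     */
    static boolean stopAndDrain(ExecutorService workers, long timeout, TimeUnit unit) {
        workers.shutdown(); // no new tasks; queued and running ones continue
        try {
            return workers.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < 20; i++) {
            workers.submit(() -> {
                try {
                    Thread.sleep(50); // stand-in for a slow HTTP request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                completed.incrementAndGet();
            });
        }

        // Only after this returns true would the recovery wait and the
        // consistency checks start.
        boolean drained = stopAndDrain(workers, 10, TimeUnit.SECONDS);
        System.out.println("drained=" + drained + ", completed=" + completed.get());
    }
}
```

If `stopAndDrain` returns false, the test could fail fast with a message that requests were still in flight, which would separate "load did not drain" from "recovery left the system inconsistent".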

is related to

CLOUD-2261 [EAP][XA][Recovery][NFS] split lock is broken after a minute of network partition

Closed

relates to

CLOUD-2519 Enhancing the transaction crash recovery tests with failures on resource managers availability after scale down

Assignee: Tomas Remes

Reporter: Ondrej Chaloupka (Inactive)

Votes: 0

Watchers: 3

Created: 2018/04/18 9:09 AM

Updated: 2021/10/24 6:01 AM

Resolved: 2018/05/16 2:37 AM
