• Type: Bug
    • Resolution: Done
    • Priority: Major
    • 5.9.0.Final
    • 5.8.2.Final
    • LRA
    • None

      SpecIT#acceptTest failed on CI (http://narayanaci1.eng.hst.ams2.redhat.com/job/narayana/PROFILE=MAIN,jdk=jdk8.latest,label=linux/187/) and on one of the PRs.

      ochaloup@redhat.com did some investigation:

      I experimented with the LRA tests, and normally they pass on my laptop. But, as you mentioned, it is probably a timing issue: when I slow down the network (I used tc as described at [1]), the tests start to fail [2] (I'm on the master branch).

      Could you check whether you see the same issue on your laptop? Or how should we handle it?

      Thanks
      o.

      [1] https://jvns.ca/blog/2017/04/01/slow-down-your-internet-with-tc/
      [2]
      Tests run: 23, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 489.336 sec <<< FAILURE! - in io.narayana.lra.participant.SpecIT
      timeLimitRequiredLRA(io.narayana.lra.participant.SpecIT) Time elapsed: 14.474 sec <<< FAILURE!
      java.lang.AssertionError: timeLimitRequiredLRA: compensate should have been called expected:<1> but was:<0>
      at io.narayana.lra.participant.SpecIT.timeLimitRequiredLRA(SpecIT.java:535)

      acceptTest(io.narayana.lra.participant.SpecIT) Time elapsed: 30.335 sec <<< FAILURE!
      java.lang.AssertionError: expected:<0> but was:<2>
      at io.narayana.lra.participant.SpecIT.joinAndEnd(SpecIT.java:649)
      at io.narayana.lra.participant.SpecIT.acceptTest(SpecIT.java:624)

      connectionHangup(io.narayana.lra.participant.SpecIT) Time elapsed: 35.179 sec <<< FAILURE!
      java.lang.AssertionError: connectionHangup: wrong compensation count after recovery expected:<7> but was:<6>
      at io.narayana.lra.participant.SpecIT.connectionHangup(SpecIT.java:266)

            [JBTM-3028] LRA tests can fail if the network is running slowly

            Michael Musgrove added a comment:

            The delay can prevent the recovery process from completing recovery of the participants, so the test assertions that recovery succeeded fail.

            The fix to the test is to retry the recovery pass a fixed number of times. Note that a delay on every packet is unreasonable and results in recovery never being able to succeed.
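The idea of retrying the recovery pass a fixed number of times can be sketched roughly as below. This is only an illustration and not the actual change made to SpecIT: the class `RecoveryRetry`, the method `retryRecovery`, and the `BooleanSupplier` standing in for "run one recovery pass and report whether all participants were recovered" are hypothetical names, and the attempt count and pause are arbitrary.

```java
import java.util.function.BooleanSupplier;

public class RecoveryRetry {

    /**
     * Runs the given recovery pass up to maxAttempts times, pausing between
     * attempts. Returns the 1-based attempt number on which the pass reported
     * success, or -1 if no attempt succeeded (or the thread was interrupted).
     *
     * The recoveryPass supplier is a hypothetical stand-in for whatever the
     * test uses to trigger recovery and check the outcome.
     */
    public static int retryRecovery(BooleanSupplier recoveryPass, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (recoveryPass.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt status and give up.
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated recovery pass that only succeeds on the third call.
        int attempt = retryRecovery(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println("recovered on attempt " + attempt);
    }
}
```

Bounding the number of attempts keeps the test from hanging when recovery can never succeed, which is the situation described above for a delay applied to every packet.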

            Ondrej Chaloupka (Inactive) added a comment:

            I can reproduce the failing test by running this set of commands:

            # build narayana
            cd rts/lra/lra-test
            sudo tc qdisc add dev lo root netem delay 500ms
            tc qdisc
            # (observe that the rule delaying the network is set up)
            # [root@ochaloup ochaloup]# tc qdisc
            # qdisc netem 8001: dev lo root refcnt 2 limit 1000 delay 500.0ms
            mvn verify -Dit.test=SpecIT#acceptTest
            # (observe failure)
            # Failed tests:
            #    SpecIT.acceptTest:624->joinAndEnd:649 expected:<0> but was:<2>
            sudo tc qdisc del dev lo root netem

              rhn-engineering-mmusgrov Michael Musgrove