WildFly WIP / WFWIP-205

Transaction recovery intermittently fails after JVM crash

    • Bug
    • Resolution: Done
    • Blocker
    • OpenShift
    • None

      The issue can be reproduced with the following test:

      git clone git@gitlab.mw.lab.eng.bos.redhat.com:msimka/openshift-eap-tests.git
      cd openshift-eap-tests
      git checkout EAP7-1192_scaledown
      # create the file test.properties (contents below)
      cd test-eap
      # run multiple times, the failure is intermittent
      mvn clean test -P72 -Dtest=EjbTxnRemotingCrashRecTest#testTxStatelessServerSecondCommitJvmHalt -Dcheckstyle.skip -Dconsole-log-level=DEBUG
      

      test.properties

      xtf.openshift.url=https://master.all-in-one-msimka-002.dynamic.xpaas:8443
      xtf.openshift.namespace=wip-namespace
      xtf.bm.namespace=wip-builds-namespace
      
      xtf.eap.72.image=docker-registry.engineering.redhat.com/ochaloup/wildfly18-snapshot:190909-d4ddf04cc2-wfcore-10.0.0.Beta7-SNAPSHOT
      xtf.eap.72.properties.eap.imagestream.name=jboss-eap73-openshift
      
      xtf.eap.72.version=7.2.0.GA
      xtf.eap.properties.location=/opt/eap
      xtf.eap.72.templates.repo=git://github.com/jboss-container-images/jboss-eap-7-openshift-image.git,git://github.com/jboss-container-images/redhat-sso-7-openshift-image.git
      xtf.eap.72.templates.branch=eap72,v7.3.0.GA
      
      xtf.maven.proxy.url=http://maven.all-in-one-043.dynamic.xpaas/nexus/content/groups/public/
      
      # this might be needed if oc login command fails, see test suite logs
      #xtf.openshift.binary.path=/home/msimka/.minishift/cache/oc/v3.11.0/linux/oc
      xtf.operator.image=docker-registry.engineering.redhat.com/jbossqe-eap/wildfly-operator:EAP7-1192-txn-recovery-issue70
      
      # specify the correct path to the oc binary
      xtf.openshift.binary.path=<oc_binary_path>
      

      Note: master.all-in-one-msimka-002.dynamic.xpaas only works when the DNS server 10.0.144.45 is used.
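
Because the failure is intermittent, it can help to script the repeated runs instead of starting them by hand. The following is a minimal sketch and is not part of the test suite: the function name `run_repeatedly` and the `LOG_DIR` variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command several times and count failed runs.
# Usage: run_repeatedly <runs> <command> [args...]
run_repeatedly() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Keep the output of each run in its own log file.
        # A non-zero exit status counts as a failed run.
        if ! "$@" >"${LOG_DIR:-.}/run-$i.log" 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures of $runs runs failed"
    # The function itself fails if any run failed.
    [ "$failures" -eq 0 ]
}

# Example call from the test-eap directory (10 runs is an arbitrary choice):
# run_repeatedly 10 mvn clean test -P72 \
#     '-Dtest=EjbTxnRemotingCrashRecTest#testTxStatelessServerSecondCommitJvmHalt' \
#     -Dcheckstyle.skip -Dconsole-log-level=DEBUG
```

Keeping one log file per run makes it easier to compare a failing run with a passing one afterwards.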

    • High

      While testing transaction recovery in OpenShift, I see that recovery after a JVM crash intermittently fails.

      Scenario:

      EJB client (app tx-client, pod tx-client-0):

      • EJB business method
        • lookup remote EJB
        • enlist XA resource 1 to transaction
        • enlist XA resource 2 to transaction
        • call remote EJB

      EJB server (app tx-server, pod tx-server-0):

      • EJB business method
        • enlist XA resource 1 to transaction
        • enlist XA resource 2 to transaction

      XA resource 2 on the EJB server crashes the JVM in its commit method.

      The test waits until the crashed pod has restarted, forces periodic recovery twice, and then checks that the transaction log store is empty. Intermittently, the log store is not empty.
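
The report does not show how the test inspects the log store. One way to look at it by hand is through the WildFly management CLI inside the pod, sketched below under several assumptions: the server is installed under /opt/eap (taken from xtf.eap.properties.location), `jboss-cli.sh` can connect to the local server without extra credentials, and the transactions subsystem uses the default `log-store=log-store` resource. The names `list_pending_transactions`, `OC_BIN`, and `JBOSS_HOME_IN_POD` are hypothetical. Forcing a recovery scan is not covered here.

```shell
#!/bin/sh
# Path to the oc binary (compare xtf.openshift.binary.path in test.properties).
OC_BIN=${OC_BIN:-oc}
# Assumed server location inside the pod.
JBOSS_HOME_IN_POD=${JBOSS_HOME_IN_POD:-/opt/eap}

# Hypothetical helper: refresh the log-store view of a pod and list the
# transaction records that are still present.
# Usage: list_pending_transactions <pod>
list_pending_transactions() {
    pod=$1
    # probe() re-reads the object store; read-children-names then lists
    # the transaction records known to the log store.
    "$OC_BIN" exec "$pod" -- "$JBOSS_HOME_IN_POD/bin/jboss-cli.sh" --connect \
        --commands="/subsystem=transactions/log-store=log-store:probe(),/subsystem=transactions/log-store=log-store:read-children-names(child-type=transactions)"
}

# Example: list_pending_transactions tx-server-0
```

An empty result list from the second operation would correspond to the state the test expects after recovery.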

      Attached are logs from client and server pods.

      The failure can apparently be partially mitigated by clearing the OpenShift namespace before the test (oc delete all --all), but that only makes it less frequent.
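
The cleanup mentioned above can be wrapped in a small function so that it runs against an explicit namespace before each test run. This is a sketch only: `clean_namespace` and `OC_BIN` are hypothetical names, and wip-namespace is the value of xtf.openshift.namespace from test.properties.

```shell
#!/bin/sh
# Path to the oc binary (compare xtf.openshift.binary.path in test.properties).
OC_BIN=${OC_BIN:-oc}

# Hypothetical helper: delete every resource in the "all" category
# in the given namespace.
# Usage: clean_namespace <namespace>
clean_namespace() {
    namespace=$1
    "$OC_BIN" delete all --all -n "$namespace"
}

# Example: clean_namespace wip-namespace
```

Note that the `all` category does not include persistent volume claims, secrets, or config maps, so anything stored in those resources survives this cleanup.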

        1. wildfly-operator-668fd79fb5-8chs8.log (24 kB)
        2. tx-server-1.log (208 kB)
        3. tx-server-0.log (277 kB)
        4. tx-client-0.log (286 kB)

              ochaloup@redhat.com Ondrej Chaloupka (Inactive)
              msimka@redhat.com Martin Simka
              Martin Simka