WildFly / WFLY-20261

NPE in race condition between a thread committing a transaction and another thread performing recovery

      Steps to reproduce (the failure is intermittent):

      • Download the latest WildFly 35.0.0.Final (or build it from the WildFly main branch) and unzip it.
      • Set the JBOSS_HOME environment variable to the unzipped directory.
      • git clone -b master git@gitlab.cee.redhat.com:jbossqe-eap/tests-transactions.git
      • cd tests-transactions
      • mvn clean verify -Dtest=org.jboss.as.test.jbossts.crashrec.txpropagation.test.TxPropagationJMSCrashRecoveryTestCase#noneRaceBetweenTxManagerAndRecovery -Dsurefire.failIfNoSpecifiedTests=false -Djboss.dist=${JBOSS_HOME} -Dcategory=TxPropagTests -Dskip-download-sources -U --batch-mode -Dmaven.test.failure.ignore=true -Dsurefire.test.failure.ignore=true --fail-never -Dversion.server=35.0.0.Final -Dts.timeout.factor=150 -Djbossts.noJTA
    • Regression

      In a scenario involving a race condition between committing a transaction and periodic recovery, a NullPointerException (NPE) may occur, causing the periodic recovery process to crash.

      Test scenario:
      The test exercises a race condition between a thread committing a transaction and another thread performing recovery. When the recovery thread kicks in, the transaction has just been prepared. By the time the recovery thread starts working with the transaction record, the record has already been removed because the commit completed successfully.
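
      The interleaving described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the map-based store, the class name RecoveryRaceSketch and the methods firstPass/secondPass are hypothetical stand-ins and are not Narayana's API. The sketch shows a record that is seen in the first pass, removed by the committing thread, and then found missing in the second pass, where a null check skips it instead of dereferencing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a recovery scan over a transaction log store.
// None of these names come from Narayana.
class RecoveryRaceSketch {

    /** First pass: snapshot the uids of the records currently in the store. */
    static List<String> firstPass(Map<String, String> store) {
        return new ArrayList<>(store.keySet());
    }

    /**
     * Second pass: re-read each candidate record. A record may have been
     * removed in the meantime by a thread that finished its commit, so a
     * missing record is skipped rather than dereferenced.
     * Returns the number of records that were still present.
     */
    static int secondPass(Map<String, String> store, List<String> candidates) {
        int stillPresent = 0;
        for (String uid : candidates) {
            String record = store.get(uid);
            if (record == null) {
                // Committed and removed between the two passes: nothing to recover.
                continue;
            }
            stillPresent++;
        }
        return stillPresent;
    }

    public static void main(String[] args) {
        Map<String, String> store = new ConcurrentHashMap<>();
        store.put("tx-1", "PREPARED");
        store.put("tx-2", "PREPARED");

        List<String> candidates = firstPass(store); // recovery sees both records
        store.remove("tx-1");                       // committing thread completes tx-1

        System.out.println(secondPass(store, candidates)); // prints 1
    }
}
```

      Without the null check, the lookup for the removed record would return null and any method call on it would throw a NullPointerException, which matches the shape of the failure reported here.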

      Expected Result:
      The two-phase commit completes without any error in the log.

      Actual Result:
      There is an NPE in server.log:

      2025-01-09 09:08:18,286 ERROR [stderr] (Periodic Recovery) Exception in thread "Periodic Recovery" java.lang.NullPointerException: Cannot invoke "com.arjuna.ats.arjuna.recovery.RecoverAtomicAction.hasFailedParticipants()" because "rcvAtomicAction" is null
      2025-01-09 09:08:18,286 ERROR [stderr] (Periodic Recovery)      at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.processTransactionsStatus(AtomicActionRecoveryModule.java:238)
      2025-01-09 09:08:18,286 ERROR [stderr] (Periodic Recovery)      at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.periodicWorkSecondPass(AtomicActionRecoveryModule.java:83)
      2025-01-09 09:08:18,286 ERROR [stderr] (Periodic Recovery)      at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:830)
      2025-01-09 09:08:18,286 ERROR [stderr] (Periodic Recovery)      at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:386)
      

      which causes Periodic Recovery to crash.
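
      The "Exception in thread ..." prefix in the log is what the JVM prints when an exception escapes a thread's run method, after which that thread terminates. The following Java sketch illustrates that general behaviour only; the class RecoveryThreadCrashSketch and its methods are hypothetical and do not reflect how PeriodicRecovery is implemented. It compares a worker loop that lets the exception escape with one that guards each pass.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of an uncaught exception ending a worker thread.
// Not Narayana code.
class RecoveryThreadCrashSketch {

    /** Simulated scan: the first pass hits a record that has vanished. */
    static void scan(int pass) {
        if (pass == 0) {
            throw new NullPointerException("record vanished");
        }
    }

    /**
     * Runs up to `passes` scans on a worker thread and returns how many
     * scans were attempted before the thread ended.
     */
    static int runPasses(int passes, boolean guardEachPass) throws InterruptedException {
        AtomicInteger attempted = new AtomicInteger();
        Thread worker = new Thread(() -> {
            for (int i = 0; i < passes; i++) {
                attempted.incrementAndGet();
                if (guardEachPass) {
                    try {
                        scan(i);
                    } catch (RuntimeException e) {
                        // A real implementation would log here; the loop continues.
                    }
                } else {
                    scan(i); // an exception here escapes run() and ends the thread
                }
            }
        }, "Periodic Recovery (sketch)");
        // Suppress the default stderr stack trace to keep the demo output clean.
        worker.setUncaughtExceptionHandler((t, e) -> { });
        worker.start();
        worker.join();
        return attempted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runPasses(3, false)); // prints 1: thread died on the first pass
        System.out.println(runPasses(3, true));  // prints 3: all passes were attempted
    }
}
```

      In the unguarded case no further passes run once the exception is thrown, which is the practical meaning of the recovery thread crashing.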

      I bisected the issue to the following Narayana commit, which introduced it:
      https://github.com/jbosstm/narayana/commit/71abbb0334f5742f968e49405e5fab3069a53833

              jfinelli@redhat.com Manuel Finelli
              mnovak1@redhat.com Miroslav Novak