WildFly / WFLY-10182

Persistent EJB timers are unstable with Oracle DB. ORA-08177: can't serialize access for this transaction

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 14.0.0.Beta2
    • Affects Version/s: 10.1.0.Final, 11.0.0.Final, 12.0.0.Final
    • Component/s: EJB
    • Labels: None

    Description

      The main symptom is an exception in the logs:

      ORA-08177: can't serialize access for this transaction
      

      When one cluster node is halted (the one on which the timer is actually running), the other nodes still can't pick up this timer from the DB because of the exception above. An even worse scenario: all nodes are running fine, but occasionally every node starts to decline the timer because it is in the state IN_TIMEOUT. It really is in that state in the DB, simply because the last invocation of the timer failed to commit its final state after processing, for the same reason (the SQL exception).
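      To react to this condition in code rather than only in the logs, a client can recognize the error by its vendor code. The following is a minimal sketch, not taken from WildFly or from the attached patch: the class name `Ora08177` and the method `isSerializationFailure` are hypothetical, and it assumes the Oracle JDBC driver reports ORA-08177 through `SQLException.getErrorCode()` as 8177.

```java
import java.sql.SQLException;

// Hypothetical helper, for illustration only.
final class Ora08177 {

    // Assumption: the Oracle JDBC driver exposes ORA-08177 as vendor code 8177.
    static final int ORA_08177 = 8177;

    // Returns true if the throwable, one of its causes, or any chained
    // SQLException carries the ORA-08177 vendor code.
    static boolean isSerializationFailure(Throwable t) {
        // The depth limit protects against a cyclic cause chain.
        for (int depth = 0; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof SQLException) {
                for (SQLException e = (SQLException) t; e != null; e = e.getNextException()) {
                    if (e.getErrorCode() == ORA_08177) {
                        return true;
                    }
                }
            }
        }
        return false;
    }
}
```

      A caller could use such a check to decide whether a failed timer update should be logged as a serialization conflict instead of a generic persistence error.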

      So why is Oracle DB so special? Because it never manages transactions by itself; that is entirely the client's responsibility.

      Looking into the code of the class

      org.jboss.as.ejb3.timerservice.persistence.database.DatabaseTimerPersistence
      

      I didn't find any transaction management, except in one method, and even there it probably doesn't manage the transaction as intended, because the resource does not take part in the transaction.

      After making some changes to enable transactions for that functionality, I managed to stabilize the EJB timer logic on the server. Please have a look at the attached patch (it is for version 11, but it also works for 12).
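      The patch itself is not reproduced here. Purely as an illustration of what explicit transaction boundaries around a timer state update can look like, here is a sketch in plain JDBC. Everything named in it is hypothetical: the class `TimerStateTx`, the interface `SqlWork`, the methods `inTransaction` and `markState`, and the table `DEMO_TIMER` with its columns. The real `DatabaseTimerPersistence` runs inside the application server and may rely on the container's transaction manager rather than on `setAutoCommit`, so treat this as a sketch under those assumptions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical illustration, not the code of the attached patch.
final class TimerStateTx {

    @FunctionalInterface
    interface SqlWork<T> {
        T run(Connection c) throws SQLException;
    }

    // Runs the work in one explicit transaction: commit on success,
    // rollback on any SQLException or RuntimeException.
    static <T> T inTransaction(Connection c, SqlWork<T> work) throws SQLException {
        boolean previous = c.getAutoCommit();
        c.setAutoCommit(false);
        try {
            T result = work.run(c);
            c.commit();
            return result;
        } catch (SQLException | RuntimeException e) {
            try {
                c.rollback();
            } catch (SQLException rollbackFailure) {
                e.addSuppressed(rollbackFailure);
            }
            throw e;
        } finally {
            // Restore the mode the caller handed the connection over in.
            c.setAutoCommit(previous);
        }
    }

    // Moves a timer from one state to another; returns true only if exactly
    // one row matched, i.e. this caller won the update.
    // DEMO_TIMER, ID and STATE are made-up names.
    static boolean markState(Connection c, String timerId, String expected, String next)
            throws SQLException {
        String sql = "UPDATE DEMO_TIMER SET STATE = ? WHERE ID = ? AND STATE = ?";
        return inTransaction(c, conn -> {
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, next);
                ps.setString(2, timerId);
                ps.setString(3, expected);
                return ps.executeUpdate() == 1;
            }
        });
    }
}
```

      With boundaries like these, a state change such as leaving IN_TIMEOUT is either committed as a whole or rolled back, rather than depending on whatever commit behavior the connection happens to have.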

            People

              Assignee: Unassigned
              Reporter: Maxim Karavaev (maxoid_jira, Inactive)
              Votes: 1
              Watchers: 3
