  1. JBoss A-MQ
  2. ENTMQ-391

Add keep alive component to file locker

    • Type: Story
    • Resolution: Done
    • Priority: Major
    • JBoss A-MQ 6.1
    • JBoss A-MQ 6.0
    • None
    • None

      We encountered an issue on NFSv4 with a master/slave configuration in which both the slave and the master could obtain the lock. After extensive research with NetApp and RHEL support, we determined that the following events occurred:

        1. The master locks the file and does no further I/O to it; it is passive with respect to the lock.
        2. The slave asks every 10 seconds whether it can get the lock.
        3. NFS comes back and says no, someone else has it.
        4. NFS dies ungracefully. NFSv4 is stateful and has no callback for locks; it has a 30-second grace period to let all clients that held locks reclaim them.
        5. The master does not realize it needs to reclaim the lock and continues under the assumption that it still holds it.
        6. After the 30-second grace period, the slave asks for the lock and receives it.

      After talking to Gary, it should not be too much trouble to extend our locker class to implement a keepalive feature (org.apache.activemq.broker.AbstractLocker#keepAlive [1]) so that the master checks every X seconds whether the lock is still valid. We already have an extension in SharedFileLocker, which we can use to implement a check based on java.nio.channels.FileLock#isValid [2].

      If the lock is invalid, we will need to try to reclaim it. I'm thinking we can probably just loop back around to the original logic that acquires the lock.

      I think this would be a good safeguard, if nothing else, to add to the product.

      [1] http://activemq.apache.org/maven/apidocs/org/apache/activemq/broker/AbstractLocker.html#keepAlive()
      [2] http://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileLock.html
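
      The proposal above could be sketched roughly as follows. This is only an illustration built on plain JDK classes, not the SharedFileLocker code: the class name KeepAliveFileLock and its methods tryAcquire, keepAlive and isHeld are hypothetical. It shows a keepAlive() that checks FileLock#isValid and, if the lock is no longer valid, loops back to the same logic that acquired it in the first place.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

/** Hypothetical helper for illustration; not part of ActiveMQ. */
final class KeepAliveFileLock implements AutoCloseable {
    private final File file;
    private RandomAccessFile raf;
    private FileLock lock;

    KeepAliveFileLock(File file) {
        this.file = file;
    }

    /** Tries once to take the exclusive lock; returns true if it is now held. */
    synchronized boolean tryAcquire() throws IOException {
        if (isHeld()) {
            return true;
        }
        releaseQuietly(); // drop any stale channel before retrying
        raf = new RandomAccessFile(file, "rw");
        try {
            lock = raf.getChannel().tryLock(); // null if another process holds it
        } catch (OverlappingFileLockException e) {
            lock = null; // already locked elsewhere in this JVM
        }
        if (lock == null) {
            releaseQuietly();
            return false;
        }
        return true;
    }

    /** Periodic check: true if the lock is still valid or could be reclaimed. */
    synchronized boolean keepAlive() {
        if (isHeld()) {
            return true;
        }
        try {
            return tryAcquire(); // loop back to the original acquire logic
        } catch (IOException e) {
            return false; // caller decides what to do, e.g. stop acting as master
        }
    }

    synchronized boolean isHeld() {
        return lock != null && lock.isValid();
    }

    @Override
    public synchronized void close() {
        releaseQuietly();
    }

    private void releaseQuietly() {
        try {
            if (lock != null && lock.isValid()) {
                lock.release();
            }
        } catch (IOException ignored) {
            // best effort
        }
        try {
            if (raf != null) {
                raf.close();
            }
        } catch (IOException ignored) {
            // best effort
        }
        lock = null;
        raf = null;
    }
}
```

      A caller would invoke keepAlive() every X seconds and treat a false result as "no longer master". Note that, per its Javadoc, FileLock#isValid reports whether the lock has been released or its channel closed; whether that is enough to notice a lock lost on the NFS server side is not established here and would need to be verified.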

        1. strace.tiff
          139 kB
          Susan Javurek

              gtully@redhat.com Gary Tully
              rhn-support-sjavurek Susan Javurek
              Votes: 0
              Watchers: 2