Infinispan / ISPN-16215

purgeExpired (Postgres): batchDelete and event handling lead to timeouts and loss of the scheduled task


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 15.1.0.Dev04, 15.0.9.Final

      Use the GitHub project https://github.com/aschoerk/infinispan15cdidb

      • Call mvn clean test multiple times.
      • The test should normally
        • create 5 entries in an H2-backed cache,
        • set the timestamp to 2 hours earlier, and
        • wait until no entry is left (because of purgeExpired).

      The error seems to be related to the use of a DB connection pool. Possibly the fact that a connection pool avoids actually closing the connection at the database is what brings the problem to light.


      Using Infinispan in Keycloak 24.0.5 (Infinispan version 14.0.28.Final) leads to exceptions during purging:
       * After the batchUpdate in the current transaction, notifications are sent which lead to deleteFromAllStores. This seems to cause deletes on a separate connection and therefore a locking situation.
       * purgeExpired ends with a rollback of the transaction, which Postgres reports as the connection already being closed (this seems to be a consequence of the lock timeout handling).
       * purgeExpired is never called again after this error.
       * Our DB admins tell us that idle connections with open transactions are left behind.
       * Agroal leak detection reports after some time that connection leaks are occurring.
       * I tried to reproduce this using H2, but was not able to.

      I will provide the stack traces below. I assume that you would like some code that allows you to reproduce the problem.
      I will gladly try to provide that, but I fear that it will only work with Postgres. Please tell me how you would like this to be handled.

      Generally, I am not happy with how purgeExpired is currently implemented:
       * It opens a cursor over an unbounded number of entries.
       * It tries to delete these entries while walking along the cursor. This looks to me like walking across a bridge while removing the pillars that hold it up. I am quite convinced that not all SQL databases support that.
       * The transaction size is not limited. In my opinion, this is unnecessary here.
       * Is it really necessary to send notifications during the transaction, when it is not completely clear whether a distributed transaction situation might occur?
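
The concerns above suggest a different shape for the purge: read a bounded batch of expired keys, delete that batch in its own short transaction, and notify listeners only after the commit has succeeded. The following is a minimal sketch of that loop, not Infinispan's implementation. `ExpiredEntryStore`, `BatchedPurge` and `InMemoryStore` are hypothetical names introduced for illustration, and the SQL in the comments (including the table and column names) is only an assumption about how a JDBC-backed store might implement the two methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical abstraction over the store; not an Infinispan interface. */
interface ExpiredEntryStore {
    // A JDBC version might run something like
    //   SELECT id FROM cache_table WHERE expires_at < ? LIMIT ?
    // and close the result set before any delete is issued.
    List<String> findExpiredKeys(long now, int limit);

    // A JDBC version might issue one batched DELETE and then commit,
    // so each transaction touches at most `limit` rows.
    void deleteAndCommit(List<String> keys);
}

final class BatchedPurge {
    /** Purges expired entries in bounded batches and returns the number removed. */
    static int purge(ExpiredEntryStore store, long now, int batchSize,
                     Consumer<String> afterCommit) {
        int removed = 0;
        while (true) {
            List<String> keys = store.findExpiredKeys(now, batchSize);
            if (keys.isEmpty()) {
                return removed;
            }
            store.deleteAndCommit(keys);
            // Notifications run only after the batch is committed, so a
            // listener that touches the store cannot block on our own locks.
            keys.forEach(afterCommit);
            removed += keys.size();
        }
    }
}

/** In-memory stand-in, used only to exercise the loop without a database. */
final class InMemoryStore implements ExpiredEntryStore {
    final Map<String, Long> expiry = new ConcurrentHashMap<>();
    int transactions = 0;

    @Override
    public List<String> findExpiredKeys(long now, int limit) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Long> e : expiry.entrySet()) {
            if (keys.size() == limit) {
                break;
            }
            if (e.getValue() < now) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    @Override
    public void deleteAndCommit(List<String> keys) {
        keys.forEach(expiry::remove);
        transactions++;
    }
}
```

With five expired entries and a batch size of two, the loop performs three short transactions instead of one unbounded one. A real store would also need to decide what to do when a batch delete fails; this sketch does not cover that.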

      Assuming that a fix will not arrive as soon as we need it, I created a workaround that subclasses JdbcStringBasedStore. I will gladly provide the code if anybody is interested in it.
      However, the workaround does not currently address the loss of the scheduled task in case of errors.
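
On the lost scheduled task: if the purge is driven by a `java.util.concurrent.ScheduledExecutorService` (an assumption; the report does not say how the task is scheduled), the documented behaviour of `scheduleWithFixedDelay` would explain it, because an exception escaping one run suppresses all later runs. Below is a minimal sketch of a guard that keeps the schedule alive. `GuardedTask` and its methods are hypothetical names, and the failing task only simulates the error.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class GuardedTask {
    /** Wraps a task so that a failure in one run does not cancel later runs. */
    static Runnable guarded(Runnable task, AtomicInteger failures) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // Record and swallow: if this escaped, the executor would
                // suppress every subsequent execution of the task.
                failures.incrementAndGet();
            }
        };
    }

    /** Schedules a task that fails on its first run; returns how many runs happened. */
    static int runsDespiteFailure(int wantedRuns) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        AtomicInteger failures = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(wantedRuns);
        Runnable purge = () -> {
            int n = runs.incrementAndGet();
            done.countDown();
            if (n == 1) {
                throw new IllegalStateException("simulated lock timeout");
            }
        };
        executor.scheduleWithFixedDelay(guarded(purge, failures), 0, 10, TimeUnit.MILLISECONDS);
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdownNow();
        }
        return runs.get();
    }
}
```

Without the `guarded` wrapper, the same task would run exactly once. A production version would log the exception rather than only counting it.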

      I will add the stack trace, the Infinispan config and the Quarkus DB config as attachments.

      stacktraces.txt

            Assignee: remerson@redhat.com Ryan Emerson
            Reporter: aschoerk Andreas Schörk (Inactive)