WildFly / WFLY-13998

Fix handling of mutable distributable session attributes across concurrent requests when backing cache is non-transactional

Details

    • See README.md in the attached reproducer.zip
    • Workaround Exists
    • There are a few workarounds, though not all are complete and not all are acceptable for every deployment:

      • Declare attributes to be immutable (via annotation or via the distributable-web deployment descriptor) wherever possible.
      • Use a transactional cache.
      • Use SESSION granularity.
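
The first workaround is only safe when the attribute class really never changes after it is stored. The following is a minimal sketch of such a class, written under assumptions: `CartSnapshot` and its methods are hypothetical names used for illustration and are not part of the reproducer or of WildFly. Changes produce a new instance, which the application then stores again with `setAttribute`, instead of mutating the stored object in place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical session attribute type; not taken from the reproducer.
final class CartSnapshot implements java.io.Serializable {
    private static final long serialVersionUID = 1L;

    // Unmodifiable copy, so no caller can change the stored state.
    private final List<String> items;

    CartSnapshot(List<String> items) {
        this.items = List.copyOf(items);
    }

    List<String> items() {
        return items;
    }

    // Returns a new snapshot instead of mutating this one.
    CartSnapshot withItem(String item) {
        List<String> copy = new ArrayList<>(items);
        copy.add(item);
        return new CartSnapshot(copy);
    }

    public static void main(String[] args) {
        CartSnapshot before = new CartSnapshot(List.of("pen"));
        CartSnapshot after = before.withItem("book");
        // In a servlet this would be followed by:
        // session.setAttribute("cart", after);
        System.out.println(before.items() + " -> " + after.items());
    }
}
```

With a class like this, every change goes through an explicit `setAttribute` call, so the container has no in-place mutations to write back when the request completes.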

    Description

      Environment:

      • EAP 7.3.1 (on OpenShift 3.11)
      • 2-node cluster
      • replicated-cache (non-transactional, no locking) with fine (ATTRIBUTE) replication granularity
                    <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan">
                        <transport lock-timeout="60000"/>
                        <replicated-cache name="repl">
                            <file-store/>
                        </replicated-cache>
                        ...
                    </cache-container>
        

      A customer is hitting this issue during performance testing. It caused frequent full GCs, and a long GC pause made the liveness probe fail, which in turn restarted the node.

      In the heap dump, org.infinispan.container.impl.DefaultDataContainer retains most of the Java heap, and many session attribute entries (keyed by org.wildfly.clustering.web.infinispan.session.fine.SessionAttributeKey) remain in the java.util.concurrent.ConcurrentHashMap$Node table of the DefaultDataContainer.

      Class Name                                                                       | Shallow Heap | Retained Heap
      ----------------------------------------------------------------------------------------------------------------
      org.infinispan.cache.impl.CacheImpl @ 0x73448fe80                                |          136 |           216
      |- dataContainer org.infinispan.container.impl.DefaultDataContainer @ 0x7344e0760|           56 | 2,312,535,136
      ----------------------------------------------------------------------------------------------------------------
      
      Class Name                                                                                      | Shallow Heap | Retained Heap | Percentage
      --------------------------------------------------------------------------------------------------------------------------------------------
      org.infinispan.container.impl.DefaultDataContainer @ 0x7344e0760                                |           56 | 2,312,535,136 |     94.50%
      |- java.util.concurrent.ConcurrentHashMap @ 0x73451bf38                                         |           64 | 2,312,534,992 |     94.50%
      |  |- java.util.concurrent.ConcurrentHashMap$Node[262144] @ 0x7a14a8dc0                         |    1,048,592 | 2,312,534,840 |     94.50%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7b78ec4d8                              |           32 |    37,112,616 |      1.52%
      |  |  |  |- org.infinispan.container.entries.ImmortalCacheEntry @ 0x7b78ec4c0                   |           24 |    37,112,560 |      1.52%
      |  |  |  |- org.wildfly.clustering.web.infinispan.session.fine.SessionAttributeKey @ 0x7b78fb778|           24 |            24 |      0.00%
      |  |  |  '- Total: 2 entries                                                                    |              |               |           
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7ae9890a0                              |           32 |    30,468,664 |      1.25%
      |  |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7b87555a8                           |           32 |    30,437,152 |      1.24%
      |  |  |  |- org.infinispan.container.entries.ImmortalCacheEntry @ 0x7ae989088                   |           24 |        31,288 |      0.00%
      |  |  |  |- org.wildfly.clustering.web.infinispan.session.fine.SessionAttributeKey @ 0x7ae981268|           24 |           192 |      0.00%
      |  |  |  '- Total: 3 entries                                                                    |              |               |           
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7b895fd10                              |           32 |    30,437,152 |      1.24%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7b2ae44d8                              |           32 |    17,829,312 |      0.73%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7a17dd730                              |           32 |    15,605,776 |      0.64%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7bb547cf0                              |           32 |     7,160,408 |      0.29%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7c01fddc0                              |           32 |     6,673,264 |      0.27%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x79e5cd4e8                              |           32 |     3,770,088 |      0.15%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7bfda04c8                              |           32 |     2,561,448 |      0.10%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7a180f878                              |           32 |       164,592 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x76bebc9d8                              |           32 |       163,112 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7a173ea50                              |           32 |       162,152 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x796c33a38                              |           32 |       133,048 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x740c2cf20                              |           32 |       132,240 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x75019c790                              |           32 |       132,104 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x776181800                              |           32 |       131,960 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x76bf19e30                              |           32 |       131,680 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x756d67ff0                              |           32 |       131,360 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7a1840158                              |           32 |       131,304 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x76bf24e50                              |           32 |       131,224 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7405a7228                              |           32 |       131,200 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x74379b930                              |           32 |       131,176 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x750fbd440                              |           32 |       130,960 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x748b6a050                              |           32 |       130,928 |      0.01%
      |  |  |- java.util.concurrent.ConcurrentHashMap$Node @ 0x76a61c730                              |           32 |       130,624 |      0.01%
      |  |  '- Total: 25 of 190,682 entries; 190,657 more                                             |              |               |           
      

      The following OQL queries show only 9 active sessions:

      select * from org.wildfly.clustering.web.infinispan.session.SessionCreationMetaDataKey
      -> 9 entries
      select * from org.wildfly.clustering.web.infinispan.session.SessionAccessMetaDataKey
      -> 9 entries

      However, far more session attributes remain than expected:

      select * from org.wildfly.clustering.web.infinispan.session.fine.SessionAttributeKey
      -> 119,741 entries


      After internal investigation, we found that this is a thread-safety problem in org.wildfly.clustering.web.cache.session.fine.FineSessionAttributes. A newer request invalidates the session and correctly removes its attributes from the cache. However, an earlier request that is still running then executes its mutators in close() against a cached copy of the mutations list and writes the attributes back into the cache. The re-added entries belong to a session that no longer exists, so they accumulate, which results in a memory leak.
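
The interleaving described above can be sketched as follows. This is a simplified, single-threaded model under stated assumptions, not WildFly code: `Cache`, `RequestView`, `invalidate`, and the `checkValidity` guard are hypothetical stand-ins, and the two requests are interleaved by hand so the result is deterministic. The guard only illustrates the idea of skipping the write-back for an invalidated session; it is not the fix that was applied upstream.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the race; none of these types are WildFly classes.
final class StaleMutatorDemo {

    // Stand-in for the cache; keys have the form "sessionId/attributeName".
    static final class Cache {
        final Map<String, Object> entries = new ConcurrentHashMap<>();
    }

    // Stand-in for one request's view of the session attributes.
    static final class RequestView implements AutoCloseable {
        private final Cache cache;
        private final String sessionId;
        private final AtomicBoolean sessionValid;
        private final boolean checkValidity;
        // Mutations recorded during the request and replayed in close().
        private final Map<String, Object> pending = new HashMap<>();

        RequestView(Cache cache, String sessionId, AtomicBoolean sessionValid, boolean checkValidity) {
            this.cache = cache;
            this.sessionId = sessionId;
            this.sessionValid = sessionValid;
            this.checkValidity = checkValidity;
        }

        void mutate(String name, Object value) {
            pending.put(name, value);
        }

        @Override
        public void close() {
            // Hypothetical guard: skip the write-back if the session is gone.
            if (checkValidity && !sessionValid.get()) {
                return;
            }
            pending.forEach((name, value) -> cache.entries.put(sessionId + "/" + name, value));
        }
    }

    // Models the newer request: marks the session invalid and removes its attributes.
    static void invalidate(Cache cache, String sessionId, AtomicBoolean sessionValid) {
        sessionValid.set(false);
        cache.entries.keySet().removeIf(key -> key.startsWith(sessionId + "/"));
    }

    // Replays the interleaving and returns the number of entries left in the cache.
    static int entriesAfterRace(boolean checkValidity) {
        Cache cache = new Cache();
        AtomicBoolean valid = new AtomicBoolean(true);
        cache.entries.put("s1/cart", "initial");

        RequestView earlier = new RequestView(cache, "s1", valid, checkValidity);
        earlier.mutate("cart", "updated");  // earlier request touches a mutable attribute
        invalidate(cache, "s1", valid);     // newer request invalidates the session
        earlier.close();                    // earlier request finishes and replays its mutations
        return cache.entries.size();
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + entriesAfterRace(false)); // prints "unguarded: 1"
        System.out.println("guarded: " + entriesAfterRace(true));    // prints "guarded: 0"
    }
}
```

In the unguarded run, one attribute entry survives for a session that no longer exists, which matches the orphaned SessionAttributeKey entries seen in the heap dump. With truly concurrent requests, the check and the write in `close()` would also have to be atomic with respect to invalidation (for example via a lock or a transaction), which is presumably why a transactional cache is listed among the workarounds.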

            People

              pferraro@redhat.com Paul Ferraro