Type: Bug
Resolution: Done
Priority: Critical
Fix Version: 21.0.0.Final
Workaround: Workaround Exists
Environment:
- EAP 7.3.1 (on OpenShift 3.11)
- 2-node cluster
- replicated-cache (non-transactional, without locking) / FINE (ATTRIBUTE) replication-granularity; the granularity is configured in the distributable-web subsystem (see the sketch after this list)

<cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan">
    <transport lock-timeout="60000"/>
    <replicated-cache name="repl">
        <file-store/>
    </replicated-cache>
    ...
</cache-container>
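For completeness, the ATTRIBUTE granularity itself is not set on the cache-container but on the session management definition. A minimal sketch of what this looks like in the EAP 7.3 distributable-web subsystem (element names follow that schema; the management name and affinity shown are assumptions, not values from the customer's configuration):

<subsystem xmlns="urn:jboss:domain:distributable-web:2.0" default-session-management="default">
    <!-- granularity="ATTRIBUTE" selects fine-grained, per-attribute replication -->
    <infinispan-session-management name="default" cache-container="web" granularity="ATTRIBUTE">
        <primary-owner-affinity/>
    </infinispan-session-management>
</subsystem>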
We have a customer who is hitting the problem described in the summary during a performance test. It resulted in frequent full GC cycles, and they faced node restarts due to liveness probe failures caused by long GC pauses.
From the heap dump, we observed that org.infinispan.container.impl.DefaultDataContainer consumed most of the Java heap, and that many session attribute keys (org.wildfly.clustering.web.infinispan.session.fine.SessionAttributeKey) remain in the java.util.concurrent.ConcurrentHashMap$Node entries of the DefaultDataContainer.
Class Name | Shallow Heap | Retained Heap
----------------------------------------------------------------------------------------------------------------
org.infinispan.cache.impl.CacheImpl @ 0x73448fe80 | 136 | 216
|- dataContainer org.infinispan.container.impl.DefaultDataContainer @ 0x7344e0760| 56 | 2,312,535,136
----------------------------------------------------------------------------------------------------------------
Class Name | Shallow Heap | Retained Heap | Percentage
--------------------------------------------------------------------------------------------------------------------------------------------
org.infinispan.container.impl.DefaultDataContainer @ 0x7344e0760 | 56 | 2,312,535,136 | 94.50%
|- java.util.concurrent.ConcurrentHashMap @ 0x73451bf38 | 64 | 2,312,534,992 | 94.50%
| |- java.util.concurrent.ConcurrentHashMap$Node[262144] @ 0x7a14a8dc0 | 1,048,592 | 2,312,534,840 | 94.50%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7b78ec4d8 | 32 | 37,112,616 | 1.52%
| | | |- org.infinispan.container.entries.ImmortalCacheEntry @ 0x7b78ec4c0 | 24 | 37,112,560 | 1.52%
| | | |- org.wildfly.clustering.web.infinispan.session.fine.SessionAttributeKey @ 0x7b78fb778| 24 | 24 | 0.00%
| | | '- Total: 2 entries | | |
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7ae9890a0 | 32 | 30,468,664 | 1.25%
| | | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7b87555a8 | 32 | 30,437,152 | 1.24%
| | | |- org.infinispan.container.entries.ImmortalCacheEntry @ 0x7ae989088 | 24 | 31,288 | 0.00%
| | | |- org.wildfly.clustering.web.infinispan.session.fine.SessionAttributeKey @ 0x7ae981268| 24 | 192 | 0.00%
| | | '- Total: 3 entries | | |
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7b895fd10 | 32 | 30,437,152 | 1.24%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7b2ae44d8 | 32 | 17,829,312 | 0.73%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7a17dd730 | 32 | 15,605,776 | 0.64%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7bb547cf0 | 32 | 7,160,408 | 0.29%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7c01fddc0 | 32 | 6,673,264 | 0.27%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x79e5cd4e8 | 32 | 3,770,088 | 0.15%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7bfda04c8 | 32 | 2,561,448 | 0.10%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7a180f878 | 32 | 164,592 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x76bebc9d8 | 32 | 163,112 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7a173ea50 | 32 | 162,152 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x796c33a38 | 32 | 133,048 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x740c2cf20 | 32 | 132,240 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x75019c790 | 32 | 132,104 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x776181800 | 32 | 131,960 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x76bf19e30 | 32 | 131,680 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x756d67ff0 | 32 | 131,360 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7a1840158 | 32 | 131,304 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x76bf24e50 | 32 | 131,224 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x7405a7228 | 32 | 131,200 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x74379b930 | 32 | 131,176 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x750fbd440 | 32 | 130,960 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x748b6a050 | 32 | 130,928 | 0.01%
| | |- java.util.concurrent.ConcurrentHashMap$Node @ 0x76a61c730 | 32 | 130,624 | 0.01%
| | '- Total: 25 of 190,682 entries; 190,657 more | | |
In other words, the data container retains roughly 2.3 GB (94.5% of the heap) across 190,682 entries, yet only 9 active session keys are found via OQL:
select * from org.wildfly.clustering.web.infinispan.session.SessionCreationMetaDataKey
-> 9 entries
select * from org.wildfly.clustering.web.infinispan.session.SessionAccessMetaDataKey
-> 9 entries
However, far more session attribute keys remain than expected:
select * from org.wildfly.clustering.web.infinispan.session.fine.SessionAttributeKey
-> 119,741 entries
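To check whether these keys are orphaned, their contents can be dumped and compared against the 9 live session ids above, for example with Eclipse MAT's built-in toString() OQL function (this assumes the key's toString() exposes its session id, which may not hold for every version of the class):

SELECT toString(k) FROM org.wildfly.clustering.web.infinispan.session.fine.SessionAttributeKey k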
After internal investigation, we found this is a thread-safety problem in org.wildfly.clustering.web.cache.session.fine.FineSessionAttributes. The new request invalidates the session and correctly removes its attributes from the cache, but the previous request is still running its mutators in close() against a cached copy of the mutations list, and adds the attributes back. Since the session metadata is already gone at that point, nothing ever expires or removes the re-added entries, which results in a memory leak.
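A minimal sketch of the race, under the assumption that close() replays a previously captured snapshot of mutations without re-checking whether the session still exists (class and method names here are illustrative, not the actual FineSessionAttributes code):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: models the interleaving, not the real WildFly classes.
public class FineSessionAttributesRaceSketch {

    // Stands in for the replicated Infinispan cache holding per-attribute entries.
    static final Map<String, Object> cache = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        // Request A writes an attribute; the write is buffered as a "mutation"
        // to be flushed to the cache when the request completes (close()).
        Map<String, Object> bufferedMutations = new ConcurrentHashMap<>();
        bufferedMutations.put("session1#attr", "value");
        cache.putAll(bufferedMutations);

        // Request B invalidates the session: all of its attribute entries
        // are removed from the cache. This is the correct behavior.
        cache.keySet().removeIf(key -> key.startsWith("session1#"));

        // Request A now finishes and runs close(), replaying its buffered
        // mutations. Since it does not re-check that the session still
        // exists, the removed entry is silently re-inserted.
        cache.putAll(bufferedMutations);

        // The entry is orphaned: no session metadata references session1
        // any more, so nothing will ever expire or remove it.
        System.out.println("cache after invalidation: " + cache); // {session1#attr=value}
    }
}

In the real failure the interleaving happens across concurrent request threads; the sequential version above just makes the ordering explicit.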
Clones:
- JBEAP-20396 [GSS](7.3.z) Many SessionAttributeKey objects remain in org.infinispan.container.impl.DefaultDataContainer even after session invalidation or expiration (Closed)