Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: 12.0.1.Final
Component/s: None
Labels: None

Steps to Reproduce:
- set up Infinispan with distributed caches and RocksDB persistence
- add several thousand entries to the caches
- call /metrics endpoint
Release Note Text: Undefined

We're using an Infinispan cluster with 3 nodes to back our Keycloak instances. In addition, RocksDB is used to persist the cache content for disaster recovery.

When RocksDB is used for cache persistence, metrics collection on the "/metrics" endpoint takes a very long time and also causes timeouts in Keycloak when it tries to access the Infinispan cluster.

The issue becomes more pronounced as the number of cache entries grows. With 6 distributed caches (2 owners, 3 nodes) and a total of 60000 entries, we observe the following metrics collection durations:

no RocksDB: <1s
RocksDB (not segmented): 124s
RocksDB (segmented: 256): 49s
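
For anyone trying to reproduce these numbers, a single scrape can be timed with a small script along the following lines. This is only a sketch, not the method used for the measurements above: the function name `time_metrics_scrape` is hypothetical, and it assumes the endpoint is reachable over plain HTTP without authentication.

```python
import time
import urllib.request


def time_metrics_scrape(url, timeout=300.0,
                        opener=urllib.request.urlopen,
                        clock=time.monotonic):
    """Fetch `url` once and return (elapsed_seconds, body_size_in_bytes).

    `opener` and `clock` are injectable so the function can be exercised
    without a running server; by default it performs a real HTTP GET.
    """
    start = clock()
    # Read the full body so the measurement covers the complete response,
    # not just the time until the headers arrive.
    with opener(url, timeout=timeout) as response:
        body = response.read()
    return clock() - start, len(body)
```

Calling `time_metrics_scrape("http://localhost:11222/metrics")` would time one scrape against a local node; the host and port here are assumptions and need to be adjusted to the actual setup. The default timeout is deliberately generous so that slow responses such as the 124s case are measured instead of aborted.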

With even more cache entries (300000), the Infinispan cluster becomes almost unusable when the /metrics endpoint is scraped by Prometheus every 2 minutes.

Interestingly, the statistics shown in the Infinispan UI load without problems even when RocksDB is enabled. Only the /metrics endpoint causes trouble.

Please see the attached infinispan.xml file for details about the setup.


infinispan.xml
4 kB
2021/03/12 11:53 AM

is related to

ISPN-12607 Metrics degrade cluster performance

Closed

Assignee: Ryan Emerson

Reporter: Georg F (Inactive)

Votes: 0

Watchers: 1

Created: 2021/03/12 12:00 PM

Updated: 2021/03/15 4:06 AM
