  1. Infinispan
  2. ISPN-12815

Infinispan node crashes after EXCEPTION_ACCESS_VIOLATION from RocksDB JNI

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Blocker
    • Fix Version/s: 12.1.0.Final
    • Affects Version/s: 12.0.1.Final
    • None
    • None

    Description

      Hello,

      We have a cache with the following statistics and configuration:
       

      {
      	"stats": {
      		"time_since_start": 71890,
      		"time_since_reset": 71890,
      		"current_number_of_entries": 2508843,
      		"current_number_of_entries_in_memory": 1000,
      		"total_number_of_entries": 1244610,
      		"off_heap_memory_used": 0,
      		"data_memory_used": 0,
      		"stores": 1244610,
      		"retrievals": 1244209,
      		"hits": 4,
      		"misses": 1244205,
      		"remove_hits": 0,
      		"remove_misses": 0,
      		"evictions": 2507801,
      		"average_read_time": 1,
      		"average_read_time_nanos": 1592374,
      		"average_write_time": 0,
      		"average_write_time_nanos": 0,
      		"average_remove_time": 0,
      		"average_remove_time_nanos": 0,
      		"required_minimum_number_of_nodes": 7
      	},
      	"size": 12422813,
      	"configuration": {
      		"distributed-cache": {
      			"mode": "SYNC",
      			"remote-timeout": 120000,
      			"encoding": {
      				"key": {
      					"media-type": "text/plain"
      				},
      				"value": {
      					"media-type": "application/json"
      				}
      			},
      			"transaction": {
      				"locking": "OPTIMISTIC",
      				"mode": "BATCH"
      			},
      			"memory": {
      				"storage": "BINARY",
      				"max-count": 1000,
      				"when-full": "REMOVE"
      			},
      			"persistence": {
      				"rocksdb-store": {
      					"preload": true,
      					"segmented": false,
      					"path": "__REDACTED__",
      					"compression-type": "LZ4HC",
      					"expiration": {
      						"path": "__REDACTED__"
      					}
      				}
      			},
      			"locking": {
      				"concurrency-level": 512,
      				"isolation": "READ_COMMITTED",
      				"acquire-timeout": 30000,
      				"striping": false
      			},
      			"statistics": true
      		}
      	},
      	"rehash_in_progress": false,
      	"bounded": true,
      	"indexed": false,
      	"persistent": true,
      	"transactional": true,
      	"secured": false,
      	"has_remote_backup": false,
      	"indexing_in_progress": false,
      	"statistics": true,
      	"queryable": false
      }
      
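      The raw counters in the "stats" object are easier to read as ratios. The following is a minimal sketch that derives a few of them; the figures are copied from the JSON above, while the helper name summarize and the output keys are illustrative assumptions, not part of any Infinispan API.

```python
# Figures copied from the "stats" object in the report above.
stats = {
    "retrievals": 1244209,
    "hits": 4,
    "misses": 1244205,
    "average_read_time_nanos": 1592374,
    "current_number_of_entries": 2508843,
    "current_number_of_entries_in_memory": 1000,
}


def summarize(s):
    """Derive a few ratios from the raw counters (illustrative helper)."""
    return {
        # Sanity check: every retrieval is counted as either a hit or a miss.
        "hits_plus_misses_match_retrievals": s["hits"] + s["misses"] == s["retrievals"],
        # Fraction of retrievals that were counted as hits.
        "hit_ratio": s["hits"] / s["retrievals"],
        # Average read time converted from nanoseconds to milliseconds.
        "avg_read_ms": s["average_read_time_nanos"] / 1_000_000,
        # Share of all entries that are currently held in memory.
        "in_memory_fraction": s["current_number_of_entries_in_memory"]
        / s["current_number_of_entries"],
    }


if __name__ == "__main__":
    for name, value in summarize(stats).items():
        print(f"{name}: {value}")
```

      With these figures, only about 0.04% of the entries are held in memory, which suggests that nearly all entries live only in the RocksDB store.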

      Our Infinispan deployment consists of 12 nodes. At one of our customers, the environment started failing with EXCEPTION_ACCESS_VIOLATION errors that originate from RocksDB JNI.

      We tried to reproduce the issue in our development environment: we deployed a 12-node Infinispan 12.0.1 cluster and started load testing it. Over approximately 19 hours, 4 of the 12 nodes crashed with EXCEPTION_ACCESS_VIOLATION; the first failure occurred after 5-6 hours. The log of one of the crashes is in the attachment.
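      When several nodes crash, it can help to check whether every crash points at the same native frame. The following is a minimal sketch, assuming the JVMs write HotSpot fatal error logs under their default name (hs_err_pid<pid>.log) and that the logs from all nodes have been collected under one directory. The function names and the directory layout are hypothetical, and the default file name does not apply if -XX:ErrorFile is set.

```python
from pathlib import Path

# Default HotSpot fatal error log name is hs_err_pid<pid>.log.
CRASH_LOG_GLOB = "hs_err_pid*.log"
SIGNATURE = "EXCEPTION_ACCESS_VIOLATION"


def problematic_frame(text):
    """Return the line following '# Problematic frame:', or None if absent."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("# Problematic frame:") and i + 1 < len(lines):
            # Drop the leading '# ' comment marker of the log header.
            return lines[i + 1].lstrip("# ").strip()
    return None


def scan(root):
    """Map each access-violation crash log under root to its problematic frame."""
    results = {}
    for path in sorted(Path(root).rglob(CRASH_LOG_GLOB)):
        text = path.read_text(errors="replace")
        if SIGNATURE in text:
            results[str(path)] = problematic_frame(text)
    return results
```

      Calling scan on the directory that holds the collected logs returns one entry per crash log that mentions EXCEPTION_ACCESS_VIOLATION, so the frames from different nodes can be compared side by side.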

      We previously used Infinispan 9.4 and saw the same crashes with that version, so we decided to try version 12, but unfortunately nothing changed.

      Is something configured incorrectly, or is this indeed an issue in RocksDB JNI?

      Thanks

      Attachments

          People

            ttarrant@redhat.com Tristan Tarrant
            davidkaya Dávid Kaya (Inactive)