Red Hat Data Grid / JDG-7698

SIFS can hang on various methods when a non-owned segment is removed again


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: RHDG 8.5.5 GA
    • Component: Persistence

      Describe the bug

      When starting 4 servers, accessing the metrics endpoint causes the cache to hang (when persistence is configured).

      Specific reproduction steps:

      1. Server1

      Startup command:
      bin/server.sh \
        --bind-address=0.0.0.0 \
        --cluster-name=test \
        --cluster-stack=tcp \
        --node-name=$HOSTNAME
       
      Create a Cache instance with the following configuration:
      {
        "Test": {
          "distributed-cache": {
            "owners": "2",
            "mode": "SYNC",
            "statistics": true,
            "encoding": {
              "media-type": "application/x-protostream"
            },
            "memory": {
              "max-count": "100",
              "when-full": "REMOVE"
            },
            "persistence": {
              "passivation": false,
              "file-store": {
                "data": {
                  "path": "data"
                },
                "index": {
                  "path": "index"
                }
              }
            }
          }
        }
      }
        * Accessing http://server1:11222/metrics works normally. (A scripted sketch of steps 1-4 follows the server list below.)

      2. Server2
      3. Server3
      4. Server4
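      For reference, these steps can be scripted against the server's REST API. The sketch below is an illustration only: it assumes the default REST port 11222, that authentication is disabled on the test servers, and that a local file named test-cache.json (hypothetical) holds the "distributed-cache" configuration from step 1 without the outer "Test" wrapper, since the cache name comes from the URL.

      import java.net.URI;
      import java.net.http.HttpClient;
      import java.net.http.HttpRequest;
      import java.net.http.HttpResponse;
      import java.nio.file.Files;
      import java.nio.file.Path;

      public class ReproduceMetricsHang {
         public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();

            // Step 1: create the "Test" cache on server1 from the JSON configuration.
            String cacheConfig = Files.readString(Path.of("test-cache.json"));
            HttpRequest createCache = HttpRequest.newBuilder()
                  .uri(URI.create("http://server1:11222/rest/v2/caches/Test"))
                  .header("Content-Type", "application/json")
                  .POST(HttpRequest.BodyPublishers.ofString(cacheConfig))
                  .build();
            System.out.println("create cache: "
                  + client.send(createCache, HttpResponse.BodyHandlers.ofString()).statusCode());

            // Steps 2-4: start Server2, Server3 and Server4 with the same startup command,
            // then scrape the metrics endpoint; this is the request that was observed to block.
            HttpRequest scrape = HttpRequest.newBuilder()
                  .uri(URI.create("http://server1:11222/metrics"))
                  .GET()
                  .build();
            HttpResponse<String> metrics = client.send(scrape, HttpResponse.BodyHandlers.ofString());
            System.out.println("metrics: " + metrics.statusCode());
         }
      }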

      At this point, data cannot be written to the cache, and the metrics request blocks with the stack trace below.

      Stack Trace:
      "blocking-thread-xxx-p2-t2" #74 [51021] daemon prio=5 os_prio=0 cpu=86.64ms elapsed=227.73s tid=0x000056224dbb5810 nid=51021 waiting on condition [0x00007ff779e3d000]
         java.lang.Thread.State: TIMED_WAITING (parking)
          at jdk.internal.misc.Unsafe.park(java.base@21.0.7/Native Method)
          - parking to wait for <0x00000000e5b98eb8> (a java.util.concurrent.CompletableFuture$Signaller)
          at java.util.concurrent.locks.LockSupport.parkNanos(java.base@21.0.7/LockSupport.java:269)
          at java.util.concurrent.CompletableFuture$Signaller.block(java.base@21.0.7/CompletableFuture.java:1866)
          at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@21.0.7/ForkJoinPool.java:3778)
          at java.util.concurrent.ForkJoinPool.managedBlock(java.base@21.0.7/ForkJoinPool.java:3723)
          at java.util.concurrent.CompletableFuture.timedGet(java.base@21.0.7/CompletableFuture.java:1939)
          at java.util.concurrent.CompletableFuture.get(java.base@21.0.7/CompletableFuture.java:2095)
          at org.infinispan.commons.util.concurrent.CompletableFutures.await(CompletableFutures.java:122)
          at org.infinispan.commons.util.concurrent.CompletionStages.join(CompletionStages.java:89)
          at org.infinispan.interceptors.impl.CacheWriterInterceptor.getNumberOfPersistedEntries(CacheWriterInterceptor.java:456)
          at org.infinispan.interceptors.impl.CorePackageImpl$$Lambda/0x00007ff78929ac20.apply(Unknown Source)
          at org.infinispan.commons.stat.GaugeMetricInfo.lambda$getGauge$0(GaugeMetricInfo.java:28)
          at org.infinispan.commons.stat.GaugeMetricInfo$$Lambda/0x00007ff78952c678.get(Unknown Source)
          at io.micrometer.core.instrument.Gauge.lambda$builder$0(Gauge.java:58)
          at io.micrometer.core.instrument.Gauge$$Lambda/0x00007ff7894795a0.applyAsDouble(Unknown Source)
          at io.micrometer.core.instrument.StrongReferenceGaugeFunction.applyAsDouble(StrongReferenceGaugeFunction.java:48)
          at io.micrometer.core.instrument.internal.DefaultGauge.value(DefaultGauge.java:53)
          at io.micrometer.prometheusmetrics.PrometheusMeterRegistry.lambda$newGauge$12(PrometheusMeterRegistry.java:338)
          at io.micrometer.prometheusmetrics.PrometheusMeterRegistry$$Lambda/0x00007ff789476ef8.samples(Unknown Source)
          at io.micrometer.prometheusmetrics.MicrometerCollector.collect(MicrometerCollector.java:77)
          at io.prometheus.metrics.model.registry.PrometheusRegistry.scrape(PrometheusRegistry.java:84)
          at io.prometheus.metrics.model.registry.PrometheusRegistry.scrape(PrometheusRegistry.java:66)
          at io.micrometer.prometheusmetrics.PrometheusMeterRegistry.scrape(PrometheusMeterRegistry.java:166)
          at io.micrometer.prometheusmetrics.PrometheusMeterRegistry.scrape(PrometheusMeterRegistry.java:139)
          at org.infinispan.metrics.impl.MetricsRegistryImpl$PrometheusRegistry.scrape(MetricsRegistryImpl.java:351)
          at org.infinispan.metrics.impl.MetricsRegistryImpl.scrape(MetricsRegistryImpl.java:177)
          at org.infinispan.rest.resources.MetricsResource.lambda$metrics$0(MetricsResource.java:60)
          at org.infinispan.rest.resources.MetricsResource$$Lambda/0x00007ff7897c0ae8.get(Unknown Source)
          at java.util.concurrent.CompletableFuture$AsyncSupply.run(java.base@21.0.7/CompletableFuture.java:1768)
          at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
          at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2516)
          at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2495)
          at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1521)
          at java.lang.Thread.runWith(java.base@21.0.7/Thread.java:1596)
          at java.lang.Thread.run(java.base@21.0.7/Thread.java:1583)
         
        After modifying the code as follows, everything works properly.

        // org.infinispan.interceptors.impl.CacheWriterInterceptor#getNumberOfPersistedEntries
        public int getNumberOfPersistedEntries() {
        //   long size = CompletionStages.join(persistenceManager.size());
        //   return (int) Math.min(size, Integer.MAX_VALUE);
           return -1;
        }
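        The workaround above simply disables the gauge by returning -1. A less invasive variant is sketched below; it is an illustration only (not the upstream fix) and assumes the same persistenceManager field used in CacheWriterInterceptor. It reports the count when the size computation has already completed and falls back to -1 while it is still pending, so the gauge never parks the metrics thread.

        // Hypothetical non-blocking variant: never park the metrics scrape thread.
        public int getNumberOfPersistedEntries() {
           Long size;
           try {
              // persistenceManager.size() returns a CompletionStage<Long>; getNow(null)
              // yields the value only if the stage has already completed.
              size = persistenceManager.size().toCompletableFuture().getNow(null);
           } catch (java.util.concurrent.CompletionException e) {
              size = null; // treat a failed size computation the same as "not yet available"
           }
           return size == null ? -1 : (int) Math.min(size, Integer.MAX_VALUE);
        }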

         
        Infinispan version: infinispan-server-15.2.4.Final

       

              Assignee: Unassigned
              Reporter: Alan Field (rhn-support-afield)
              Anna Manukyan