AMQ Broker / ENTMQBR-10194

After a random time, port 8161 became unresponsive


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • None
    • AMQ 7.13.2.GA
    • console, monitoring
    • Critical

      Broker 7.13.2 running on OpenShift with replica=1, persistence enabled, and the console and metrics served over SSL.
      At a random time after startup, port 8161, which is used to collect Prometheus metrics, stops responding. The port is still open and the readiness check still passes, but any request to it, for example with `curl`, hangs indefinitely. Nothing is logged and the broker otherwise behaves normally; only port 8161 is unresponsive. A restart fixes the issue for a while, then it occurs again. It happens on different broker instances with nearly the same setup.
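
Because the port stays open and readiness passes, a check without a deadline hangs just like `curl` does. The following is a minimal sketch of a probe that bounds the whole exchange and separates "connected but never answered" from "could not connect". Everything in it is illustrative and not taken from this report: the class name `MetricsProbe`, the `Result` values, the default URL `https://localhost:8161/metrics`, and the timeouts. It also assumes the broker certificate is trusted by the JVM default trust store; otherwise the TLS handshake fails and the probe reports `OTHER_FAILURE`.

```java
import java.net.ConnectException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpConnectTimeoutException;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: probes the metrics endpoint with a hard deadline.
final class MetricsProbe {

    enum Result { OK, HTTP_ERROR, HUNG, UNREACHABLE, OTHER_FAILURE }

    // Maps a failure to a coarse outcome. The connect-timeout check must come
    // first because HttpConnectTimeoutException extends HttpTimeoutException.
    static Result classify(Throwable t) {
        if (t instanceof HttpConnectTimeoutException) return Result.UNREACHABLE;
        if (t instanceof HttpTimeoutException || t instanceof TimeoutException) return Result.HUNG;
        if (t instanceof ConnectException) return Result.UNREACHABLE;
        return Result.OTHER_FAILURE;
    }

    // totalTimeout covers connect, headers, and body; it should be larger
    // than connectTimeout so the two cases stay distinguishable.
    static Result probe(URI uri, Duration connectTimeout, Duration totalTimeout)
            throws InterruptedException {
        HttpClient client = HttpClient.newBuilder().connectTimeout(connectTimeout).build();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        CompletableFuture<HttpResponse<Void>> future =
                client.sendAsync(request, HttpResponse.BodyHandlers.discarding());
        try {
            HttpResponse<Void> response =
                    future.get(totalTimeout.toMillis(), TimeUnit.MILLISECONDS);
            return response.statusCode() == 200 ? Result.OK : Result.HTTP_ERROR;
        } catch (TimeoutException e) {
            future.cancel(true); // give up on the stalled exchange
            return classify(e);
        } catch (ExecutionException e) {
            return classify(e.getCause());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed default URL; pass the real endpoint as the first argument.
        URI uri = URI.create(args.length > 0 ? args[0] : "https://localhost:8161/metrics");
        Result result = probe(uri, Duration.ofSeconds(5), Duration.ofSeconds(60));
        System.out.println(uri + " -> " + result);
    }
}
```

With payloads as large as the ones described here, the total timeout has to be generous enough that a slow but healthy scrape is not reported as `HUNG`.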

      The heap dump shows two Jetty threads serving `ArtemisPrometheusMetricsPluginServlet` that are blocked while writing the response.

      The problem appears to stem from a Jetty upgrade between versions 7.13.0 (Jetty 12.0.15) and 7.13.2 (Jetty 12.1.1), which was done to address some CVEs. This new version includes significant changes, particularly regarding HTTP/2.

      Going back to the heap dump and the subsequent thread dump [1], both show two threads handling `ArtemisPrometheusMetricsPluginServlet` that appear to be blocked during a write operation, in `SharedBlockingCallback$Blocker.block()`. These threads were apparently writing very large Prometheus payloads (over 80 MB).

      Based on these symptoms, I found a known Jetty regression, https://github.com/jetty/jetty.project/issues/13567, that is highly likely to be the cause. That issue shows the same characteristics in the same Jetty version and has been fixed in Jetty 12.1.2.

      Thread 0x733bb7500
        at jdk.internal.misc.Unsafe.park(ZJ)V (Unsafe.java(Native Method))
        at java.util.concurrent.locks.LockSupport.park()V (LockSupport.java:341)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block()Z (AbstractQueuedSynchronizer.java:506)
        at java.util.concurrent.ForkJoinPool.unmanagedBlock(Ljava/util/concurrent/ForkJoinPool$ManagedBlocker;)V (ForkJoinPool.java:3465)
        at java.util.concurrent.ForkJoinPool.managedBlock(Ljava/util/concurrent/ForkJoinPool$ManagedBlocker;)V (ForkJoinPool.java:3436)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await()V (AbstractQueuedSynchronizer.java:1630)
        at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block()V (SharedBlockingCallback.java:214)
        at org.eclipse.jetty.ee9.nested.HttpOutput.channelWrite(Ljava/nio/ByteBuffer;Z)V (HttpOutput.java:277)
        at org.eclipse.jetty.ee9.nested.HttpOutput.write([BII)V (HttpOutput.java:892)
        at java.io.ByteArrayOutputStream.writeTo(Ljava/io/OutputStream;)V (ByteArrayOutputStream.java:161)
        at org.eclipse.jetty.io.WriteThroughWriter$Iso88591Writer.append(Ljava/lang/CharSequence;)Lorg/eclipse/jetty/io/WriteThroughWriter; (WriteThroughWriter.java:187)
        at org.eclipse.jetty.io.WriteThroughWriter.write(Ljava/lang/String;II)V (WriteThroughWriter.java:130)
        at org.eclipse.jetty.ee9.nested.ResponseWriter.write(Ljava/lang/String;II)V (ResponseWriter.java:236)
        at org.eclipse.jetty.ee9.nested.ResponseWriter.write(Ljava/lang/String;)V (ResponseWriter.java:254)
        at com.redhat.amq.broker.core.server.metrics.plugins.ArtemisPrometheusMetricsPluginServlet.doGet(Ljakarta/servlet/http/HttpServletRequest;Ljakarta/servlet/http/HttpServletResponse;)V (ArtemisPrometheusMetricsPluginServlet.java:74)
       ...
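
To check other broker instances for the same signature, a thread dump can be scanned for threads whose stack contains both the blocking frame and the servlet frame shown in the trace above. This is a minimal sketch, not an existing tool. It assumes the dump uses the same layout as the excerpt in this report (a header line starting with `Thread ` followed by `at ...` frames); other dump formats, such as `jstack` output, would need a different header test. The class name `BlockedWriteScanner` and its method are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds threads stuck in a blocking Jetty write
// while serving the Prometheus metrics servlet.
final class BlockedWriteScanner {

    // Frame names copied from the trace in this report.
    static final String BLOCK_FRAME =
            "org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block";
    static final String SERVLET_FRAME =
            "ArtemisPrometheusMetricsPluginServlet.doGet";

    // Returns the header line of every thread whose stack has both frames.
    static List<String> findBlockedMetricsThreads(String dump) {
        List<String> hits = new ArrayList<>();
        String header = null;
        boolean sawBlock = false;
        boolean sawServlet = false;
        for (String raw : dump.split("\\R")) {
            String line = raw.strip();
            if (line.startsWith("Thread ")) {
                // A new thread starts: flush the previous one first.
                if (header != null && sawBlock && sawServlet) hits.add(header);
                header = line;
                sawBlock = false;
                sawServlet = false;
            } else if (line.startsWith("at ")) {
                if (line.contains(BLOCK_FRAME)) sawBlock = true;
                if (line.contains(SERVLET_FRAME)) sawServlet = true;
            }
            // Any other line (blank, "...") is ignored.
        }
        if (header != null && sawBlock && sawServlet) hits.add(header);
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: BlockedWriteScanner <thread-dump-file>");
            return;
        }
        List<String> hits = findBlockedMetricsThreads(Files.readString(Path.of(args[0])));
        System.out.println(hits.size() + " blocked metrics thread(s)");
        hits.forEach(System.out::println);
    }
}
```

A match only shows that a thread was parked in the write when the dump was taken. Comparing two dumps taken some time apart helps distinguish a permanently stuck thread from a slow scrape.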
      

              dbruscin Domenico Francesco Bruscino
              rhn-support-agagliar Antonio Gagliardi