- Bug
- Resolution: Unresolved
- AMQ 7.13.2.GA
Broker 7.13.2 running in OpenShift: replica=1, persistence, console and metrics with SSL.
After a random amount of time following startup, port 8161, which is used to collect Prometheus metrics, stops responding. The port is open and passes the readiness check, but any request to it, with `curl` for example, hangs indefinitely. There are no logs and no effect on broker behavior; only port 8161 is unresponsive. A restart fixes the issue for a while, then it occurs again. It happens on different broker instances with nearly identical setups.
The heap dump shows two Jetty threads for `_ArtemisPrometheusMetricsPluginServlet` blocked on a write.
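Because the port still accepts connections, a plain TCP check cannot tell a healthy endpoint from a hung one. The following is a minimal sketch of a probe that enforces a response timeout, so a stalled request is reported instead of hanging the caller. The function name `probe_metrics`, the status labels, and the `/metrics` path in the usage note are assumptions for illustration, not part of the broker or its operator.

```python
import socket
import urllib.error
import urllib.request


def probe_metrics(url, timeout=5.0, context=None):
    """Classify an HTTP endpoint as 'responding', 'hung', or 'unreachable'.

    'hung' means no connection or response completed within `timeout`
    seconds, which matches the symptom of a request that never returns.
    """
    # Ignore proxy settings from the environment so the probe hits the target directly.
    handlers = [urllib.request.ProxyHandler({})]
    if context is not None:
        # Optional ssl.SSLContext, for example one that trusts the broker's CA.
        handlers.append(urllib.request.HTTPSHandler(context=context))
    opener = urllib.request.build_opener(*handlers)
    try:
        with opener.open(url, timeout=timeout) as response:
            response.read()
            return "responding"
    except urllib.error.HTTPError:
        # The server answered, even if with an error status, so it is not hung.
        return "responding"
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "hung"
        return "unreachable"
    except (socket.timeout, TimeoutError):
        # Raised directly when waiting for or reading the response stalls.
        return "hung"
    except OSError:
        return "unreachable"
```

A call such as `probe_metrics("https://<broker-host>:8161/metrics", timeout=10)` (host and path are placeholders) would be expected to return `"hung"` on an affected instance. The same idea applies on the command line with `curl --max-time 10`.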
The problem appears to stem from a Jetty upgrade between versions 7.13.0 (Jetty 12.0.15) and 7.13.2 (Jetty 12.1.1), which was done to address some CVEs. This new version includes significant changes, particularly regarding HTTP/2.
Going back to the heap dump and the subsequent thread dump [1], I found two threads handling `ArtemisPrometheusMetricsPluginServlet` that appeared to be blocked during a write operation in `SharedBlockingCallback$Blocker.block()`. These threads were apparently writing very large payloads for Prometheus (over 80 MB).
Based on these symptoms, I found a known Jetty regression, https://github.com/jetty/jetty.project/issues/13567, that is highly likely to be the cause. It shows the same characteristics in the same version and has been fixed in Jetty 12.1.2.
Thread 0x733bb7500
  at jdk.internal.misc.Unsafe.park(ZJ)V (Unsafe.java(Native Method))
  at java.util.concurrent.locks.LockSupport.park()V (LockSupport.java:341)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block()Z (AbstractQueuedSynchronizer.java:506)
  at java.util.concurrent.ForkJoinPool.unmanagedBlock(Ljava/util/concurrent/ForkJoinPool$ManagedBlocker;)V (ForkJoinPool.java:3465)
  at java.util.concurrent.ForkJoinPool.managedBlock(Ljava/util/concurrent/ForkJoinPool$ManagedBlocker;)V (ForkJoinPool.java:3436)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await()V (AbstractQueuedSynchronizer.java:1630)
  at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block()V (SharedBlockingCallback.java:214)
  at org.eclipse.jetty.ee9.nested.HttpOutput.channelWrite(Ljava/nio/ByteBuffer;Z)V (HttpOutput.java:277)
  at org.eclipse.jetty.ee9.nested.HttpOutput.write([BII)V (HttpOutput.java:892)
  at java.io.ByteArrayOutputStream.writeTo(Ljava/io/OutputStream;)V (ByteArrayOutputStream.java:161)
  at org.eclipse.jetty.io.WriteThroughWriter$Iso88591Writer.append(Ljava/lang/CharSequence;)Lorg/eclipse/jetty/io/WriteThroughWriter; (WriteThroughWriter.java:187)
  at org.eclipse.jetty.io.WriteThroughWriter.write(Ljava/lang/String;II)V (WriteThroughWriter.java:130)
  at org.eclipse.jetty.ee9.nested.ResponseWriter.write(Ljava/lang/String;II)V (ResponseWriter.java:236)
  at org.eclipse.jetty.ee9.nested.ResponseWriter.write(Ljava/lang/String;)V (ResponseWriter.java:254)
  at com.redhat.amq.broker.core.server.metrics.plugins.ArtemisPrometheusMetricsPluginServlet.doGet(Ljakarta/servlet/http/HttpServletRequest;Ljakarta/servlet/http/HttpServletResponse;)V (ArtemisPrometheusMetricsPluginServlet.java:74)
  ...
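To check other affected instances for the same signature, a thread dump can be scanned for threads that contain both the Jetty blocking-write frame and the metrics servlet frame. This is a rough sketch that assumes the dump uses the `Thread 0x...` header format shown in the trace above; the function name `find_blocked_metrics_threads` is hypothetical, and other dump formats (for example `jstack` output) would need a different header pattern.

```python
import re

# Frame names taken from the trace above.
BLOCKED_FRAME = "SharedBlockingCallback$Blocker.block"
SERVLET_FRAME = "ArtemisPrometheusMetricsPluginServlet.doGet"


def find_blocked_metrics_threads(dump_text):
    """Return ids of threads parked in Jetty's blocking write under the metrics servlet."""
    # Split just before each "Thread 0x..." header so each chunk holds one thread.
    chunks = re.split(r"(?=Thread 0x[0-9a-fA-F]+)", dump_text)
    blocked = []
    for chunk in chunks:
        header = re.match(r"Thread (0x[0-9a-fA-F]+)", chunk)
        if header and BLOCKED_FRAME in chunk and SERVLET_FRAME in chunk:
            blocked.append(header.group(1))
    return blocked
```

Run against the dump described in this issue, the function would be expected to report the two blocked servlet threads. A match only shows that a thread was in a blocking write when the dump was taken, so comparing two dumps taken some time apart gives stronger evidence of a permanent stall.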
- relates to ENTMQBR-10210 Prometheus metrics exporter endpoint randomly stop responding
- Backlog