  1. WildFly
  2. WFLY-19689

Micrometer extension keeps pushing metrics after removal and reload (was "Failed to publish metrics to OTLP receiver" when running the testsuite)

    • Bug
    • Resolution: Done
    • Blocker
    • 35.0.0.Final
    • 35.0.0.Beta1
    • Micrometer
    • None
    • Steps to reproduce:
      • Start a WildFly 35 Beta server from ZIP distro (standalone-microprofile.xml)
      • Connect via CLI and remove opentelemetry and microprofile-telemetry extensions and subsystems.
      • Start an OTel collector
      • Add the Micrometer extension and subsystem, and configure it to export to the OTel collector.

      WildFly starts pushing metrics, then:

      • Remove Micrometer extension/subsystem, and reload.
      • Stop the collector.

      WildFly keeps pushing metrics, until its process is stopped.

      So Micrometer keeps exporting metrics after it has been removed completely, even after a reload.


      Update 2024-12-06 (fburzigo):

      This is not just a testsuite issue, as originally documented below.
      The org.wildfly.extension.micrometer extension seems to survive removal and a subsequent server reload, so WildFly keeps pushing metrics to the collector.
      When the collector is not available, as can happen temporarily when running the testsuite, the log messages reported below appear in the server logs.

      This affects current WildFly 35 Beta snapshots, and the priority is being increased to blocker, since it is not just a test issue.

      When running the testsuite/integration/microprofile tests, after having run the first few tests the log is filled with loads of the following messages:

      2024-08-30 15:59:16,188 WARNING [io.micrometer.registry.otlp.OtlpMeterRegistry] (otlp-metrics-publisher) Failed to publish metrics to OTLP receiver: java.net.ConnectException: Connection refused
      	at java.base/sun.nio.ch.Net.pollConnect(Native Method)
      	at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
      	at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:549)
      	at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
      	at java.base/java.net.Socket.connect(Socket.java:633)
      	at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:178)
      	at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:532)
      	at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:637)
      	at java.base/sun.net.www.http.HttpClient.<init>(HttpClient.java:280)
      	at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:385)
      	at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:407)
      	at java.base/sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1309)
      	at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1242)
      	at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1128)
      	at java.base/sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1057)
      	at java.base/sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1430)
      	at java.base/sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1401)
      	at io.micrometer@1.12.4//io.micrometer.core.ipc.http.HttpUrlConnectionSender.send(HttpUrlConnectionSender.java:98)
      	at io.micrometer@1.12.4//io.micrometer.core.ipc.http.HttpSender$Request$Builder.send(HttpSender.java:306)
      	at io.micrometer@1.12.4//io.micrometer.registry.otlp.OtlpMeterRegistry.publish(OtlpMeterRegistry.java:168)
      	at io.micrometer@1.12.4//io.micrometer.core.instrument.push.PushMeterRegistry.publishSafelyOrSkipIfInProgress(PushMeterRegistry.java:64)
      	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
      	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
      	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
      	at java.base/java.lang.Thread.run(Thread.java:833)
      

      It starts happening somewhere halfway through these tests, in case you have a different test order:

      [INFO] Running org.wildfly.test.integration.observability.opentelemetry.BasicOpenTelemetryTestCase
      [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.075 s - in org.wildfly.test.integration.observability.opentelemetry.BasicOpenTelemetryTestCase
      [INFO] Running org.wildfly.test.integration.observability.opentelemetry.ContextPropagationTestCase
      [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.736 s - in org.wildfly.test.integration.observability.opentelemetry.ContextPropagationTestCase
      [INFO] Running org.wildfly.test.integration.observability.opentelemetry.OpenTelemetryIntegrationTestCase
      [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.508 s - in org.wildfly.test.integration.observability.opentelemetry.OpenTelemetryIntegrationTestCase
      [INFO] Running org.wildfly.test.integration.observability.micrometer.multiple.MultipleWarTestCase
      [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.154 s - in org.wildfly.test.integration.observability.micrometer.multiple.MultipleWarTestCase
      [INFO] Running org.wildfly.test.integration.observability.micrometer.multiple.EarDeploymentTestCase
      [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.061 s - in org.wildfly.test.integration.observability.micrometer.multiple.EarDeploymentTestCase
      [INFO] Running org.wildfly.test.integration.observability.micrometer.MicrometerOtelIntegrationTestCase
      [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.323 s - in org.wildfly.test.integration.observability.micrometer.MicrometerOtelIntegrationTestCase
      [INFO] Running org.wildfly.test.integration.observability.micrometer.BasicMicrometerTestCase
      [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.077 s - in org.wildfly.test.integration.observability.micrometer.BasicMicrometerTestCase
      [INFO] Running org.wildfly.test.integration.microprofile.jwt.smoke.JWTSmokeTestCase
      

      If I run just some of the later tests I don't see these messages, so I guess OTel is not cleaning something up.

       mvn clean install -pl testsuite/integration/microprofile -DallTests -Dtest=JWTSmokeTestCase,.ClockSkewTest,ReactiveMessagingKafkaUserApiTestCase,ReactiveMessagingKafkaTestCase,AnonymousAmqpTestCase
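
      The stack trace above shows the publish call running on a ScheduledThreadPoolExecutor worker (thread otlp-metrics-publisher). A scheduled task of that kind keeps firing until its executor is shut down, whether or not the component that created it is still in use. The following is a minimal JDK-only sketch of that general behaviour. It is not WildFly or Micrometer code, and it is not a confirmed root cause for this issue: the class PushPublisherSketch, its counter, and the thread name are hypothetical stand-ins chosen for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a push-style registry; NOT Micrometer or WildFly code. */
public class PushPublisherSketch implements AutoCloseable {

    // Counts "publish" attempts instead of sending anything over the network.
    private final AtomicInteger publishCount = new AtomicInteger();

    // Daemon thread only so that this demo JVM can exit; in a long-running
    // server process the thread lives until it is stopped or the process ends.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "sketch-metrics-publisher");
                thread.setDaemon(true);
                return thread;
            });

    public PushPublisherSketch(long periodMillis) {
        scheduler.scheduleAtFixedRate(
                publishCount::incrementAndGet, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    public int publishCount() {
        return publishCount.get();
    }

    /** Stops the publisher thread and waits for it to terminate. */
    @Override
    public void close() {
        scheduler.shutdownNow();
        try {
            scheduler.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Simulates "removing" the publisher and returns how many publishes
     * still happen in a 200 ms window afterwards.
     */
    public static int publishesAfterRemoval(boolean closeOnRemoval) {
        PushPublisherSketch publisher = new PushPublisherSketch(10);
        pause(100); // let it publish for a while
        if (closeOnRemoval) {
            publisher.close();
        }
        // From here on the owner no longer uses the publisher; the reference
        // is kept only to read the counter.
        int before = publisher.publishCount();
        pause(200);
        return publisher.publishCount() - before;
    }

    public static void main(String[] args) {
        System.out.println("publishes after removal, no close(): " + publishesAfterRemoval(false));
        System.out.println("publishes after removal, close():    " + publishesAfterRemoval(true));
    }
}
```

      In this sketch the counter keeps growing when close() is skipped and stays flat once close() has returned. Micrometer's PushMeterRegistry offers stop() and close() for the same purpose; whether a missing call of that kind is what happens in the micrometer extension on removal is an assumption here, not something this report establishes.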
      

              jaslee@redhat.com Jason Lee
              kkhan1@redhat.com Kabir Khan