UNDERTOW-708

Undertow mod_cluster proxy: Intermittent HTTP 500 on jvmKill based failover among worker nodes

Details

    • Type: Bug
    • Resolution: Won't Do
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: 1.3.21.Final, 1.3.22.Final
    • Component/s: Proxy
    • Labels: None
      1 balancer, 3 workers, 1 client, 1 web app, 1 worker is killed


    Description

      HTTP 500

      Three workers are up (see Balancer configuration below); one of them is killed with kill -9.
      Server jboss-eap-7.0-3, which owns session N42b8ctiS7ViVxm2eaQm1lPXWZ9l8pLuT-7_6xCx.jboss-eap-7.0-3, is killed:

      10:44:38.483 --- Server jboss-eap-7.0-3 killed, RIP ---
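
The session ID above ends in the name of the killed worker, which is why the sticky-session balancer keeps routing this client to jboss-eap-7.0-3. A minimal sketch of that split, assuming the route is everything after the first '.' (the route itself contains dots, so splitting on the last dot would be wrong). The helper name `split_session_cookie` is hypothetical, and how Undertow parses the cookie internally is not shown in this report:

```python
def split_session_cookie(value):
    """Split a JSESSIONID value into (session_id, route).

    Assumption: the route is the part after the FIRST '.', because a route
    such as "jboss-eap-7.0-3" contains dots of its own.
    Returns route=None when the value carries no route suffix.
    """
    session_id, separator, route = value.partition(".")
    return session_id, (route if separator else None)


# Cookie value taken from the report.
cookie = "N42b8ctiS7ViVxm2eaQm1lPXWZ9l8pLuT-7_6xCx.jboss-eap-7.0-3"
session_id, route = split_session_cookie(cookie)
print(route)
```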
      

      A request is made to the balancer:

      10:44:38.488 Verifying URL: http://192.168.122.172:8484/clusterbench/requestinfo/ for response code 200 and content to: contain ""
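
The check logged above (expected status 200, body containing a given string) could be approximated as follows. This is only a sketch and not the test suite's actual code: `verify_url` and its parameters are hypothetical names, and the `opener` argument exists only so the function can be exercised without a live balancer.

```python
import urllib.error
import urllib.request


def verify_url(url, expected_status=200, expected_text="", cookie=None,
               opener=urllib.request.urlopen):
    """GET url and return (ok, status).

    ok is True when the status matches and expected_text occurs in the body.
    cookie, if given, is sent verbatim as the Cookie header, e.g. the sticky
    "JSESSIONID=<id>.<route>" value.
    """
    request = urllib.request.Request(url)
    if cookie:
        request.add_header("Cookie", cookie)
    try:
        with opener(request, timeout=5) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; status and body are still readable.
        status = error.code
        body = error.read().decode("utf-8", errors="replace")
    return (status == expected_status and expected_text in body), status
```

Against the setup in this report the call would look like `verify_url("http://192.168.122.172:8484/clusterbench/requestinfo/", cookie="JSESSIONID=...")`, run immediately after the worker is killed.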
      

      Balancer's reaction:

      10:44:38,493 DEBUG [io.undertow.server.handlers.proxy] (default I/O-1) Sending request ClientRequest{path='/clusterbench/requestinfo/', method=GET, protocol=HTTP/1.1} to target 192.168.122.172 for exchange HttpServerExchange{ GET /clusterbench/requestinfo/ request {Accept=[image/gif, image/jpeg, image/pjpeg, image/pjpeg, */*], Connection=[Keep-Alive], Accept-Language=[en-us], Accept-Encoding=[gzip, deflate], Cookie=[JSESSIONID=N42b8ctiS7ViVxm2eaQm1lPXWZ9l8pLuT-7_6xCx.jboss-eap-7.0-3], User-Agent=[Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)], Host=[192.168.122.172:8484]} response {X-Powered-By=[Undertow/1], Server=[JBoss-EAP/7]}}
      10:44:38,494 DEBUG [io.undertow.server.handlers.proxy] (default I/O-1) Sent request ClientRequest{path='/clusterbench/requestinfo/', method=GET, protocol=HTTP/1.1} to target 192.168.122.172 for exchange HttpServerExchange{ GET /clusterbench/requestinfo/ request {Accept=[image/gif, image/jpeg, image/pjpeg, image/pjpeg, */*], Connection=[Keep-Alive], Accept-Language=[en-us], Accept-Encoding=[gzip, deflate], Cookie=[JSESSIONID=N42b8ctiS7ViVxm2eaQm1lPXWZ9l8pLuT-7_6xCx.jboss-eap-7.0-3], User-Agent=[Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)], Host=[192.168.122.172:8484]} response {X-Powered-By=[Undertow/1], Server=[JBoss-EAP/7]}}
      10:44:38,494 DEBUG [io.undertow.request] (default I/O-1) suspending writes on io.undertow.protocols.ajp.AjpClientRequestClientStreamSinkChannel@338888f9 to prevent listener runaway
      10:44:38,514 DEBUG [io.undertow.request.io] (default I/O-1) UT005013: An IOException occurred: java.io.IOException: java.io.IOException: Connection reset by peer
          at io.undertow.protocols.ajp.AjpClientChannel.handleBrokenSourceChannel(AjpClientChannel.java:156)
          at io.undertow.server.protocol.framed.AbstractFramedChannel.markReadsBroken(AbstractFramedChannel.java:806)
          at io.undertow.server.protocol.framed.AbstractFramedChannel.receive(AbstractFramedChannel.java:471)
          at io.undertow.client.ajp.AjpClientConnection$ClientReceiveListener.handleEvent(AjpClientConnection.java:316)
          at io.undertow.client.ajp.AjpClientConnection$ClientReceiveListener.handleEvent(AjpClientConnection.java:312)
          at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
          at io.undertow.server.protocol.framed.AbstractFramedChannel$FrameReadListener.handleEvent(AbstractFramedChannel.java:909)
          at io.undertow.server.protocol.framed.AbstractFramedChannel$FrameReadListener.handleEvent(AbstractFramedChannel.java:890)
          at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
          at org.xnio.conduits.ReadReadyHandler$ChannelListenerHandler.readReady(ReadReadyHandler.java:66)
          at org.xnio.nio.NioSocketConduit.handleReady(NioSocketConduit.java:88)
          at org.xnio.nio.WorkerThread.run(WorkerThread.java:559)
      Caused by: java.io.IOException: Connection reset by peer
          at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
          at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
          at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
          at sun.nio.ch.IOUtil.read(IOUtil.java:192)
          at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
          at org.xnio.nio.NioSocketConduit.read(NioSocketConduit.java:286)
          at io.undertow.conduits.IdleTimeoutConduit.read(IdleTimeoutConduit.java:202)
          at org.xnio.conduits.ConduitStreamSourceChannel.read(ConduitStreamSourceChannel.java:127)
          at io.undertow.server.protocol.framed.AbstractFramedChannel.receive(AbstractFramedChannel.java:365)
          ... 9 more
      
      10:44:38,515 ERROR [io.undertow.client] (default I/O-1) UT005001: An exception occurred processing the request: java.io.IOException: Connection reset by peer
          at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
          at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
          at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
          at sun.nio.ch.IOUtil.read(IOUtil.java:192)
          at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
          at org.xnio.nio.NioSocketConduit.read(NioSocketConduit.java:286)
          at io.undertow.conduits.IdleTimeoutConduit.read(IdleTimeoutConduit.java:202)
          at org.xnio.conduits.ConduitStreamSourceChannel.read(ConduitStreamSourceChannel.java:127)
          at io.undertow.server.protocol.framed.AbstractFramedChannel.receive(AbstractFramedChannel.java:365)
          at io.undertow.client.ajp.AjpClientConnection$ClientReceiveListener.handleEvent(AjpClientConnection.java:316)
          at io.undertow.client.ajp.AjpClientConnection$ClientReceiveListener.handleEvent(AjpClientConnection.java:312)
          at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
          at io.undertow.server.protocol.framed.AbstractFramedChannel$FrameReadListener.handleEvent(AbstractFramedChannel.java:909)
          at io.undertow.server.protocol.framed.AbstractFramedChannel$FrameReadListener.handleEvent(AbstractFramedChannel.java:890)
          at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
          at org.xnio.conduits.ReadReadyHandler$ChannelListenerHandler.readReady(ReadReadyHandler.java:66)
          at org.xnio.nio.NioSocketConduit.handleReady(NioSocketConduit.java:88)
          at org.xnio.nio.WorkerThread.run(WorkerThread.java:559)
      
      10:44:38,515 ERROR [io.undertow.proxy] (default I/O-1) UT005028: Proxy request to /clusterbench/requestinfo/ failed: java.io.IOException: Connection reset by peer
          at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
          at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
          at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
          at sun.nio.ch.IOUtil.read(IOUtil.java:192)
          at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
          at org.xnio.nio.NioSocketConduit.read(NioSocketConduit.java:286)
          at io.undertow.conduits.IdleTimeoutConduit.read(IdleTimeoutConduit.java:202)
          at org.xnio.conduits.ConduitStreamSourceChannel.read(ConduitStreamSourceChannel.java:127)
          at io.undertow.server.protocol.framed.AbstractFramedChannel.receive(AbstractFramedChannel.java:365)
          at io.undertow.client.ajp.AjpClientConnection$ClientReceiveListener.handleEvent(AjpClientConnection.java:316)
          at io.undertow.client.ajp.AjpClientConnection$ClientReceiveListener.handleEvent(AjpClientConnection.java:312)
          at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
          at io.undertow.server.protocol.framed.AbstractFramedChannel$FrameReadListener.handleEvent(AbstractFramedChannel.java:909)
          at io.undertow.server.protocol.framed.AbstractFramedChannel$FrameReadListener.handleEvent(AbstractFramedChannel.java:890)
          at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
          at org.xnio.conduits.ReadReadyHandler$ChannelListenerHandler.readReady(ReadReadyHandler.java:66)
          at org.xnio.nio.NioSocketConduit.handleReady(NioSocketConduit.java:88)
          at org.xnio.nio.WorkerThread.run(WorkerThread.java:559)
      

      Client receives HTTP 500:

      INFO: <html><head><title>Error</title></head><body>500 - Internal Server Error</body></html>
      10:44:38.528 -----------------------------------------------------------------
      10:44:38.528 Checking for http://192.168.122.172:8484/clusterbench/requestinfo/ response code: was 500, expected 200
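
To make the timing explicit, the timestamps quoted above can be turned into offsets from the kill. The sketch assumes the test harness and balancer clocks are comparable (the report does not state this); the event labels are paraphrases, not log text. For comparison, the balancer configuration lists a health-check-interval of 10000 ms.

```python
from datetime import datetime, timedelta


def parse_ts(text):
    """Parse HH:MM:SS.mmm or HH:MM:SS,mmm (both forms appear in the logs)."""
    return datetime.strptime(text.replace(",", "."), "%H:%M:%S.%f")


# Timestamps copied from the log excerpts in this report.
EVENTS = [
    ("worker killed", "10:44:38.483"),
    ("client starts request", "10:44:38.488"),
    ("balancer forwards request", "10:44:38,493"),
    ("balancer logs connection reset", "10:44:38,514"),
    ("client reports HTTP 500", "10:44:38.528"),
]


def offsets_ms(events):
    """Return (label, milliseconds since the first event) for each event."""
    start = parse_ts(events[0][1])
    one_ms = timedelta(milliseconds=1)
    return [(label, (parse_ts(ts) - start) // one_ms) for label, ts in events]


for label, offset in offsets_ms(EVENTS):
    print(f"+{offset:3d} ms  {label}")
```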
      

      Balancer configuration

      {
          "outcome" => "success",
          "result" => {
              "advertise-frequency" => 10000,
              "advertise-path" => "/",
              "advertise-protocol" => "http",
              "advertise-socket-binding" => "modcluster-adv",
              "broken-node-timeout" => 60000,
              "cached-connections-per-thread" => 40,
              "connection-idle-timeout" => 60,
              "connections-per-thread" => 40,
              "enable-http2" => false,
              "health-check-interval" => 10000,
              "http2-enable-push" => true,
              "http2-header-table-size" => undefined,
              "http2-initial-window-size" => undefined,
              "http2-max-concurrent-streams" => undefined,
              "http2-max-frame-size" => undefined,
              "http2-max-header-list-size" => undefined,
              "management-access-predicate" => undefined,
              "management-socket-binding" => "http",
              "max-ajp-packet-size" => undefined,
              "max-request-time" => -1,
              "request-queue-size" => 1000,
              "security-key" => undefined,
              "security-realm" => undefined,
              "use-alias" => false,
              "worker" => "default",
              "balancer" => {"mycluster" => {
                  "max-attempts" => 1,
                  "sticky-session" => true,
                  "sticky-session-cookie" => "JSESSIONID",
                  "sticky-session-force" => false,
                  "sticky-session-path" => undefined,
                  "sticky-session-remove" => false,
                  "wait-worker" => 0,
                  "load-balancing-group" => undefined,
                  "node" => {
                      "jboss-eap-7.0-3" => {
                          "aliases" => [
                              "default-host",
                              "localhost"
                          ],
                          "cache-connections" => 40,
                          "elected" => 0,
                          "flush-packets" => false,
                          "load" => 1,
                          "load-balancing-group" => undefined,
                          "max-connections" => 40,
                          "open-connections" => 1,
                          "ping" => 10,
                          "queue-new-requests" => true,
                          "read" => 0L,
                          "request-queue-size" => 1000,
                          "status" => "NODE_UP",
                          "timeout" => 0,
                          "ttl" => 60L,
                          "uri" => "ajp://192.168.122.172:8211/?#",
                          "written" => 0L,
                          "context" => {"/clusterbench" => {
                              "requests" => 0,
                              "status" => "enabled"
                          }}
                      },
                      "jboss-eap-7.0-2" => {
                          "aliases" => [
                              "default-host",
                              "localhost"
                          ],
                          "cache-connections" => 40,
                          "elected" => 0,
                          "flush-packets" => false,
                          "load" => 1,
                          "load-balancing-group" => undefined,
                          "max-connections" => 40,
                          "open-connections" => 1,
                          "ping" => 10,
                          "queue-new-requests" => true,
                          "read" => 0L,
                          "request-queue-size" => 1000,
                          "status" => "NODE_UP",
                          "timeout" => 0,
                          "ttl" => 60L,
                          "uri" => "ajp://192.168.122.172:8110/?#",
                          "written" => 0L,
                          "context" => {"/clusterbench" => {
                              "requests" => 0,
                              "status" => "enabled"
                          }}
                      },
                      "jboss-eap-7.0-1" => {
                          "aliases" => [
                              "default-host",
                              "localhost"
                          ],
                          "cache-connections" => 40,
                          "elected" => 0,
                          "flush-packets" => false,
                          "load" => 1,
                          "load-balancing-group" => undefined,
                          "max-connections" => 40,
                          "open-connections" => 1,
                          "ping" => 10,
                          "queue-new-requests" => true,
                          "read" => 0L,
                          "request-queue-size" => 1000,
                          "status" => "NODE_UP",
                          "timeout" => 0,
                          "ttl" => 60L,
                          "uri" => "ajp://192.168.122.172:8009/?#",
                          "written" => 0L,
                          "context" => {"/clusterbench" => {
                              "requests" => 0,
                              "status" => "enabled"
                          }}
                      }
                  }
              }}
          }
      }
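
For comparing dumps taken at different moments, the per-node status and uri can be pulled out of this output with a small parser. This is a sketch under assumptions: `parse_nodes` is a hypothetical helper, the regex relies on the key order seen in this dump ("aliases", then "status", then "uri"), and `SAMPLE` is an abbreviated fragment of the output above. The report does not say when the dump was captured relative to the kill.

```python
import re

# Matches: "<node>" => { "aliases" ... "status" => "<status>" ... "uri" => "<uri>"
NODE_RE = re.compile(
    r'"(?P<name>[^"]+)"\s*=>\s*\{\s*"aliases".*?'
    r'"status"\s*=>\s*"(?P<status>[^"]+)".*?'
    r'"uri"\s*=>\s*"(?P<uri>[^"]+)"',
    re.DOTALL,
)


def parse_nodes(dump):
    """Return {node_name: {"status": ..., "uri": ...}} from a CLI dump."""
    return {
        match["name"]: {"status": match["status"], "uri": match["uri"]}
        for match in NODE_RE.finditer(dump)
    }


# Abbreviated fragment of the dump in this report (two of the three nodes).
SAMPLE = '''
"node" => {
    "jboss-eap-7.0-3" => {
        "aliases" => ["default-host", "localhost"],
        "status" => "NODE_UP",
        "uri" => "ajp://192.168.122.172:8211/?#",
        "context" => {"/clusterbench" => {"requests" => 0, "status" => "enabled"}}
    },
    "jboss-eap-7.0-2" => {
        "aliases" => ["default-host", "localhost"],
        "status" => "NODE_UP",
        "uri" => "ajp://192.168.122.172:8110/?#",
        "context" => {"/clusterbench" => {"requests" => 0, "status" => "enabled"}}
    }
}
'''

for name, info in parse_nodes(SAMPLE).items():
    print(name, info["status"], info["uri"])
```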
      

      How come we didn't catch it earlier?

      The test passed on other systems; even the earlier HTTP 503 reported in JBEAP-4086 was verified as fixed in CR2. This particular RHEL VM shows the error on almost every run. The setup is under investigation.

            People

              sdouglas1@redhat.com Stuart Douglas
              mbabacek1@redhat.com Michal Karm