ENTESB-9451 (Red Hat Fuse)

Fabric8 gateway does not close client connections even when the target container has crashed

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • jboss-fuse-6.3
    • Fabric8 v1

      1. Set up three child containers – one with gateway-http, and two with a CXF endpoint (e.g., the REST quickstart example).
      2. Make HTTP requests on the gateway port for the CXF endpoints using, e.g., curl.
      3. Note from the container logs that the client requests are directed alternately to the two service containers.
      4. Execute kill -9 on one of the service containers.
      5. Attempt to make multiple requests on the gateway from the HTTP client.
      6. Note that half of these requests appear to hang – the gateway never disconnects the clients, even after it knows that the target container is down. New requests might therefore be serviced correctly, while old requests cannot be retried until the client itself times out.
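
Steps 2, 5 and 6 can also be driven from a small script instead of curl. The following is only a sketch: the `probe` and `http_get` names, the attempt count and the timeout are illustrative choices, and the gateway URL in the usage comment is a placeholder rather than a value taken from this issue. It sends a series of GET requests with a client-side timeout and counts how many succeed, time out, or fail outright.

```python
import http.client
import socket
import urllib.error
import urllib.request


def http_get(url, timeout):
    """Fetch `url` once, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def probe(url, attempts=10, timeout=5.0, fetch=http_get):
    """Send `attempts` requests and count ok / timed_out / failed outcomes."""
    counts = {"ok": 0, "timed_out": 0, "failed": 0}
    for _ in range(attempts):
        try:
            fetch(url, timeout)
            counts["ok"] += 1
        except urllib.error.URLError as exc:
            # urlopen may wrap a timeout inside URLError.reason
            timed_out = isinstance(exc.reason, (socket.timeout, TimeoutError))
            counts["timed_out" if timed_out else "failed"] += 1
        except (socket.timeout, TimeoutError):
            # timeout raised while waiting for the response
            counts["timed_out"] += 1
        except (OSError, http.client.HTTPException):
            counts["failed"] += 1
    return counts


# Usage (placeholder URL; substitute the real gateway host, port and path):
#   print(probe("http://GATEWAY_HOST:GATEWAY_PORT/PATH", attempts=10))
```

If the behaviour described in step 6 is present, roughly half of the attempts made shortly after the kill -9 would be expected to land in `timed_out` rather than `failed`, because the gateway keeps the connection open instead of closing it.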

      When HTTP clients use the Fabric8 gateway to distribute requests for HTTP services among containers, a client of the gateway takes a surprisingly long time to notice, and react to, the failure of one of the service containers.

      The first problem, which might be unavoidable, is the time it takes for the gateway to realise that one of its target containers has crashed. It responds a lot more quickly to an orderly shut-down because, I presume, the container that is shutting down is able to close its ZK session properly. Detecting a crash, however, relies, I think, on ZK heartbeats and session timeouts.
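
To make the difference concrete, here is a rough model of the detection delay. It assumes the reporter's reading is right: a container's registration disappears as soon as its ZooKeeper session is closed cleanly, but after a crash it disappears only once the session times out. The function name and the 30-second figure are illustrative, not values taken from Fabric8, and the model ignores watch-notification latency.

```python
def detection_delay(clean_shutdown, session_timeout_s, since_last_heartbeat_s=0.0):
    """Approximate seconds until the gateway can learn a container is gone.

    clean_shutdown: True if the container closed its ZK session itself.
    session_timeout_s: negotiated ZK session timeout (assumed value).
    since_last_heartbeat_s: time between the last heartbeat and the crash.
    """
    if clean_shutdown:
        # Session closed explicitly: registration is removed right away.
        return 0.0
    # Crash: the server waits out the remainder of the session timeout.
    return max(0.0, session_timeout_s - since_last_heartbeat_s)


# Example with an assumed 30 s session timeout:
#   detection_delay(True, 30.0)        -> 0.0
#   detection_delay(False, 30.0, 10.0) -> 20.0
```

Under this model the delay after a crash is bounded by the session timeout, which is why the first problem may be unavoidable short of lowering that timeout.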

      The second problem, though, is that the gateway keeps a client connection open even after it has learned that the container it is communicating with on behalf of that client is no longer available. At a time when new requests on the gateway might be handled correctly, by being directed to one of the surviving containers, old requests are stuck, waiting for the client to time out.

      When the gateway realizes that one of its service endpoints is no longer available, for whatever reason, any client connections that are waiting on that endpoint ought to be closed immediately, so that clients can respond to the failure more promptly.
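
The requested behaviour can be sketched as a registry of in-flight client connections keyed by backend. This is not the Fabric8 gateway's code: the `InFlightRegistry` class and its method names are hypothetical, and the sketch uses plain sockets only to show the idea of closing stranded client connections when a backend is removed.

```python
import socket
import threading
from collections import defaultdict


class InFlightRegistry:
    """Tracks which client sockets are waiting on which backend (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._by_backend = defaultdict(set)

    def register(self, backend, client_sock):
        """Record that `client_sock` is waiting on a response from `backend`."""
        with self._lock:
            self._by_backend[backend].add(client_sock)

    def unregister(self, backend, client_sock):
        """Forget a connection once its request has completed normally."""
        with self._lock:
            conns = self._by_backend.get(backend)
            if conns is not None:
                conns.discard(client_sock)

    def backend_removed(self, backend):
        """Close every client connection stranded on `backend`; return the count."""
        with self._lock:
            stranded = self._by_backend.pop(backend, set())
        for sock in stranded:
            try:
                sock.shutdown(socket.SHUT_RDWR)
            except OSError:
                pass  # peer may already have gone away
            sock.close()
        return len(stranded)
```

A gateway would call `backend_removed` from whatever callback tells it that an endpoint has disappeared. An HTTP gateway could also send an error status such as 502 or 503 before closing; this issue asks only that the connection be closed.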

              atarocch@redhat.com Andrea Tarocchi
              rhn-support-kboone Kevin Boone