mod_cluster / MODCLUSTER-369

httpd should remove lost node/worker

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.2.0.Final, 1.2.6.Final
    • Fix Version/s: None
    • Component/s: Native (httpd modules)
    • Labels: None

    Description

      Suppose this environment is running on Amazon:

      • one virtual machine running HTTPD (let's call it FE)
      • two virtual machines running tomcat (let's call them NODE01 and NODE02)

      If the NODE02 virtual machine dies or crashes for any reason, HTTPD still reports it in the mod_cluster-manager view, sometimes with status OK and sometimes with status NOTOK.
      Moreover, some requests are still sent to the dead node.

      Please note that I reproduced this behavior simply by blocking network traffic with iptables on NODE02:
      iptables -A OUTPUT -p tcp -m state --state NEW,ESTABLISHED -m tcp --dport 6666 -j DROP
      iptables -A INPUT -m state --state NEW,ESTABLISHED -p tcp --dport 8009 -j DROP

      This way the Tomcat instance is not able to send its status, and HTTPD is not able to send traffic to port 8009.
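To make the reproduction repeatable, the two rules can be wrapped so that the same specifications are appended to isolate NODE02 and later deleted to bring it back. The following is only a sketch: the `RULES` list, `commands` and `run` are hypothetical helper names, and it relies on `iptables -D` removing a rule that matches the given specification. It prints the commands by default and only executes them (as root, on NODE02) when `dry_run` is False.

```python
import subprocess

# Rule specifications copied from the report; the chain name comes first.
RULES = [
    ["OUTPUT", "-p", "tcp", "-m", "state", "--state", "NEW,ESTABLISHED",
     "-m", "tcp", "--dport", "6666", "-j", "DROP"],
    ["INPUT", "-m", "state", "--state", "NEW,ESTABLISHED",
     "-p", "tcp", "--dport", "8009", "-j", "DROP"],
]


def commands(isolate):
    """Build iptables argv lists: -A appends the rules, -D deletes the same rules."""
    flag = "-A" if isolate else "-D"
    return [["iptables", flag, chain, *spec] for chain, *spec in RULES]


def run(isolate, dry_run=True):
    """Print the commands; execute them only when dry_run is False (needs root)."""
    for argv in commands(isolate):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Calling `run(True)` shows the commands that cut NODE02 off, and `run(False)` shows the ones that restore it, which makes it easier to check how HTTPD reacts when the node returns.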

      In my opinion there are two issues here:
      1. If HTTPD doesn't receive any STATUS from a worker, the worker must at least be marked as NOTOK.
      2. It seems that HTTPD considers the worker available during its retry policy, so some requests are still forwarded to that node and the worker appears to be flapping.
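The behavior requested in the two points above can be sketched as a small state model. This is not mod_cluster's implementation: the `Worker` class, the function names and the 15 second `STATUS_TIMEOUT` are assumptions made for illustration only. The idea is that a missing STATUS marks the worker NOTOK, and only a newly received STATUS (not an expired retry timer) marks it OK again.

```python
from dataclasses import dataclass

# Assumed value for illustration only; it is not a mod_cluster default.
STATUS_TIMEOUT = 15.0


@dataclass
class Worker:
    """Hypothetical proxy-side view of one backend node."""
    name: str
    last_status: float  # time (seconds) of the last STATUS message received
    ok: bool = True


def on_status(worker, now):
    """A STATUS message arrived: the only event that marks the worker OK."""
    worker.last_status = now
    worker.ok = True


def on_tick(worker, now):
    """Periodic check: mark the worker NOTOK once STATUS messages stop arriving."""
    if now - worker.last_status > STATUS_TIMEOUT:
        worker.ok = False


def on_request_failure(worker):
    """A forwarded request failed: mark NOTOK until the next STATUS arrives."""
    worker.ok = False


def eligible(workers):
    """Names of the workers that may receive new requests."""
    return [w.name for w in workers if w.ok]
```

Under this model a node that stops sending STATUS drops out of `eligible` after the timeout and stays out, so it cannot alternate between OK and NOTOK while it is unreachable.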

      Original thread: https://community.jboss.org/thread/234235

      Marking this as critical, since Amazon is one of the main players in virtualization, and if an instance disappears in a production environment (which can happen at any moment), the whole production environment is affected (since some requests are still sent to the dead node).

      Attached is the httpd error log file produced using mod_cluster 1.2.6.

      at 08:32:30 HTTPD was restarted (with two workers OK)
      at 08:33:10 instance NODE02 (IP: 10.2.2.2) disappears
      at 08:33:25 NODE02 is marked as NOTOK
      at 08:33:32 NODE02 is marked as OK
      at 08:33:41 NODE02 is marked as NOTOK
      ...and so on...

          People

            Assignee: rhn-engineering-jclere Jean-Frederic Clere
            Reporter: nichele_jira Stefano Nichele (Inactive)