Red Hat Data Grid / JDG-3967

Cluster is in a confusing state after restart from graceful shutdown - no hint that it is waiting for the restart to complete


Details

    • Bug
    • Resolution: Done
    • Critical
    • RHDG 8.4.2 GA
    • RHDG 8.1 GA, RHDG 8.0.1 GA
    • Clustering
    • None
    • Documentation (Ref Guide, User Guide, etc.), User Experience
    • Impediment
    • Steps to Reproduce:

      1. Start a cluster of 4 nodes.
      2. Add entries.
      3. Use the CLI command "shutdown cluster".
      4. Start only one node and try to read entries.

      The following log is shown:

      ERROR  [org.infinispan.interceptors.impl.InvocationContextInterceptor] ISPN000136: Error executing command PrepareCommand on Cache '___protobuf_metadata', writing keys [message.proto, .errors, message.proto.errors] java.lang.IllegalArgumentException: Command does not have a topology id
      ...

    Description

      After a cluster is stopped with "shutdown cluster" and then only partially restarted, there is no WARN or INFO message indicating that the cluster is in an incomplete state because not all nodes are back.

      If only a single node is started, it is still possible to add new entries, and entries can also be read, but the server throws exceptions.

      The expectation is a log message stating that the cluster of (a, b, ...) has only partially started after a graceful shutdown and that the missing nodes are (x, y, ...).
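
      To illustrate the kind of message being requested, here is a minimal sketch in plain Java. It is not Infinispan code: the class RestartStatus, its methods, the node names, and the message wording are all hypothetical, and it assumes the member list persisted at graceful shutdown is available as a simple list of names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not part of Infinispan: compares the members recorded
// at graceful shutdown with the members that have rejoined so far.
final class RestartStatus {

    /** Returns the expected members that have not rejoined, in expected order. */
    public static Set<String> missingNodes(List<String> expected, List<String> joined) {
        Set<String> missing = new LinkedHashSet<>(expected);
        missing.removeAll(joined);
        return missing;
    }

    /** Builds the warning text, or an empty Optional when all members are back. */
    public static Optional<String> incompleteRestartMessage(List<String> expected, List<String> joined) {
        Set<String> missing = missingNodes(expected, joined);
        if (missing.isEmpty()) {
            return Optional.empty();
        }
        // Assumed wording; the real log message and its ID are up to the project.
        return Optional.of(String.format(
                "Cluster %s is incompletely started after graceful shutdown; missing nodes: %s",
                joined, missing));
    }

    public static void main(String[] args) {
        List<String> expected = List.of("node-a", "node-b", "node-c", "node-d");
        List<String> joined = List.of("node-a");
        // Print the warning only while the cluster is incomplete.
        incompleteRestartMessage(expected, joined).ifPresent(System.out::println);
    }
}
```

      In the reproduction scenario (4 nodes shut down, 1 restarted), such a check would name the three nodes that are still missing instead of leaving the operator with only the PrepareCommand exception.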

      It should not be possible to access caches while the cluster is in this incomplete state.

      There should be a CLI/JMX option to interrupt the graceful start and set the cluster to a working state, even if this risks data loss.
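
      The two requests above (block cache access until all nodes are back, and offer an explicit override) could be sketched as follows. This is only an illustration under assumptions: RestartGate and its methods are hypothetical names, not an existing Infinispan, CLI, or JMX API, and how the override would actually be exposed is left open.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical gate, not part of Infinispan: rejects cache access until every
// member recorded at graceful shutdown has rejoined, unless an operator forces it.
final class RestartGate {
    private final Set<String> expected;
    private final Set<String> joined = new LinkedHashSet<>();
    private boolean forced;

    public RestartGate(Collection<String> expectedMembers) {
        this.expected = new LinkedHashSet<>(expectedMembers);
    }

    /** Records that a node has rejoined the cluster. */
    public synchronized void nodeJoined(String node) {
        joined.add(node);
    }

    /** True when all expected members are back or the override was used. */
    public synchronized boolean isAvailable() {
        return forced || joined.containsAll(expected);
    }

    /** Operator override: accept possible data loss and open the caches. */
    public synchronized void forceAvailable() {
        forced = true;
    }

    /** Throws if caches must not be accessed yet, naming the missing nodes. */
    public synchronized void checkAccess() {
        if (!isAvailable()) {
            Set<String> missing = new LinkedHashSet<>(expected);
            missing.removeAll(joined);
            throw new IllegalStateException(
                    "Cache access rejected: cluster restart incomplete, missing nodes " + missing);
        }
    }

    public static void main(String[] args) {
        RestartGate gate = new RestartGate(Set.of("node-a", "node-b"));
        gate.nodeJoined("node-a");
        try {
            gate.checkAccess(); // rejected: node-b has not rejoined
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        gate.forceAvailable(); // explicit operator decision
        gate.checkAccess();    // now passes
    }
}
```

      Keeping the override as a separate, explicit call means that possible data loss is always the result of an operator decision rather than a side effect of starting a single node.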
       
       


            People

              rh-ee-jbolina Jose Bolina
              rhn-support-wfink Wolf Fink
              Pavel Drobek Pavel Drobek