Red Hat Data Grid / JDG-3967

Cluster in a confusing state after restart from graceful shutdown - no hint that it is waiting for a complete restart


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • RHDG 8.4.2 GA
    • RHDG 8.1 GA, RHDG 8.0.1 GA
    • Clustering
    • None
    • Documentation (Ref Guide, User Guide, etc.), User Experience
    • Impediment

      Start a cluster of 4 nodes.
      Add entries.

      Run the CLI command "shutdown cluster".

      Start only one node and try to read entries.

      The following error is logged:

      ERROR  [org.infinispan.interceptors.impl.InvocationContextInterceptor] ISPN000136: Error executing command PrepareCommand on Cache '___protobuf_metadata', writing keys [message.proto, .errors, message.proto.errors] java.lang.IllegalArgumentException: Command does not have a topology id
      ...
       
       


      After a cluster is stopped with "shutdown cluster" and only partially restarted, no WARN or INFO message indicates that the cluster is in an incomplete state because not all nodes are back.

      With only a single node started, it is still possible to add new entries!
      Entries can also be read.
      However, the server throws exceptions.

      The expectation is a log message stating that the cluster of (a, b, ...) has started incompletely after a graceful shutdown and that the missing nodes are (x, y, ...).
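To illustrate the kind of message requested here, the following is a minimal sketch in Python, not Data Grid code. It assumes the server knows the member list that was persisted at graceful shutdown and the list of nodes that have rejoined so far. The function name `incomplete_restart_message`, the logger name, and the message wording are all hypothetical.

```python
import logging

log = logging.getLogger("restart_check")  # hypothetical logger name


def incomplete_restart_message(expected, joined):
    """Return a warning text if some expected nodes have not rejoined, else None.

    expected: node names recorded at graceful shutdown (assumed to be available)
    joined:   node names that have started again so far
    """
    missing = sorted(set(expected) - set(joined))
    if not missing:
        return None
    return (
        "Cluster restarted after graceful shutdown is incomplete: "
        f"joined={sorted(joined)}, missing={missing}"
    )


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # Four nodes were shut down together, only node "a" has been started again.
    message = incomplete_restart_message(["a", "b", "c", "d"], ["a"])
    if message:
        log.warning(message)
```

A message of this shape names both the nodes that are present and the nodes still awaited, which is the information missing from the current logs.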

      It should not be possible to access caches.

      There should be a CLI/JMX option to interrupt the graceful start and set the cluster to a working state, even if this might cause data loss.
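The two requests above (no cache access while the restart is incomplete, plus an operator override) can be sketched as a small gate. This is only an illustration of the requested behaviour under stated assumptions, not how Data Grid implements it. `RestartGate`, `ClusterNotReadyError`, and all method names are hypothetical.

```python
class ClusterNotReadyError(RuntimeError):
    """Raised when a cache is accessed before the restart is complete."""


class RestartGate:
    """Blocks cache access until all expected nodes rejoin or an operator overrides."""

    def __init__(self, expected_members):
        # Members recorded at graceful shutdown (assumed to be persisted).
        self.expected = set(expected_members)
        self.joined = set()
        self.forced = False

    def node_joined(self, name):
        self.joined.add(name)

    def missing(self):
        return sorted(self.expected - self.joined)

    def force_available(self):
        """Operator override: accept possible data loss and open the gate.

        Returns the nodes that were still missing at the time of the override.
        """
        self.forced = True
        return self.missing()

    def is_available(self):
        return self.forced or not self.missing()

    def check_access(self):
        """Call before any cache read or write."""
        if not self.is_available():
            raise ClusterNotReadyError(
                f"cluster restart incomplete, waiting for nodes {self.missing()}"
            )


if __name__ == "__main__":
    gate = RestartGate(["a", "b", "c", "d"])
    gate.node_joined("a")
    try:
        gate.check_access()
    except ClusterNotReadyError as err:
        print("access refused:", err)
    print("override, giving up on:", gate.force_available())
    gate.check_access()  # no longer raises after the override
```

Returning the missing nodes from the override makes the potential data loss explicit to the operator who triggers it.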
       
       

            rh-ee-jbolina Jose Bolina
            rhn-support-wfink Wolf Fink
            Pavel Drobek Pavel Drobek (Inactive)