WildFly Core / WFCORE-6739

Add ServerActivity ordering to SuspendController execution


      WFLY-17742 introduces the need for some kind of ordering semantics in graceful shutdown.

      The transaction subsystem has a ServerActivity that suspends the recovery manager. The recovery manager needs to suspend before the general shutdown of MSC services begins, because tx recovery may need to use those services; hence it happens as part of server suspend. For WFLY-17742, however, when the recovery manager suspends it also needs to turn off the ability to create new transactions. If it does this before other ServerActivities have completed, the inability to create new transactions may break in-flight requests, and the suspend is no longer 'graceful'. So we need tx recovery to be able to suspend after other activities, but before MSC shutdown begins.

      There are various ways we might approach this:

      1) One is to add a new method to ServerActivity (say, 'postSuspend' until we think of something better), and that method gets called after the existing 'suspend' is called. The tx recovery work is moved to this new method, so it happens 'after'.

      2) Another is to add a method to ServerActivity that returns some kind of ordering or priority value (say an int or an enum) and SuspendController organizes its calls to the registered activities based on those values. This is a more flexible mechanism than adding a single new 'phase' via a 'postSuspend' method. (The fact that 'postSuspend' is not a good method name and I'm not easily thinking of a better one is a sign to me that a more flexible approach is appropriate.)

      3) A third possibility is to develop some kind of dependency mechanism between ServerActivity instances. This is more technically valid, but is more fragile, as it can break if someone adds a new ServerActivity and forgets to record dependencies. It's also much harder to implement. So I don't think we should pursue this unless we decide the previous two approaches are not good options. The other approaches can break as well if people don't think correctly about when to do their suspend work, but after years of graceful shutdown experience so far we've only identified this one ordering issue, which would only need to be handled by one ServerActivity.

      (Note that there is no 'natural' dependency relationship between the existing ServerActivity instances; e.g. there's no reliable MSC service dependency relationship between the services that register them.)
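
To make approach 2 concrete, here is a minimal sketch of ordering by a priority value. Everything in it is hypothetical: OrderedActivity and OrderedSuspendSketch are simplified stand-ins, not the actual ServerActivity or SuspendController types, and the method name suspendOrder, the value 100 and the activity names are made up for illustration. It also runs synchronously, whereas a real implementation would presumably need to wait for one group of activities to finish suspending before starting the next.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for ServerActivity; NOT the WildFly Core interface. */
interface OrderedActivity {
    /** Assumed default group; activities that do not care about ordering use it. */
    int DEFAULT_ORDER = 0;

    /** Lower values suspend earlier. The name and the int type are assumptions. */
    default int suspendOrder() {
        return DEFAULT_ORDER;
    }

    void suspend();
}

/** Hypothetical stand-in for SuspendController. */
final class OrderedSuspendSketch {
    private final List<OrderedActivity> activities = new ArrayList<>();

    void register(OrderedActivity activity) {
        activities.add(activity);
    }

    /** Calls suspend() on every activity, lowest suspendOrder() first. */
    void suspendAll() {
        List<OrderedActivity> sorted = new ArrayList<>(activities);
        // List.sort is stable, so activities with equal order keep registration order.
        sorted.sort(Comparator.comparingInt(OrderedActivity::suspendOrder));
        for (OrderedActivity activity : sorted) {
            activity.suspend();
        }
    }

    /** Registers tx recovery first to show that it is still suspended last. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        OrderedSuspendSketch controller = new OrderedSuspendSketch();
        controller.register(new OrderedActivity() {
            @Override
            public int suspendOrder() {
                return 100; // made-up value meaning "after the default group"
            }

            @Override
            public void suspend() {
                log.add("tx-recovery");
            }
        });
        controller.register(() -> log.add("web"));
        controller.register(() -> log.add("ejb"));
        controller.suspendAll();
        return log;
    }
}
```

In this sketch only the tx recovery activity overrides the default, which matches the observation that a single ServerActivity currently needs special ordering; every other activity would be unaffected by the new method.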

      See https://github.com/wildfly/wildfly-proposals/pull/520 for more details.

            bstansbe@redhat.com Brian Stansberry