WildFly Core / WFCORE-4226

Propagation of exit code to Windows Service Control Manager

    Details

    • Type: Enhancement
    • Status: Pull Request Sent
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Scripts
    • Labels:
      None
    • Affects:
      Documentation (Ref Guide, User Guide, etc.), Release Notes, Migration, Compatibility/Configuration
    • Release Notes Text:
      If pull request #3293 (https://github.com/wildfly/wildfly-core/pull/3293) is merged as is, any existing Windows service (installed before this change) must be reinstalled. The new service.bat modifies the Procrun start parameters by adding an environment variable, ISSERVICE, which the modified domain.bat and standalone.bat require to transform the exit code correctly. The same applies to the service failure actions flag: the new service.bat turns it on when the Windows service is installed, whereas the old service.bat does not set it, and it is required for service recovery actions to take place.

      Description

      I run JBoss / WildFly as a Windows service (wildfly-service.exe, aka prunsrv.exe, aka Apache Commons Daemon Procrun) and have hit several issues that prevent Windows service recovery actions from working as expected. The Windows Service Control Manager (SCM) does not recognize that the WildFly Windows service has failed when WildFly crashes (for example, on OOM with the -XX:+CrashOnOutOfMemoryError JVM option), so SCM does not execute recovery actions at all.

      Below are the issues I found to be the root causes:

      1. The WildFly (JBoss) launch scripts (domain.bat and standalone.bat) sometimes fail to return the JVM exit code to the caller, depending on how the scripts are launched. They should explicitly use
        exit /B my_exit_code

        so that the exit code is always returned to the caller.

      2. Procrun (wildfly-service.exe) reports the Windows service as stopped even if the JVM process exits with a non-zero exit code (although this exit code is still returned to SCM).
      3. Neither Procrun nor the WildFly service.bat script turns on the failure actions flag for the Windows service they install. Because this flag is off, SCM doesn't treat a stopped state reported with a non-zero exit code as a service failure.
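
      The first issue can be illustrated with a minimal batch sketch. This is not the actual standalone.bat or domain.bat; JAVA and JAVA_ARGS are placeholder names assumed for illustration. It only shows the pattern of capturing the JVM exit code and returning it explicitly:

      ```batch
      @echo off
      setlocal

      rem Placeholder launch line: JAVA and JAVA_ARGS are assumed names,
      rem not the variables used by the real WildFly scripts.
      "%JAVA%" %JAVA_ARGS%

      rem Capture the JVM exit code immediately, before any other command
      rem can overwrite ERRORLEVEL.
      set "RC=%ERRORLEVEL%"

      rem Return the captured code explicitly so the caller always sees it.
      exit /B %RC%
      ```

      A caller could then check the result with `call script.bat` followed by `echo %ERRORLEVEL%` (script.bat being a hypothetical file name for the sketch above).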

      I suggest the following:

      1. Change service.bat to turn on the failure actions flag for the service installed by Procrun.
      2. Change service.bat to set an additional flag (an environment variable) indicating that WildFly is running as a Windows service. This flag is needed to transform the exit code: Procrun doesn't define its own error messages, and SCM treats the exit code reported by Procrun as a standard Windows System Error Code, so exit codes 1..15999 cannot be used. The exit code reported by Procrun (that is, the exit code of the WildFly launch script) therefore has to be adjusted so that it does not overlap with existing Windows System Error Codes.
      3. Change standalone.bat and domain.bat (PowerShell scripts are not used for Windows services) to explicitly return a non-zero exit code on errors. This exit code should be adjusted when WildFly runs as a Windows service (see the flag that service.bat adds to the Procrun start parameters, described above).
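
      Suggestions 2 and 3 could look roughly like the sketch below. It is not the code from the pull request: JAVA, JAVA_ARGS and SERVICE_EXIT_OFFSET are hypothetical names, the offset value 16000 is chosen only because it lies outside 1..15999, and testing ISSERVICE with `if defined` is an assumption about how the flag is read.

      ```batch
      @echo off
      setlocal

      rem Hypothetical offset, picked only to move non-zero exit codes
      rem out of the 1..15999 range mentioned in this issue.
      set "SERVICE_EXIT_OFFSET=16000"

      rem Placeholder launch line (assumed names, not the real script variables).
      "%JAVA%" %JAVA_ARGS%
      set "RC=%ERRORLEVEL%"

      rem Adjust the code only on failure and only when running as a service.
      rem ISSERVICE is the variable named in this issue; treating "defined"
      rem as "running as a Windows service" is an assumption.
      if not "%RC%"=="0" if defined ISSERVICE set /a "RC=RC+SERVICE_EXIT_OFFSET"

      exit /B %RC%
      ```

      For suggestion 1, Windows ships the command `sc failureflag <service name> 1`, which sets the same flag from an elevated command prompt. Whether service.bat would call it or use another mechanism is not stated here.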

      Refer to pull request #3293 in the wildfly/wildfly-core GitHub project.

              People

              • Assignee:
                Unassigned
              • Reporter:
                Marat Abrarov (mabrarov)
              • Votes:
                0
              • Watchers:
                4
