There appears to be an issue with Batch jobs being restarted when the server is suspended and then resumed. It happens when this attribute is set:
/subsystem=batch-jberet:write-attribute(name=restart-jobs-on-resume,value=true)
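For anyone trying this by hand, the scenario can be sketched as the CLI sequence below. This is only a rough sketch under assumptions: the deployment name and job name are taken from the log in this report, but the job-xml-name value is a guess (it assumes the job XML is named the same as the job), and the original run was driven by a test harness rather than by typing these operations manually.

```
# Enable restarting of stopped batch jobs when the server resumes
/subsystem=batch-jberet:write-attribute(name=restart-jobs-on-resume,value=true)

# Start a batch job in the deployment (job-xml-name is an assumption)
/deployment=batch-suspend.jar/subsystem=batch-jberet:start-job(job-xml-name=records-batchlet)

# Suspend the server while the job is still running, then resume it
:suspend
:resume

# Check the suspend state afterwards (RUNNING in the reported run)
/:read-attribute(name=suspend-state)
```

If the problem reproduces, the error should show up in the server log at the time of the :resume operation.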
The following error can be found in the log:
09:07:30,017 ERROR [org.wildfly.extension.batch] (management-handler-thread - 1) WFLYBATCH000016: Failed to restart execution 1 for job records-batchlet on deployment batch-suspend.jar: org.wildfly.security.authz.AuthorizationFailureException: ELY01088: Attempting to run as "$local" authorization operation failed
    at org.wildfly.security.elytron-base@2.1.0.Final//org.wildfly.security.auth.server.SecurityIdentity.createRunAsIdentity(SecurityIdentity.java:750)
    at org.wildfly.security.elytron-base@2.1.0.Final//org.wildfly.security.auth.server.SecurityIdentity.createRunAsIdentity(SecurityIdentity.java:725)
    at org.wildfly.extension.batch.jberet@28.0.0.Beta1//org.wildfly.extension.batch.jberet.deployment.JobOperatorService$BatchJobServerActivity.privilegedRunAs(JobOperatorService.java:568)
    at org.wildfly.extension.batch.jberet@28.0.0.Beta1//org.wildfly.extension.batch.jberet.deployment.JobOperatorService$BatchJobServerActivity.restartStoppedJobs(JobOperatorService.java:543)
    at org.wildfly.extension.batch.jberet@28.0.0.Beta1//org.wildfly.extension.batch.jberet.deployment.JobOperatorService$BatchJobServerActivity.resume(JobOperatorService.java:458)
    at org.jboss.as.server@20.0.0.Beta8//org.jboss.as.server.suspend.SuspendController.resume(SuspendController.java:128)
    at org.jboss.as.server@20.0.0.Beta8//org.jboss.as.server.suspend.SuspendController.resume(SuspendController.java:106)
    at org.jboss.as.server@20.0.0.Beta8//org.jboss.as.server.operations.ServerResumeHandler$1$1.handleResult(ServerResumeHandler.java:74)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AbstractOperationContext$Step.invokeResultHandler(AbstractOperationContext.java:1570)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AbstractOperationContext$Step.handleResult(AbstractOperationContext.java:1552)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AbstractOperationContext$Step.finalizeInternal(AbstractOperationContext.java:1509)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AbstractOperationContext$Step.finalizeStep(AbstractOperationContext.java:1482)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AbstractOperationContext.executeResultHandlerPhase(AbstractOperationContext.java:910)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AbstractOperationContext.executeDoneStage(AbstractOperationContext.java:896)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AbstractOperationContext.processStages(AbstractOperationContext.java:803)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AbstractOperationContext.executeOperation(AbstractOperationContext.java:466)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.OperationContextImpl.executeOperation(OperationContextImpl.java:1431)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.ModelControllerImpl.internalExecute(ModelControllerImpl.java:448)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.ModelControllerImpl.lambda$executeForResponse$0(ModelControllerImpl.java:259)
    at org.wildfly.security.elytron-base@2.1.0.Final//org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:304)
    at org.wildfly.security.elytron-base@2.1.0.Final//org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:270)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.ModelControllerImpl.executeForResponse(ModelControllerImpl.java:259)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.ModelControllerImpl.executeOperation(ModelControllerImpl.java:253)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.ModelControllerImpl.execute(ModelControllerImpl.java:236)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler.doExecute(ModelControllerClientOperationHandler.java:241)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler$1$1.run(ModelControllerClientOperationHandler.java:163)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler$1$1.run(ModelControllerClientOperationHandler.java:159)
    at org.wildfly.security.elytron-base@2.1.0.Final//org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:328)
    at org.wildfly.security.elytron-base@2.1.0.Final//org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:285)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AccessAuditContext.doAs(AccessAuditContext.java:254)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.AccessAuditContext.doAs(AccessAuditContext.java:225)
    at org.jboss.as.controller@20.0.0.Beta8//org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler$1.execute(ModelControllerClientOperationHandler.java:159)
    at org.jboss.as.protocol@20.0.0.Beta8//org.jboss.as.protocol.mgmt.ManagementRequestContextImpl$1.doExecute(ManagementRequestContextImpl.java:70)
    at org.jboss.as.protocol@20.0.0.Beta8//org.jboss.as.protocol.mgmt.ManagementRequestContextImpl$AsyncTaskRunner.run(ManagementRequestContextImpl.java:160)
    at org.jboss.threads@2.4.0.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
    at org.jboss.threads@2.4.0.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1990)
    at org.jboss.threads@2.4.0.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
    at org.jboss.threads@2.4.0.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
    at java.base/java.lang.Thread.run(Thread.java:829)
    at org.jboss.threads@2.4.0.Final//org.jboss.threads.JBossThread.run(JBossThread.java:513)
2023-04-06 09:07:30.019 INFO o.j.q.s.s.StandaloneServerManager: Waiting for server to be running
2023-04-06 09:07:30.519 INFO o.w.e.c.c.o.OnlineManagementClient: Reconnecting the client
2023-04-06 09:07:30.578 INFO org.jboss.as.cli.CommandContext: Warning! The CLI is running in a non-modular environment and cannot load commands from management extensions.
2023-04-06 09:07:30.586 DEBUG o.w.e.c.c.o.OnlineManagementClient: Executing operation /:read-children-types
2023-04-06 09:07:30.589 DEBUG o.w.e.c.c.o.OnlineManagementClient: Executing operation /:read-attribute(name=server-state)
2023-04-06 09:07:30.591 DEBUG o.w.e.c.c.o.OnlineManagementClient: Executing operation /:read-attribute(name=server-state)
2023-04-06 09:07:30.593 DEBUG o.w.e.c.c.o.OnlineManagementClient: Executing operation /:read-attribute(name=suspend-state)
2023-04-06 09:07:30.595 INFO o.j.q.s.s.StandaloneServerManager: Current suspend state is: RUNNING
Based on the error, there is some issue with authorization, but the Batch job was started successfully before the server was suspended and resumed. Is it possible that some auth token or context was thrown away in the meantime?
It seems that this commit (WFLY-16863) is the culprit; see the relevant PR.
I was also able to identify that this commit (WFLY-17156) is the culprit. Before this commit it works as expected; after it (tested with this commit and with WildFly 28.0.0.Beta1) the aforementioned error is produced. TBH, based on the changes in that commit I don't really know how it could affect this behavior. I was thinking it might be something in the deployment processor changes there, but I don't really know...
This issue is already present in the released wildfly-28.0.0.Beta1.zip.
- clones: WFLY-17853 Batch job fails to restart on server resume after server suspend (Open)