While testing tx recovery in OpenShift I see that scale down of pod that has transaction in-doubt on it isn't successful
Scenario:
ejb client (app tx-client, pod tx-client-0):
- EJB business method
- lookup remote EJB
- enlist XA resource 1 to transaction
- enlist XA resource 2 to transaction
- call remote EJB
ejb server (app tx-server, pod tx-server-0):
- EJB business method
- enlist XA resource 1 to transaction
- enlist XA resource 2 to transaction
ejb server XA resource 2 fails with XAException(XAException.XAER_RMFAIL)
Then the test calls scale down (size from 1 to 0) on tx-server pod. But scale down never completes.
Log from operator:
{"level":"info","ts":1568905676.6303623,"logger":"controller_wildflyserver","msg":"Transaction recovery scaledown processing","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Pod Name":"tx-server-0","IP Address":"172.17.0.10"} {"level":"info","ts":1568905676.7313502,"logger":"controller_wildflyserver","msg":"Enabling recovery listener for processing scaledown at tx-server-0","Request.Namespace":"msimka-namespace","Request.Name":"tx-server"} {"level":"info","ts":1568905679.4325309,"logger":"controller_wildflyserver","msg":"Query to find the transaction recovery port to force scan at pod tx-server-0","Request.Namespace":"msimka-namespace","Request.Name":"tx-server"} {"level":"info","ts":1568905686.7914035,"logger":"controller_wildflyserver","msg":"Executing recovery scan at tx-server-0","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Pod IP":"172.17.0.10","Recovery port":4712} {"level":"info","ts":1568905702.0583296,"logger":"controller_wildflyserver","msg":"In-doubt transactions in object store","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Pod Name":"tx-server-0","Message":"Recovery scan to be invoked as the transaction log storage is not empty for pod scaling down pod tx-server-0, transaction list: map[0:ffffac11000a:991c183:5d8399a1:13:map[age-in-seconds:23 id:0:ffffac11000a:991c183:5d8399a1:13 jmx-name:<nil> participants:map[java:/MockXAResource:map[eis-product-name:MockXAResource Test eis-product-version:0.1.Mock jmx-name:<nil> jndi-name:java:/MockXAResource status:PREPARED type:/StateManager/AbstractRecord/XAResourceRecord]] type:StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/SubordinateAtomicAction/JCA]]"} {"level":"info","ts":1568905711.1034026,"logger":"controller_wildflyserver","msg":"Reconciling WildFlyServer","Request.Namespace":"msimka-namespace","Request.Name":"tx-server"} {"level":"info","ts":1568905711.103548,"logger":"controller_wildflyserver","msg":"Transaction recovery scaledown processing","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Pod Name":"tx-server-0","IP Address":"172.17.0.10"} {"level":"info","ts":1568905711.109706,"logger":"controller_wildflyserver","msg":"Executing recovery scan at tx-server-0","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Pod IP":"172.17.0.10","Recovery port":4712} {"level":"error","ts":1568905711.2608829,"logger":"controller_wildflyserver","msg":"Failures during scaling down recovery processing","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Desired replica size":0,"Number of pods to be removed":1,"error":"Found 1 errors:\n [[Failed to run transaction recovery scan for scaling down pod tx-server-0. Please, verify the pod log file. Error: Error to get response for command SCAN sending to 172.17.0.10:4712, error: EOF]],","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\ngithub.com/wildfly/wildfly-operator/pkg/controller/wildflyserver.(*ReconcileWildFlyServer).Reconcile\n\t/go/src/github.com/wildfly/wildfly-operator/pkg/controller/wildflyserver/wildflyserver_controller.go:240\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:215\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:88"} {"level":"info","ts":1568905711.2609518,"logger":"controller_wildflyserver","msg":"Scaling down statefulset by verification if pods are clean by recovery","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-server"} {"level":"info","ts":1568905711.2609615,"logger":"controller_wildflyserver","msg":"Statefulset was not fully scaled to the desired replica size 0 while StatefulSet is to be at size 1. Some pods were not cleaned by recovery. Verify status of the WildflyServer tx-server","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-server"} {"level":"info","ts":1568905711.2795022,"logger":"controller_wildflyserver","msg":"Updating StatefulSet to be up to date with the WildFlyServer Spec","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-server"} {"level":"info","ts":1568905711.2795491,"logger":"controller_wildflyserver","msg":"Reconciling WildFlyServer","Request.Namespace":"msimka-namespace","Request.Name":"tx-server"} {"level":"info","ts":1568905711.2796504,"logger":"controller_wildflyserver","msg":"Transaction recovery scaledown processing","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Pod Name":"tx-server-0","IP Address":"172.17.0.10"} {"level":"info","ts":1568905711.2937052,"logger":"controller_wildflyserver","msg":"Executing recovery scan at tx-server-0","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Pod IP":"172.17.0.10","Recovery port":4712} {"level":"error","ts":1568905711.294249,"logger":"controller_wildflyserver","msg":"Failures during scaling down recovery processing","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Desired replica size":0,"Number of pods to be removed":1,"error":"Found 1 errors:\n [[Failed to run transaction recovery scan for scaling down pod tx-server-0. Please, verify the pod log file. Error: Cannot process TCP connection to 172.17.0.10:4712, error: dial tcp 172.17.0.10:4712: connect: connection refused]],","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\ngithub.com/wildfly/wildfly-operator/pkg/controller/wildflyserver.(*ReconcileWildFlyServer).Reconcile\n\t/go/src/github.com/wildfly/wildfly-operator/pkg/controller/wildflyserver/wildflyserver_controller.go:240\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:215\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:88"} {"level":"info","ts":1568905711.294342,"logger":"controller_wildflyserver","msg":"Scaling down statefulset by verification if pods are clean by recovery","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-server"} {"level":"info","ts":1568905711.294417,"logger":"controller_wildflyserver","msg":"Statefulset was not fully scaled to the desired replica size 0 while StatefulSet is to be at size 1. Some pods were not cleaned by recovery. Verify status of the WildflyServer tx-server","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-server"} {"level":"error","ts":1568905711.311673,"logger":"controller_wildflyserver","msg":"Failed to Update StatefulSet.","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-server","error":"Operation cannot be fulfilled on statefulsets.apps \"tx-server\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\ngithub.com/wildfly/wildfly-operator/pkg/controller/wildflyserver.(*ReconcileWildFlyServer).checkStatefulSet\n\t/go/src/github.com/wildfly/wildfly-operator/pkg/controller/wildflyserver/wildflyserver_controller.go:470\ngithub.com/wildfly/wildfly-operator/pkg/controller/wildflyserver.(*ReconcileWildFlyServer).Reconcile\n\t/go/src/github.com/wildfly/wildfly-operator/pkg/controller/wildflyserver/wildflyserver_controller.go:247\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:215\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:88"} {"level":"error","ts":1568905711.311745,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"wildflyserver-controller","request":"msimka-namespace/tx-server","error":"Operation cannot be fulfilled on statefulsets.apps \"tx-server\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:217\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:88"} {"level":"info","ts":1568905712.3137681,"logger":"controller_wildflyserver","msg":"Reconciling WildFlyServer","Request.Namespace":"msimka-namespace","Request.Name":"tx-server"} {"level":"info","ts":1568905712.3139439,"logger":"controller_wildflyserver","msg":"Transaction recovery scaledown processing","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Pod Name":"tx-server-0","IP Address":"172.17.0.10"} {"level":"info","ts":1568905712.3253288,"logger":"controller_wildflyserver","msg":"Executing recovery scan at tx-server-0","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Pod IP":"172.17.0.10","Recovery port":4712} {"level":"error","ts":1568905712.3255754,"logger":"controller_wildflyserver","msg":"Failures during scaling down recovery processing","Request.Namespace":"msimka-namespace","Request.Name":"tx-server","Desired replica size":0,"Number of pods to be removed":1,"error":"Found 1 errors:\n [[Failed to run transaction recovery scan for scaling down pod tx-server-0. Please, verify the pod log file. Error: Cannot process TCP connection to 172.17.0.10:4712, error: dial tcp 172.17.0.10:4712: connect: connection refused]],","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\ngithub.com/wildfly/wildfly-operator/pkg/controller/wildflyserver.(*ReconcileWildFlyServer).Reconcile\n\t/go/src/github.com/wildfly/wildfly-operator/pkg/controller/wildflyserver/wildflyserver_controller.go:240\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:215\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.1.12/pkg/internal/controller/controller.go:158\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:134\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190221213512-86fb29eff628/pkg/util/wait/wait.go:88"} {"level":"info","ts":1568905712.3256311,"logger":"controller_wildflyserver","msg":"Scaling down statefulset by verification if pods are clean by recovery","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-server"} {"level":"info","ts":1568905712.3256419,"logger":"controller_wildflyserver","msg":"Statefulset was not fully scaled to the desired replica size 0 while StatefulSet is to be at size 1. Some pods were not cleaned by recovery. Verify status of the WildflyServer tx-server","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-server"}
- is related to
-
WFLY-12922 server scale down keeps data in client's data/ejb-xa-recovery and transactions on client aren't commited
- Closed