Type: Bug
Resolution: Unresolved
Priority: Major
Affects Version/s: OP-2.3.10.GA
Trying to run the application at https://github.com/kabir/eap-operator-tx-recovery-demo/tree/JBEAP-24814. I've also attached an archive of it.
I deploy everything as mentioned in the application README, scale the application to three pods, and then make sure that pods eap7-app-1 and eap7-app-2 each hold an unfinished transaction (steps below).
After almost an hour (and 5 hours yesterday) I still have three pods, with eap7-app-1 and eap7-app-2 stuck in the SCALING_DOWN_RECOVERY_INVESTIGATION state:
% oc get wfly eap7-app --template={{.status}} -w
map[hosts:[eap7-app-route-myproject.apps.cluster-l4fgj.l4fgj.sandbox309.opentlc.com] pods:[map[name:eap7-app-0 podIP:10.129.2.23 state:ACTIVE] map[name:eap7-app-1 podIP:10.128.2.13 state:ACTIVE] map[name:eap7-app-2 podIP:10.131.0.35 state:ACTIVE]] replicas:3 scalingdownPods:0 selector:app.kubernetes.io/name=eap7-app]
map[hosts:[eap7-app-route-myproject.apps.cluster-l4fgj.l4fgj.sandbox309.opentlc.com] pods:[map[name:eap7-app-0 podIP:10.129.2.23 state:ACTIVE] map[name:eap7-app-1 podIP: state:ACTIVE] map[name:eap7-app-2 podIP:10.131.0.35 state:ACTIVE]] replicas:3 scalingdownPods:0 selector:app.kubernetes.io/name=eap7-app]
map[hosts:[eap7-app-route-myproject.apps.cluster-l4fgj.l4fgj.sandbox309.opentlc.com] pods:[map[name:eap7-app-0 podIP:10.129.2.23 state:ACTIVE] map[name:eap7-app-1 podIP:10.128.2.14 state:ACTIVE] map[name:eap7-app-2 podIP:10.131.0.35 state:ACTIVE]] replicas:3 scalingdownPods:0 selector:app.kubernetes.io/name=eap7-app]

--- SCALING TO 1 ----

map[hosts:[eap7-app-route-myproject.apps.cluster-l4fgj.l4fgj.sandbox309.opentlc.com] pods:[map[name:eap7-app-0 podIP:10.129.2.23 state:ACTIVE] map[name:eap7-app-1 podIP:10.128.2.14 state:ACTIVE] map[name:eap7-app-2 podIP:10.131.0.35 state:ACTIVE]] replicas:3 scalingdownPods:0 selector:app.kubernetes.io/name=eap7-app]
map[hosts:[eap7-app-route-myproject.apps.cluster-l4fgj.l4fgj.sandbox309.opentlc.com] pods:[map[name:eap7-app-0 podIP:10.129.2.23 state:ACTIVE] map[name:eap7-app-1 podIP:10.128.2.14 state:SCALING_DOWN_RECOVERY_INVESTIGATION] map[name:eap7-app-2 podIP:10.131.0.35 state:SCALING_DOWN_RECOVERY_INVESTIGATION]] replicas:3 scalingdownPods:2 selector:app.kubernetes.io/name=eap7-app]
map[hosts:[eap7-app-route-myproject.apps.cluster-l4fgj.l4fgj.sandbox309.opentlc.com] pods:[map[name:eap7-app-0 podIP:10.129.2.23 state:ACTIVE] map[name:eap7-app-1 podIP:10.128.2.40 state:SCALING_DOWN_RECOVERY_INVESTIGATION] map[name:eap7-app-2 podIP:10.128.2.41 state:SCALING_DOWN_RECOVERY_INVESTIGATION]] replicas:3 scalingdownPods:2 selector:app.kubernetes.io/name=eap7-app]
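For what it's worth, the same watch can be narrowed to just the per-pod names and states, which is easier to follow than the raw template dump above (a sketch using standard jsonpath output against the same .status.pods fields; this is not part of the demo scripts):
% oc get wfly eap7-app -w -o jsonpath='{range .status.pods[*]}{.name}{"\t"}{.state}{"\n"}{end}'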
As I understand it, this should take only a minute.
The commands to get to this stage are:
[~/sourcecontrol/eap-operator-tx-recovery-demo]
% ./demo.sh add one
--- SNIP ---
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 202 Accepted
--- SNIP ---
eap7-app-1%
The above was added on pod 1. Now we try another add:
[~/sourcecontrol/eap-operator-tx-recovery-demo]
% ./demo.sh add two
--- SNIP ---
< HTTP/1.1 202 Accepted
--- SNIP ---
eap7-app-2%
The above was added on pod 2. Now we try another add:
% ./demo.sh add three
--- SNIP ---
< HTTP/1.1 409 Conflict
--- SNIP ---
The above hit one of the pods that already holds a transaction (eap7-app-1 or eap7-app-2), hence the 409 Conflict. So we try again:
% ./demo.sh add three
--- SNIP ---
< HTTP/1.1 202 Accepted
--- SNIP ---
eap7-app-0%
This worked and was added on pod 0. Now we release pod 0's transaction (the transactions on pods 1 and 2 are left hanging):
% ./demo.sh release 0
Now I try to scale the pods down to 1 (from 3).
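For reference, one way to request that scale-down directly on the CR is to merge-patch spec.replicas (a sketch, assuming the wfly/eap7-app names from the status command above; the demo README may use a different command):
% oc patch wfly eap7-app --type=merge -p '{"spec":{"replicas":1}}'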
Looking in the logs for pod 1, it looks like the pod is terminated before the EAP instance has a chance to be brought up. The attached logs below contain a few attempts at running oc logs -f eap7-app-1 (the stream seems to disconnect when the pod is terminated). Look for
ERROR *** WildFly wrapper process (1) received TERM signal ***
to see where OpenShift stops the pod.
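For example, that line can be located with plain grep over a saved copy of the log (nothing operator-specific, just standard oc logs and grep):
% oc logs eap7-app-1 > eap7-app-1.log
% grep -n 'received TERM signal' eap7-app-1.log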
is related to: JBEAP-24448 Operator TX recovery facility does not work with KitchenSink quickstart (Closed)