  1. JBoss Enterprise Application Platform
  2. JBEAP-23739

StatefulSet should not be modified while a pod is scaling down


    Description

      Context
      When EAP is used in K8s (together with its operator), the right way to trigger the transaction recovery is to decrease the number of replicas defined in the Custom Resource (CR) representing the EAP server. In this way, the operator guarantees that a pod (or multiple pods if the user decides to scale down more than one server) will be scaled down only when there are no transactions left in the object store.
       
      Reproducer
      AFAIK, there is a scenario in which the operator does not guarantee that the pod(s) selected for scale-down are removed only after the recovery of transactions has completed.
       
      Steps to reproduce this scenario:

      • Initiate the scale-down of an EAP pod that has in-doubt transactions (basically, set the replicas value of the CR to 0)
      • Modify the value of the StatefulSet’s replicas to match the value defined in the CR
      • Result -> the operator is not able to recreate the EAP pod and continue the recovery of the in-doubt transactions
        • [NB: In case `oc delete pod tx-*` is executed while the operator is waiting for the Object Store to become empty, the StatefulSet guarantees that the pod will be recreated; in this case, we are covered: the operator restarts the transaction recovery of the new pod]
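
The two reproducer steps above can be sketched as shell helpers around `oc`. This is only a sketch under assumptions: the CR is taken to be of kind `wildflyserver`, the CR and its StatefulSet are both assumed to be named `tx` (guessed from the `tx-*` pod pattern in the note above), and the function names and the `OC`/`CR_NAME` variables are hypothetical. Adjust all of them to the actual deployment.

```shell
#!/bin/sh
# Assumptions (not taken from the ticket): resource kind and names below.
OC="${OC:-oc}"             # CLI to use; set OC=echo for a dry run
CR_NAME="${CR_NAME:-tx}"   # assumed name of both the CR and its StatefulSet

# Step 1: request the scale-down through the CR (the supported path).
scale_cr() {
  "$OC" patch wildflyserver "$CR_NAME" --type=merge \
    -p "{\"spec\":{\"replicas\":$1}}"
}

# Step 2: change the StatefulSet directly while recovery is still pending.
scale_statefulset() {
  "$OC" scale statefulset "$CR_NAME" --replicas="$1"
}
```

Calling `scale_cr 0` and then, while the operator is still waiting for the object store to become empty, `scale_statefulset 0` should correspond to the sequence described above. Running with `OC=echo` only prints the commands without touching the cluster.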

       
      The purpose of this ticket
      The documentation of the EAP Operator explains the right procedure to make sure that transaction recovery is carried out. Nevertheless, this note should be modified:
       
      Decreasing the replica size of the StatefulSet or deleting the pod itself has no effect and such changes are reverted.

       In fact, when the StatefulSet is modified while the operator is controlling the scale-down of an EAP pod, the existence of the pod being scaled down is not guaranteed. I propose to modify the note with something like this:
       
      Deleting the pod itself has no effect and such changes are reverted. Decreasing the replica size of the StatefulSet also has no effect and such changes are reverted. Nevertheless, there is a corner case to be considered: when the replica size of the StatefulSet is decreased after the Operator has started the (artificial) scaling down of a pod connected to the StatefulSet, this modification stops the transaction recovery immediately because the pod is removed abruptly.

          People

            rhn-support-rchettri Rahuul Chettri
            jfinelli@redhat.com Manuel Finelli