  1. Cloud Enablement
  2. CLOUD-2789

Problems with cluster scale-down when SYM_ENCRYPT configured


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • None
    • EAP72 1.0.BETA, EAPCD 13.0.GA
    • EAP7, EAP_CD
    • None
    • Cloud Sprint 32, Cloud Sprint 33, Cloud Sprint 34, Cloud Sprint 35, Cloud Sprint 36, Cloud Sprint 37

      We are seeing problems when an EAP cluster that is under load is scaled down from 3 to 2 pods with SYM_ENCRYPT enabled. We see this both with an EAP on OpenShift 7.2 Beta test image and with test images that incorporate a build of current WildFly master. We believe these problems probably go back to CD 12; see, for example, the discussion of CLOUD-2417.

      A characteristic of the issue is lots of messages like these in the server logs:

      16:02:04,996 ERROR [org.jgroups.protocols.SYM_ENCRYPT] (thread-9,ee,hsc-1-z9f7g) hsc-1-z9f7g: received message without encrypt header from hsc-1-8dfqx; dropping it

      The condition results in failed requests, so it's not just log noise.
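To get a rough sense of how many messages are being dropped and which member they came from, a saved server log can be summarized along these lines. This is only a sketch: the function name `count_dropped_by_sender` is made up for illustration, and it assumes the log lines have the same shape as the message quoted above.

```shell
#!/bin/sh
# Count "received message without encrypt header" errors per sending member.
# Usage: count_dropped_by_sender LOGFILE
# Assumes the sender name follows the word "from" and ends with ";",
# as in the log line quoted in this issue.
count_dropped_by_sender() {
  awk '/received message without encrypt header from/ {
         for (i = 1; i < NF; i++) {
           if ($i == "from") {
             sender = $(i + 1)
             sub(/;$/, "", sender)   # strip the trailing semicolon
             counts[sender]++
             break
           }
         }
       }
       END { for (s in counts) print counts[s], s }' "$1" | sort -rn
}
```

For example, after saving a pod log with `oc logs hsc-1-z9f7g > hsc-1-z9f7g.log`, running `count_dropped_by_sender hsc-1-z9f7g.log` prints one "count sender" line per member whose messages were dropped.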

      Following are instructions from kwills@redhat.com on how to reproduce this:

      "I've pushed a WFLY image to the internal registry if anyone wants to give it a go:

      docker pull docker-registry.engineering.redhat.com/kwills/eap-cd-openshift:WFLY

      To reproduce the problem:

      docker pull docker-registry.engineering.redhat.com/kwills/eap-cd-openshift:WFLY
      docker tag docker-registry.engineering.redhat.com/kwills/eap-cd-openshift:WFLY jboss-eap-7-tech-preview/eap-cd-openshift:13.0
      oc cluster up
      run setup-ocp.sh from https://github.com/luck3y/openshift-util-scripts (clone the repo, then run it from inside the repo dir, this will set up your local env)
      run:
      oc -n myproject new-app eap-cd-https-s2i \
      -p APPLICATION_NAME=eap-clustering-test-1 \
      -p JGROUPS_ENCRYPT_SECRET=eap7-app-secret \
      -p JGROUPS_ENCRYPT_NAME="secret-key" \
      -p JGROUPS_ENCRYPT_PASSWORD="password"

      This will build and deploy an image using kitchensink. I had some issues getting the S2I builds working from CEE, so I just built the artifact and deployed it manually (attached).

      Scale up application to 3
      $ oc scale --replicas=3 dc/eap-clustering-test-1
      Deploy ROOT.war (it's in a directory called deployments locally)
      for i in `oc get pods | grep -v build | grep -v NAME | awk '{print $1}'`
      do
      echo $i
      oc rsync ./deployments/ $i:/deployments/
      done

      Wait for deployment to complete, then start making requests:
      for i in `seq 9999`
      do
      curl -c cookies -b cookies "http://eap-clustering-test-1-myproject.127.0.0.1.nip.io/Counter?requestId=$i";
      done

      While requests are executing, scale down to 2:
      $ oc scale --replicas=2 dc/eap-clustering-test-1
      (Just remember if you use this method, you'll need to redeploy with rsync if you bring up new pods.)

      One pod will terminate, and the other pods will begin logging exceptions that reference the terminated pod. Requests are either blocked or return a 503 until the application is scaled all the way down and then back up again."
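To put numbers on the failed requests rather than watching curl output scroll by, the request loop above could be adapted roughly as follows. This is a sketch, not part of Ken's instructions: the function names `probe_counter` and `tally_status` and the 5-second timeout are assumptions added here so that blocked requests do not stall the loop. curl reports a status of 000 when it gets no HTTP response at all.

```shell
#!/bin/sh
# Print one HTTP status code per request.
# Usage: probe_counter URL COUNT
# The 5-second --max-time is an assumed value, not from the original recipe.
probe_counter() {
  url=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    curl -s -o /dev/null -c cookies -b cookies --max-time 5 \
      -w '%{http_code}\n' "$url?requestId=$i"
    i=$((i + 1))
  done
}

# Read status codes from stdin and print "code count" for each distinct code.
tally_status() {
  sort | uniq -c | awk '{print $2, $1}'
}
```

With the cluster from the steps above running, `probe_counter "http://eap-clustering-test-1-myproject.127.0.0.1.nip.io/Counter" 9999 | tally_status` would print a line per status code seen, so a run that spans the scale-down shows how many requests returned 503 or timed out.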

      The image Ken refers to there is a test image that packages current WildFly master. To try a test image containing EAP 7.2 Beta, use docker-registry.engineering.redhat.com/bstansbe/eap72-beta-openshift:CLOUD-2694.

      I'll attach the deployment Ken referred to. I'll also attach test output (e.g. log files etc) of tests run against the 7.2 Beta test image and against the WF master test image, the latter with and without SYM_ENCRYPT.

        1. ROOT.war
          7 kB
        2. CLOUD-2694-7.2.beta-scale-down-3-to-2.tar.gz
          1.27 MB
        3. wf-image-no-jgroups-encrypt.zip
          497 kB
        4. wf-image-logs.zip
          1.52 MB
        5. eap-clustering-2-1-97v2q.log
          66 kB
        6. eap-clustering-2-1-rrpnk.log
          122 kB
        7. jgroups.jceks
          0.5 kB
        8. standalone-openshift-97v2q.xml
          37 kB
        9. standalone-openshift-rrpnk.xml
          37 kB

              rhn-engineering-rhusar Radoslav Husar
              bstansbe@redhat.com Brian Stansberry
              Marek Schmidt Marek Schmidt