Status: Resolved
Affects Version/s: 3.3.0.027, 3.3.0.026, 3.3.0.024, 3.3.0.023, 3.3.0.022, 3.3.0.021
Fix Version/s: 3.3.0.029
Steps to Reproduce:
The basics of my JMeter test plan are included in this ticket description. It's not a strong load, but reproducing the errors seems somewhat intermittent and requires at least some light load/concurrency on the Mura sites. However, it is easily and consistently reproduced with my JMeter test plan on various Railo patches/versions, as reported.
Our Mura CMS application started throwing ColdSpring exceptions, like the following, starting with Railo 3.3.0.021, getting drastically worse in Railo 3.3.0.022, and continuing to remain an issue through Railo 3.3.0.027:
there is a timeout occurred on a exclusive lock with name [bf_E805A31B-57C8-4D74-9687FAC3F180B796.bean_contentBean] after 10 seconds
there is a timeout occurred on a exclusive lock with name [bf_E805A31B-57C8-4D74-9687FAC3F180B796.bean_contentBean] after 60 seconds
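For anyone unfamiliar with the mechanism behind these messages: Railo guards ColdSpring bean creation with a named exclusive lock, and the exception fires when a request cannot acquire that lock within the timeout. Here is a minimal Python analogy of that pattern (illustrative only; the function names are invented and this is not Railo's actual implementation):

```python
import threading

# Railo keys its bean-factory locks by name, e.g. "bf_<appId>.bean_contentBean".
# Illustrative Python analogy of a timed named exclusive lock, not Railo's code.
_named_locks = {}
_registry_guard = threading.Lock()

def _named_lock(name):
    # Return the one lock object shared by all users of this name.
    with _registry_guard:
        return _named_locks.setdefault(name, threading.Lock())

def with_exclusive_lock(name, timeout, work):
    """Run work() while holding the named lock; raise if the lock cannot
    be acquired within `timeout` seconds, mimicking the Railo error above."""
    lock = _named_lock(name)
    if not lock.acquire(timeout=timeout):
        raise TimeoutError(
            f"timeout occurred on exclusive lock with name [{name}] "
            f"after {timeout} seconds")
    try:
        return work()
    finally:
        lock.release()
```

Under concurrency, any request that holds the lock longer than another request's timeout produces exactly this class of error, which is why light load is enough to surface it.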
We've been running Railo 3.3.0.015 on two different Mura sites for months, and I do recall seeing the above type of error, but extremely rarely, maybe once per month. When we upgraded two different servers to Railo 3.3.0.027 last week, we immediately started to see a flood of these errors from both servers. So it was clearly related to the Railo 3.3.0.027 patch, because these were two completely different servers running different Mura sites, with different versions of Mura. We cleared out all possible cache, restarted our Tomcat instances numerous times, and upgraded Mura to the latest release on one of the servers; the same ColdSpring errors listed above persisted. We removed the Railo 3.3.0.027 patch (rolling us back to 3.3.0.015) and the errors immediately ceased.
We already had a JMeter test plan, which had always run successfully, so I was easily able to repeat this test plan in an isolated server environment for all Railo patches (from 3.3.0.015 through 3.3.0.027, less a missing 3.3.0.025). The test plan's thread group used 4 threads (users) with a ramp-up period of 60 seconds, and the test plan looped 4 times; in a nutshell, 96 requests by 4 users over the course of approximately 5 minutes.
So, I was able to run our JMeter test plan in an isolated environment for every available Railo patch from 3.3.0.015 to 3.3.0.027. I cleared all possible cache and restarted our servlet container between each patch and test run (this was all scripted and therefore accurately repeated for each). Below is a summary of my findings, and I'll also attach a slightly more detailed text output that shows specific incident counts for various exceptions (though they're all essentially the same two errors shown above).
These patches were good (ran our test plan without any errors):
- 3.3.0.015
- 3.3.0.016
- 3.3.0.017
- 3.3.0.018
- 3.3.0.019
- 3.3.0.020
These patches were bad (test plan consistently produced errors):
- 3.3.0.021 – 5% exceptions
- 3.3.0.022 – 86% exceptions
- 3.3.0.023 – 50% exceptions
- 3.3.0.024 – 46% exceptions
- 3.3.0.025 – N.A. (404 error from patch download URL)
- 3.3.0.026 – 59% exceptions
- 3.3.0.027 – 50% exceptions
As you can see, 3.3.0.021 and 3.3.0.022 seem to be outliers (I repeated the test plan for both a few times, with the same results). So, 3.3.0.021 just barely starts to show this issue (5% exceptions), 3.3.0.022 has major problems (86% exceptions), and 3.3.0.023 through 3.3.0.027 all hover around 50% exceptions. Again, please also see the attached report for a little more detail. I hope this helps you get to the bottom of this issue!