Watch Joshua Bloch's presentation "Performance Anxiety" on Parleys: https://www.parleys.com/tutorial/performance-anxiety
Running a single benchmark just once is unreliable:
- 2 JVM processes running the same code on the same hardware can behave very differently performance-wise (measured as the score calculation count per second).
- If the randomSeed isn't fixed (for example in PRODUCTION mode), a different randomSeed will influence the score quality to a certain degree.
Our benchmarks need to show the impact of this, by making it easy to run every single benchmark n times. The benchmark report should show the average, the minimum and the maximum (maybe even as a candlestick diagram?), and maybe even the raw result of every separate benchmark run. The requirements need to be discussed further before implementation starts.
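A minimal sketch of the repeated-run idea, assuming a hypothetical runSingleBenchmark method that stands in for one real solver benchmark run (the class and method names here are illustrative, not the actual benchmarker API):

```java
import java.util.DoubleSummaryStatistics;
import java.util.Random;
import java.util.stream.DoubleStream;

public class RepeatedBenchmark {

    // Placeholder for one benchmark run: a real implementation would solve
    // the planning problem and return its score calculation count per second.
    // The randomSeed parameter makes a run reproducible when fixed.
    static double runSingleBenchmark(long randomSeed) {
        // Simulated run-to-run variance around 10000 calculations/second
        return 10000.0 + new Random(randomSeed).nextDouble() * 2000.0;
    }

    // Run the same benchmark n times and aggregate the scores,
    // so the report can show average, minimum and maximum.
    static DoubleSummaryStatistics benchmarkNTimes(int n) {
        return DoubleStream.iterate(0, i -> i + 1)
                .limit(n)
                .map(seed -> runSingleBenchmark((long) seed))
                .summaryStatistics();
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics stats = benchmarkNTimes(10);
        System.out.printf("avg=%.1f min=%.1f max=%.1f%n",
                stats.getAverage(), stats.getMin(), stats.getMax());
    }
}
```

Aggregating into a DoubleSummaryStatistics keeps the raw per-run handling trivial; the report could additionally retain the individual scores if the raw results of every separate run should be shown.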
Once this is implemented, the feature can be used to validate or invalidate the conclusions of old blog posts that ran just 1 single benchmark and presumed it was representative: