Type: Enhancement
Resolution: Done
Priority: Major
We can decode VarInts with a branchless parser instead of a while loop. Removing the data-dependent loop exit should reduce branch mispredictions on the CPU. A JMH benchmark is available at [1], with the following results:
Benchmark                                     (elementType)  (inputDistribution)  (inputs)  Mode  Cnt   Score   Error  Units
NumericParserBenchmark.parseNumberBranchless            INT                SMALL         1  avgt   20   4.810 ± 0.223  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT                SMALL         1  avgt   20   5.298 ± 0.539  ns/op
NumericParserBenchmark.parseNumberBranchless            INT                SMALL       128  avgt   20   4.164 ± 0.195  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT                SMALL       128  avgt   20   6.688 ± 2.269  ns/op
NumericParserBenchmark.parseNumberBranchless            INT                SMALL    128000  avgt   20   9.111 ± 0.593  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT                SMALL    128000  avgt   20  15.544 ± 1.132  ns/op
NumericParserBenchmark.parseNumberBranchless            INT                LARGE         1  avgt   20   6.619 ± 0.289  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT                LARGE         1  avgt   20  14.317 ± 1.525  ns/op
NumericParserBenchmark.parseNumberBranchless            INT               MEDIUM         1  avgt   20   3.613 ± 0.129  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT               MEDIUM         1  avgt   20   4.670 ± 0.533  ns/op
NumericParserBenchmark.parseNumberBranchless            INT               MEDIUM       128  avgt   20   4.772 ± 1.000  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT               MEDIUM       128  avgt   20   7.183 ± 1.055  ns/op
NumericParserBenchmark.parseNumberBranchless            INT               MEDIUM    128000  avgt   20  11.379 ± 0.620  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT               MEDIUM    128000  avgt   20  16.079 ± 0.971  ns/op
NumericParserBenchmark.parseNumberBranchless            INT                  ALL         1  avgt   20   3.759 ± 0.267  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT                  ALL         1  avgt   20   5.182 ± 0.792  ns/op
NumericParserBenchmark.parseNumberBranchless            INT                  ALL       128  avgt   20   5.842 ± 0.149  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT                  ALL       128  avgt   20   8.953 ± 1.271  ns/op
NumericParserBenchmark.parseNumberBranchless            INT                  ALL    128000  avgt   20  16.172 ± 1.114  ns/op
NumericParserBenchmark.parseNumberInfinispan            INT                  ALL    128000  avgt   20  17.555 ± 1.504  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG                SMALL         1  avgt   20   5.394 ± 0.255  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG                SMALL         1  avgt   20   7.804 ± 0.524  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG                SMALL       128  avgt   20   5.676 ± 0.396  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG                SMALL       128  avgt   20   7.493 ± 1.073  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG                SMALL    128000  avgt   20  12.252 ± 1.023  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG                SMALL    128000  avgt   20  16.152 ± 0.754  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG                LARGE         1  avgt   20   6.225 ± 0.232  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG                LARGE         1  avgt   20  18.368 ± 1.396  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG               MEDIUM         1  avgt   20   4.283 ± 0.129  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG               MEDIUM         1  avgt   20   6.546 ± 0.623  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG               MEDIUM       128  avgt   20   6.917 ± 0.386  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG               MEDIUM       128  avgt   20  10.355 ± 1.676  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG               MEDIUM    128000  avgt   20  14.167 ± 0.840  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG               MEDIUM    128000  avgt   20  18.337 ± 1.120  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG                  ALL         1  avgt   20   6.648 ± 0.169  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG                  ALL         1  avgt   20  16.557 ± 2.431  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG                  ALL       128  avgt   20   6.811 ± 0.096  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG                  ALL       128  avgt   20  12.062 ± 1.354  ns/op
NumericParserBenchmark.parseNumberBranchless           LONG                  ALL    128000  avgt   20  14.776 ± 0.908  ns/op
NumericParserBenchmark.parseNumberInfinispan           LONG                  ALL    128000  avgt   20  20.272 ± 1.042  ns/op
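The branchless parser has a lower average time in every case we test, saving from roughly 0.5 ns/op (INT, SMALL, 1 input) to roughly 12 ns/op (LONG, LARGE, 1 input).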
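For readers unfamiliar with the idea, here is a minimal sketch contrasting a loop-based VarInt decoder with a branchless one for 32-bit values. It is an illustration only and not necessarily the technique used in the benchmark at [1]. The class name `VarIntSketch` and all method names are hypothetical, and the branchless variant assumes a well-formed VarInt of at most 5 bytes with at least 8 readable bytes from `offset` (for example, a padded buffer).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Infinispan or the benchmark PR.
public class VarIntSketch {

    // Writes value as an unsigned VarInt (7 data bits per byte, MSB = "more bytes follow").
    // Returns the offset just past the last byte written.
    static int encode(int value, byte[] buf, int offset) {
        while ((value & ~0x7F) != 0) {
            buf[offset++] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        buf[offset++] = (byte) value;
        return offset;
    }

    // Classic decoder: the loop exit depends on the data, so the branch is hard to predict.
    static int decodeLoop(byte[] buf, int offset) {
        int result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf[offset++];
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    // Number of bytes the VarInt occupies, given 8 little-endian bytes in word.
    // Assumes a terminating byte (MSB clear) exists within those 8 bytes.
    static int lengthOf(long word) {
        long stop = ~word & 0x8080808080808080L; // MSB marker of every byte without continuation bit
        return (Long.numberOfTrailingZeros(stop) >>> 3) + 1;
    }

    // Branchless decoder: one 8-byte load, then only bit operations.
    // Assumes offset + 8 <= buf.length and an encoding of at most 5 bytes.
    static int decodeBranchless(byte[] buf, int offset) {
        long word = ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).getLong(offset);
        int len = lengthOf(word);
        word &= -1L >>> (64 - 8 * len); // keep only the bytes that belong to this VarInt
        long result = (word & 0x7FL)
                | ((word & 0x7F00L) >>> 1)
                | ((word & 0x7F0000L) >>> 2)
                | ((word & 0x7F000000L) >>> 3)
                | ((word & 0x7F00000000L) >>> 4);
        return (int) result;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 127, 128, 300, 16384, Integer.MAX_VALUE, -1};
        byte[] buf = new byte[16]; // padded so the 8-byte load never runs off the end
        for (int v : samples) {
            java.util.Arrays.fill(buf, (byte) 0);
            encode(v, buf, 0);
            if (decodeLoop(buf, 0) != v || decodeBranchless(buf, 0) != v) {
                throw new AssertionError("mismatch for " + v);
            }
        }
        System.out.println("all decoders agree");
    }
}
```

A real parser would also advance its read position by `lengthOf(word)` and handle input with fewer than 8 remaining bytes, which this sketch leaves out.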
[1] https://github.com/infinispan/infinispan-benchmarks/pull/23