Martin Thompson first reported on the cost of contention using a simple benchmark that measures the time to increment a 64-bit counter 500 million times using various strategies. Results were reported here (section 3.1) and here (Managing Contention vs. Doing Real Work).
I re-implemented this benchmark here: https://gist.github.com/nikolaybotevb/bc8cc1cdfa2f7cc212a915c487771d53
The results I observed (running Java 9 on a 2017 MacBook Pro with a 2.9 GHz 7th-generation Kaby Lake Intel Core i7 processor) are comparable to those Martin reported 7 years ago.
Method | Time (ms) Kaby Lake, Java 10 | Time (ms) Westmere |
---|---|---|
Single thread | 70 | 300 |
Single thread with volatile | 2,700 | 4,700 |
Single thread with CAS | 3,500 | 5,700 |
Single thread with synchronized | 2,000 | |
Single thread with lock | 9,300 | 10,000 |
Two threads with CAS | 10,800 | 18,000 |
Two threads with synchronized | 22,400 | |
Two threads with lock | 52,500 | 118,000 |
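
To make the rows of the table concrete, here is a minimal sketch of the increment strategies being compared. It is not the code from the gist: the class name `CounterStrategies`, its method names, and the helpers `runOnTwoThreads` and `timeMillis` are all hypothetical, and the iteration count in `main` is far smaller than the 500 million used in the benchmark. The CAS variant is written as an explicit `compareAndSet` retry loop so that the technique is visible.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: names are illustrative and not taken from the gist.
public class CounterStrategies {
    private long plain;                                  // no visibility or ordering guarantees
    private volatile long vol;                           // correct only with a single writer
    private final AtomicLong atomic = new AtomicLong();  // updated with compare-and-set
    private long synced;                                 // guarded by the monitor of 'this'
    private long locked;                                 // guarded by 'lock'
    private final Lock lock = new ReentrantLock();

    public void incrementPlain(long n) {
        for (long i = 0; i < n; i++) plain++;
    }

    public void incrementVolatile(long n) {
        for (long i = 0; i < n; i++) vol++;              // read-modify-write, not atomic
    }

    public void incrementCas(long n) {
        for (long i = 0; i < n; i++) {
            long current;
            do {
                current = atomic.get();
            } while (!atomic.compareAndSet(current, current + 1)); // retry if another thread won
        }
    }

    public void incrementSynchronized(long n) {
        for (long i = 0; i < n; i++) {
            synchronized (this) { synced++; }
        }
    }

    public void incrementLocked(long n) {
        for (long i = 0; i < n; i++) {
            lock.lock();
            try { locked++; } finally { lock.unlock(); }
        }
    }

    public long plain() { return plain; }
    public long vol() { return vol; }
    public long atomic() { return atomic.get(); }
    public synchronized long synced() { return synced; }
    public long locked() {
        lock.lock();
        try { return locked; } finally { lock.unlock(); }
    }

    // Runs the same work on two threads and waits for both to finish.
    public static void runOnTwoThreads(Runnable work) {
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Wall-clock time of one run in milliseconds; no JIT warm-up is performed.
    public static long timeMillis(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long n = 10_000_000L; // assumption: kept small so the sketch finishes quickly
        CounterStrategies c = new CounterStrategies();
        System.out.println("plain:        " + timeMillis(() -> c.incrementPlain(n)) + " ms");
        System.out.println("volatile:     " + timeMillis(() -> c.incrementVolatile(n)) + " ms");
        System.out.println("CAS:          " + timeMillis(() -> c.incrementCas(n)) + " ms");
        System.out.println("synchronized: " + timeMillis(() -> c.incrementSynchronized(n)) + " ms");
        System.out.println("lock:         " + timeMillis(() -> c.incrementLocked(n)) + " ms");
        System.out.println("2x CAS:       "
                + timeMillis(() -> runOnTwoThreads(() -> c.incrementCas(n / 2))) + " ms");
    }
}
```

In the two-thread rows each thread would perform half of the increments, as in the last line of `main`. Timings from a sketch like this are only indicative, since it does no warm-up and takes a single measurement per strategy.
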
While this micro-benchmark is not representative of real-world workloads (as explained here), I am tempted by its simplicity and plan to use it as the first benchmark for tracking optimizations to the air-java concurrency library. It would be followed by a more comprehensive benchmark like this one, which measures both latency and throughput under various configurations, and finally by a real-world application.