<!DOCTYPE html>
<html>
<title>benchmarks</title>
<xmp theme="united" style="display:none;">

# Benchmarks

The benchmarks are run with [**this**](../../scripts/bench/bench.py) script using CMake. There are 3 benchmarking scenarios:

- [the cost of including the header](#cost-of-including-the-header)
- [the cost of an assertion macro](#cost-of-an-assertion-macro)
- [runtime speed of lots of asserts](#runtime-benchmarks)

Compilers used:

- WINDOWS: Microsoft Visual Studio Community 2017 - Version 15.3.3+26730.12
- WINDOWS: gcc 7.1.0 (x86_64-posix-seh-rev2, Built by MinGW-W64 project)
- LINUX: gcc 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)
- LINUX: clang 4.0.0-1 (tags/RELEASE_400/rc1) Target: x86_64-pc-linux-gnu

Environment used (Intel i7 3770k, 16 GB RAM):

- Windows 7 - on an SSD
- Ubuntu 17.04 in a VirtualBox VM - on an HDD

**doctest** version: 1.2.2 (released on 2017.09.05)

[**Catch**](https://github.com/philsquared/Catch) version: 2.0.0-develop.3 (released on 2017.08.30)

# Compile time benchmarks

## Cost of including the header

This benchmark is relevant only to single-header and header-only frameworks - like **doctest** and [**Catch**](https://github.com/philsquared/Catch).

The script generates 201 source files. In 200 of them it defines a function of the form ```int f135() { return 135; }```, and in ```main.cpp``` it forward declares all 200 of these dummy functions and accumulates their results to return from the ```main()``` function. This ensures that all source files are built and that the linker doesn't remove/optimize anything.
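
The layout described above can be sketched in a few lines. This is only an illustration of the generated sources, not the code of the benchmark script itself: the helper names ```make_dummy_source``` and ```make_main_source``` are hypothetical, and numbering the functions from 0 is an assumption.

```cpp
#include <sstream>
#include <string>

// Source text for one dummy translation unit,
// e.g. "int f135() { return 135; }" for index 135.
std::string make_dummy_source(int index) {
    std::ostringstream out;
    out << "int f" << index << "() { return " << index << "; }\n";
    return out.str();
}

// Source text for main.cpp: forward declare f0..f(count-1) and sum their
// results, so that every translation unit is referenced by the final binary.
// (Assumption: functions are numbered from 0.)
std::string make_main_source(int count) {
    std::ostringstream out;
    for (int i = 0; i < count; ++i)
        out << "int f" << i << "();\n";
    out << "int main() {\n    int res = 0;\n";
    for (int i = 0; i < count; ++i)
        out << "    res += f" << i << "();\n";
    out << "    return res;\n}\n";
    return out.str();
}
```

Because ```main()``` uses the return value of every dummy function, none of the 200 translation units can be dropped from the build.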

- **baseline** - how much time the source files need for a single threaded build with ```msbuild```/```make```
- **+ implement** - only in ```main.cpp``` the header is included with a ```#define``` before it so the test runner gets implemented:

```
#define DOCTEST_CONFIG_IMPLEMENT_WITH_MAIN
#include "doctest.h"
```
- **+ header everywhere** - the framework header is also included in all the other source files
- **+ disabled** - remove everything testing-related from the binary

| doctest             | baseline | + implement | + header everywhere | + disabled |
|---------------------|----------|-------------|---------------------|------------|
| MSVC Debug          |     6.77 |        8.28 |               11.73 |       8.73 |
| MSVC Release        |     6.35 |        8.57 |               12.18 |       8.28 |
| MinGW GCC Debug     |    10.23 |       13.03 |               17.62 |      12.29 |
| MinGW GCC Release   |    10.33 |       13.68 |               17.87 |      13.11 |
| Linux GCC Debug     |     5.01 |        6.24 |               10.48 |       6.49 |
| Linux GCC Release   |     4.58 |        7.30 |               11.70 |       7.41 |
| Linux Clang Debug   |     8.80 |        9.70 |               14.92 |      10.89 |
| Linux Clang Release |     9.29 |       12.05 |               17.51 |      11.56 |

| Catch               | baseline | + implement | + header everywhere | + disabled |
|---------------------|----------|-------------|---------------------|------------|
| MSVC Debug          |     6.78 |       10.00 |              107.85 |     115.05 |
| MSVC Release        |     6.36 |       11.19 |              102.69 |     109.06 |
| MinGW GCC Debug     |    10.36 |       41.83 |              124.41 |     126.70 |
| MinGW GCC Release   |    10.49 |       21.93 |               97.81 |     105.47 |
| Linux GCC Debug     |     4.40 |       12.39 |               94.34 |      93.68 |
| Linux GCC Release   |     4.55 |       15.75 |               94.28 |      93.80 |
| Linux Clang Debug   |     9.30 |       15.00 |              105.84 |     103.05 |
| Linux Clang Release |     9.68 |       22.75 |              114.36 |     111.32 |

<img src="../../scripts/data/benchmarks/header.png" width="430" align="right">
<img src="../../scripts/data/benchmarks/implement.png" width="430">

### Conclusion

#### doctest

- instantiating the test runner in one source file costs roughly 1-3.5 seconds ```implement - baseline```
- the inclusion of ```doctest.h``` in one source file costs roughly 17ms - 27ms ```(header_everywhere - implement) / 200```
- including the library everywhere but with everything disabled costs less than 3 seconds ```disabled - baseline``` for 200 files

#### [Catch](https://github.com/philsquared/Catch)

- instantiating the test runner in one source file costs roughly 3-13 seconds in most configurations (about 31 seconds with MinGW GCC Debug) ```implement - baseline```
- the inclusion of ```catch.hpp``` in one source file costs roughly 380ms - 490ms ```(header_everywhere - implement) / 200```

----------

So if ```doctest.h``` costs about 17ms and ```catch.hpp``` costs about 490ms on MSVC Debug - then the **doctest** header is >> **28** << times lighter (for MSVC Debug)!

----------

The results are in seconds and are in **no way** intended to bash [**Catch**](https://github.com/philsquared/Catch) - the **doctest** framework wouldn't exist without it.

The **doctest** header is so light on compile times because it forward declares everything and doesn't drag any headers into the source files (except for the source file where the test runner gets implemented). This was a key design decision.
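
To illustrate the general technique (this is not code taken from ```doctest.h```), here is a sketch of a header that only forward declares a type and its functions, while the single implementation file is the only place that needs the heavy include. The ```Message``` type and its functions are hypothetical; both halves are shown in one listing for brevity.

```cpp
// --- what a lightweight header would contain --------------------------------
struct Message;                           // forward declaration, no definition
Message* make_message(const char* text);  // declared here, defined elsewhere
int message_length(const Message* msg);
void destroy_message(Message* msg);

// --- what the single implementation file would contain ----------------------
#include <string>  // the heavy include is needed only in this one file

struct Message {
    std::string text;
};

Message* make_message(const char* text) { return new Message{text}; }
int message_length(const Message* msg) { return static_cast<int>(msg->text.size()); }
void destroy_message(Message* msg) { delete msg; }
```

Every other source file sees only the declarations, so the compiler never has to parse ```<string>``` for them.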

## Cost of an assertion macro

The script generates 11 ```.cpp``` files and in 10 of them makes 50 test cases with 100 asserts each (of the form ```CHECK(a==b)``` where ```a``` and ```b``` are always the same ```int``` variables) - **50k** asserts! The testing framework gets implemented in ```main.cpp```.

- **baseline** - how much time a single threaded build takes with the header included everywhere - no test cases or asserts!
- ```CHECK(a==b)``` - will add ```CHECK()``` asserts which decompose the expression with template machinery
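
For readers unfamiliar with expression decomposition, the sketch below shows the general idea behind it. It is a simplified illustration under stated assumptions, not the actual implementation of either framework: the names ```Decomposer```, ```Lhs```, ```Result``` and ```SKETCH_CHECK``` are hypothetical, only ```==``` is handled, and the operands are stringified eagerly.

```cpp
#include <sstream>
#include <string>

// Result of a decomposed comparison: the verdict plus a printable expansion.
struct Result {
    bool passed;
    std::string expansion;
};

// Holds the left operand captured by operator<< below.
template <typename L>
struct Lhs {
    const L& lhs;

    template <typename R>
    Result operator==(const R& rhs) const {
        std::ostringstream out;
        out << lhs << " == " << rhs;
        return Result{lhs == rhs, out.str()};
    }
};

// operator<< binds tighter than operator==, so in "Decomposer() << a == b"
// the left operand "a" is captured first and then compared with "b".
struct Decomposer {
    template <typename L>
    Lhs<L> operator<<(const L& value) const { return Lhs<L>{value}; }
};

// Hypothetical macro name, chosen for this sketch only.
#define SKETCH_CHECK(expr) (Decomposer() << expr)
```

Each use of such a macro instantiates several templates, which is one reason why decomposing asserts are more expensive to compile than asserts that take their operands as separate arguments.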

**doctest** specific:

- ```CHECK_EQ(a,b)``` - will use ```CHECK_EQ(a,b)``` instead of the expression decomposing ones
- ```FAST_CHECK_EQ(a,b)``` - will use ```FAST_CHECK_EQ(a,b)``` instead of the expression decomposing ones
- **+faster** - will add [**```DOCTEST_CONFIG_SUPER_FAST_ASSERTS```**](configuration.html#doctest_config_super_fast_asserts) which speeds up ```FAST_CHECK_EQ(a,b)``` even more
- **+disabled** - all test case and assert macros will be disabled with [**```DOCTEST_CONFIG_DISABLE```**](configuration.html#doctest_config_disable)

[**Catch**](https://github.com/philsquared/Catch) specific:

- **+faster** - will add [**```CATCH_CONFIG_FAST_COMPILE```**](https://github.com/philsquared/Catch/blob/master/docs/configuration.html#catch_config_fast_compile) which speeds up the compilation of the normal asserts ```CHECK(a==b)```
- **+disabled** - all test case and assert macros will be disabled with **```CATCH_CONFIG_DISABLE```**

| doctest             | baseline | ```CHECK(a==b)``` | ```CHECK_EQ(a,b)``` | ```FAST_CHECK_EQ(a,b)``` | +faster | +disabled |
|---------------------|----------|-------------------|---------------------|--------------------------|---------|-----------|
| MSVC Debug          |     3.08 |             23.72 |               18.15 |                     8.38 |    5.67 |      2.23 |
| MSVC Release        |     3.61 |             43.75 |               24.28 |                    11.36 |    7.22 |      2.15 |
| MinGW GCC Debug     |     3.90 |             85.47 |               58.62 |                    24.40 |   12.12 |      1.71 |
| MinGW GCC Release   |     4.51 |            224.49 |              148.84 |                    47.25 |   18.73 |      2.40 |
| Linux GCC Debug     |     2.01 |             78.38 |               50.61 |                    17.62 |    9.87 |      1.11 |
| Linux GCC Release   |     3.20 |            199.78 |              123.42 |                    32.47 |   19.52 |      1.97 |
| Linux Clang Debug   |     1.71 |             77.39 |               49.97 |                    17.60 |    7.57 |      1.18 |
| Linux Clang Release |     3.64 |            136.82 |               80.19 |                    20.72 |   12.34 |      1.45 |

And here is [**Catch**](https://github.com/philsquared/Catch) which only has normal ```CHECK(a==b)``` asserts:

| Catch               | baseline | ```CHECK(a==b)``` | +faster | +disabled |
|---------------------|----------|-------------------|---------|-----------|
| MSVC Debug          |     9.58 |             37.69 |   25.21 |     10.40 |
| MSVC Release        |    10.85 |            260.55 |  121.38 |     11.56 |
| MinGW GCC Debug     |    36.24 |            159.15 |  133.98 |     33.57 |
| MinGW GCC Release   |    16.15 |            740.71 |  562.60 |     16.41 |
| Linux GCC Debug     |    12.71 |            142.92 |  108.07 |     12.05 |
| Linux GCC Release   |    15.62 |            825.42 |  612.06 |     15.51 |
| Linux Clang Debug   |    10.48 |            115.19 |   89.59 |     10.78 |
| Linux Clang Release |    18.25 |            393.31 |  316.98 |     17.19 |

<img src="../../scripts/data/benchmarks/asserts.png">

### Conclusion

**doctest**:

- takes roughly 33%-83% less time to compile than [**Catch**](https://github.com/philsquared/Catch) when using normal expression decomposing ```CHECK(a==b)``` asserts
- asserts of the form ```CHECK_EQ(a,b)``` with no expression decomposition - roughly 23%-45% less compile time than ```CHECK(a==b)```
- fast asserts like ```FAST_CHECK_EQ(a,b)``` with no ```try/catch``` blocks - roughly 53%-74% less compile time than ```CHECK_EQ(a,b)```
- the [**```DOCTEST_CONFIG_SUPER_FAST_ASSERTS```**](configuration.html#doctest_config_super_fast_asserts) identifier makes the fast assertions compile even faster, by another 32%-60%
- using the [**```DOCTEST_CONFIG_DISABLE```**](configuration.html#doctest_config_disable) identifier the assertions just disappear as if they were never written

[**Catch**](https://github.com/philsquared/Catch):

- using [**```CATCH_CONFIG_FAST_COMPILE```**](https://github.com/philsquared/Catch/blob/master/docs/configuration.html#catch_config_fast_compile) results in roughly 16%-53% shorter build times for asserts.

## Runtime benchmarks

The runtime benchmarks consist of a single test case with a loop of 10 million iterations performing the task - a single normal assert (using expression decomposition) or the assert + the logging of the loop iterator ```i```:

```
for(int i = 0; i < 10000000; ++i)
    CHECK(i == i);
```

or

```
for(int i = 0; i < 10000000; ++i) {
    INFO(i);
    CHECK(i == i);
}
```

Note that the assert always passes - the goal should be to optimize for the common case - lots of passing test cases and a few that maybe fail.

| doctest             | assert  | + info  | &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; | Catch               | assert  | + info  |
|---------------------|---------|---------|-|---------------------|---------|---------|
| MSVC Debug          |    5.04 |   13.03 | | MSVC Debug          |  101.07 |  338.41 |
| MSVC Release        |    0.73 |    1.67 | | MSVC Release        |    1.75 |   10.99 |
| MinGW GCC Debug     |    2.11 |    4.50 | | MinGW GCC Debug     |    4.76 |   18.22 |
| MinGW GCC Release   |    0.36 |    0.86 | | MinGW GCC Release   |    1.24 |    7.29 |
| Linux GCC Debug     |    2.49 |    4.97 | | Linux GCC Debug     |    5.41 |   19.01 |
| Linux GCC Release   |    0.29 |    0.66 | | Linux GCC Release   |    1.20 |    7.88 |
| Linux Clang Debug   |    2.39 |    4.76 | | Linux Clang Debug   |    5.12 |   17.66 |
| Linux Clang Release |    0.39 |    0.70 | | Linux Clang Release |    0.99 |    7.26 |

<img src="../../scripts/data/benchmarks/runtime_info.png" width="430" align="right">
<img src="../../scripts/data/benchmarks/runtime_assert.png" width="430">

Note that in these graphs the values for ```MSVC Release``` for **Catch** are 10 times smaller than the real ones (from the tables above) because Google Sheets didn't allow me to create a bar chart with values that were so different.

### Conclusion

**doctest** is significantly faster - roughly 2 to 26 times in the table above.

In these particular cases **doctest** makes 0 allocations when the assert doesn't fail - it uses lazy stringification (meaning it stringifies the expression or the logged loop counter only if it has to) and a small-buffer optimized string class to achieve these results.
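
The idea of lazy stringification can be sketched as follows. This is an illustration of the technique only, not **doctest**'s implementation: ```check_eq```, ```stringify``` and the call counter are hypothetical, and the small-buffer optimized string class is not shown.

```cpp
#include <sstream>
#include <string>

// Counts how often an operand is actually converted to text.
static int g_stringify_calls = 0;

template <typename T>
std::string stringify(const T& value) {
    ++g_stringify_calls;
    std::ostringstream out;
    out << value;
    return out.str();
}

// Compares first and builds the failure message only if the comparison fails.
template <typename L, typename R>
bool check_eq(const L& lhs, const R& rhs, std::string& failure_message) {
    if (lhs == rhs)
        return true;  // passing path: nothing is stringified
    failure_message = stringify(lhs) + " == " + stringify(rhs);
    return false;
}
```

In a loop of passing asserts the counter stays at 0, so the cost of formatting is paid only by the few asserts that fail.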

----------

If you want a benchmark that is not synthetic - check out [**this blog post**](http://baptiste-wicht.com/posts/2016/09/blazing-fast-unit-test-compilation-with-doctest-11.html) by [**Baptiste Wicht**](https://github.com/wichtounet) who tested the compile times of the asserts in the 1.1 release with his [**Expression Templates Library**](https://github.com/wichtounet/etl)!

While reading the post - keep in mind that if a part of a process takes 50% of the time and is made 10000 times faster - the overall process would still take roughly 50% of the original time (only about 2 times faster overall).
202
onqtam8126b562016-05-27 17:01:15 +0300203---------------
204
205[Home](readme.html#reference)
206
207
208</xmp>
209<script src="strapdown.js/strapdown.js"></script>
210</html>