bitcoin-bitcoin-core

mirror of https://github.com/bitcoin/bitcoin.git synced 2025-02-06 10:18:44 -05:00

Author	SHA1	Message	Date
Ryan Ofsky	824e1ffa9f	bench: Represents paths with fs::path instead of std::string Also uses fs::path quoting in bench printed strings and fixes a misleading error message. Originally suggested https://github.com/bitcoin/bitcoin/pull/20744#issuecomment-1022486215 Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>	2022-02-04 09:33:41 -05:00
Hennadii Stepanov	f47dda2c58	scripted-diff: Bump copyright headers -BEGIN VERIFY SCRIPT- ./contrib/devtools/copyright_header.py update ./ -END VERIFY SCRIPT- Commits of previous years: * 2020: `fa0074e2d8` * 2019: `aaaaad6ac9`	2021-12-30 19:36:57 +02:00
Jon Atack	da4e2f1da0	bench: various args improvements - use ALLOW_BOOL for -list arg instead of ALLOW_ANY - touch up `-asymptote=<n1,n2,n3...>` help - pack Args struct a bit more efficiently - handle args in alphabetical order	2021-09-21 14:45:49 +02:00
Martin Ankerl	d3c6f8bfa1	bench: introduce -min_time argument When it is not easily possible to stabilize the benchmark machine and code, the argument -min_time can be used to specify a minimum duration that a benchmark should take. E.g. choose -min_time=1000 if you are willing to wait about 1 second for each benchmark result. The default is now set to 10ms instead of 0, which should make runs on fast machines more stable with negligible slowdown.	2021-09-21 14:45:48 +02:00
Jon Atack	10f4ce2078	bench: bench.h fixes and improvements	2021-06-24 11:13:10 +02:00
Hennadii Stepanov	e99db77a6e	Drop boost/preprocessor dependencies	2021-02-01 22:30:06 +02:00
Martin Ankerl	78c312c983	Replace current benchmarking framework with nanobench

This replaces the current benchmarking framework with nanobench [1], an MIT licensed single-header benchmarking library, of which I am the author. This has in my opinion several advantages, especially on Linux:

* fast: Running all benchmarks takes ~6 seconds instead of 4m13s on an Intel i7-8700 CPU @ 3.20GHz.
* accurate: I ran e.g. the benchmark for SipHash_32b 10 times and calculated standard deviation / mean = coefficient of variation:
  * 0.57% CV for old benchmarking framework
  * 0.20% CV for nanobench

  So the benchmark results with nanobench seem to vary less than with the old framework.
* It automatically determines runtime based on clock precision, no need to specify number of evaluations.
* measure instructions, cycles, branches, instructions per cycle, branch misses (only Linux, when performance counters are available)
* output in markdown table format.
* Warn about unstable environment (frequency scaling, turbo, ...)
* For better profiling, it is possible to set the environment variable NANOBENCH_ENDLESS to force endless running of a particular benchmark without the need to recompile. This makes it possible to e.g. run "perf top" and look at hotspots.

Here is an example copy & pasted from the terminal output:

| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 2.52 | 396,529,415.94 | 0.6% | 25.42 | 8.02 | 3.169 | 0.06 | 0.0% | 0.03 | `bench/crypto_hash.cpp RIPEMD160`
| 1.87 | 535,161,444.83 | 0.3% | 21.36 | 5.95 | 3.589 | 0.06 | 0.0% | 0.02 | `bench/crypto_hash.cpp SHA1`
| 3.22 | 310,344,174.79 | 1.1% | 36.80 | 10.22 | 3.601 | 0.09 | 0.0% | 0.04 | `bench/crypto_hash.cpp SHA256`
| 2.01 | 496,375,796.23 | 0.0% | 18.72 | 6.43 | 2.911 | 0.01 | 1.0% | 0.00 | `bench/crypto_hash.cpp SHA256D64_1024`
| 7.23 | 138,263,519.35 | 0.1% | 82.66 | 23.11 | 3.577 | 1.63 | 0.1% | 0.00 | `bench/crypto_hash.cpp SHA256_32b`
| 3.04 | 328,780,166.40 | 0.3% | 35.82 | 9.69 | 3.696 | 0.03 | 0.0% | 0.03 | `bench/crypto_hash.cpp SHA512`

[1] https://github.com/martinus/nanobench

* Adds support for asymptotes

This adds support to calculate asymptotic complexity of a benchmark. This is similar to #17375, but currently only one asymptote is supported, and I have added support in the benchmark `ComplexMemPool` as an example. Usage is e.g. like this:

```
./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
```

This runs the benchmark `ComplexMemPool` several times but with different complexityN settings. The benchmark can extract that number and use it accordingly. Here, it's used for `childTxs`.

The output is this:

| complexityN | ns/op | op/s | err% | ins/op | cyc/op | IPC | total | benchmark
|------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
| 25 | 1,064,241.00 | 939.64 | 1.4% | 3,960,279.00 | 2,829,708.00 | 1.400 | 0.01 | `ComplexMemPool`
| 50 | 1,579,530.00 | 633.10 | 1.0% | 6,231,810.00 | 4,412,674.00 | 1.412 | 0.02 | `ComplexMemPool`
| 100 | 4,022,774.00 | 248.58 | 0.6% | 16,544,406.00 | 11,889,535.00 | 1.392 | 0.04 | `ComplexMemPool`
| 200 | 15,390,986.00 | 64.97 | 0.2% | 63,904,254.00 | 47,731,705.00 | 1.339 | 0.17 | `ComplexMemPool`
| 400 | 69,394,711.00 | 14.41 | 0.1% | 272,602,461.00 | 219,014,691.00 | 1.245 | 0.76 | `ComplexMemPool`
| 600 | 168,977,165.00 | 5.92 | 0.1% | 639,108,082.00 | 535,316,887.00 | 1.194 | 1.86 | `ComplexMemPool`
| 800 | 310,109,077.00 | 3.22 | 0.1% | 1,149,134,246.00 | 984,620,812.00 | 1.167 | 3.41 | `ComplexMemPool`

| coefficient | err% | complexity
|--------------:|-------:|------------
| 4.78486e-07 | 4.5% | O(n^2)
| 6.38557e-10 | 21.7% | O(n^3)
| 3.42338e-05 | 38.0% | O(n log n)
| 0.000313914 | 46.9% | O(n)
| 0.0129823 | 114.4% | O(log n)
| 0.0815055 | 133.8% | O(1)

The best fitting curve is O(n^2), so the algorithm seems to scale quadratically with `childTxs` in the range 25 to 800.	2020-06-13 12:24:18 +02:00
MarcoFalke	fab1170964	bench: Remove requirement that all benches use RegTestingSetup	2020-04-17 10:19:32 -04:00
MarcoFalke	fac5c37300	scripted-diff: Sort test includes -BEGIN VERIFY SCRIPT- # Mark all lines with #includes sed -i --regexp-extended -e 's/(#include <.*>)/\1 /g' $(git grep -l '#include' ./src/bench/ ./src/test ./src/wallet/test/) # Sort all marked lines git diff -U0 \| ./contrib/devtools/clang-format-diff.py -p1 -i -v -END VERIFY SCRIPT-	2020-04-16 13:32:36 -04:00
MarcoFalke	e09c701e01	scripted-diff: Bump copyright of files changed in 2020 -BEGIN VERIFY SCRIPT- ./contrib/devtools/copyright_header.py update ./ -END VERIFY SCRIPT-	2020-01-15 02:18:00 +07:00
MarcoFalke	17e14ac92f	Merge #17781: rpc: Remove mempool global from miner `faa92a2297` rpc: Remove mempool global from miner (MarcoFalke) `6666ef13f1` test: Properly document blockinfo size in miner_tests (MarcoFalke) Pull request description: The miner needs read-only access to the mempool. Instead of using the mutable global `::mempool`, keep an immutable reference to a mempool that is passed to the miner. Apart from the obvious benefits of removing a global and making things immutable, this might also simplify testing with multiple mempools. ACKs for top commit: promag: ACK `faa92a2297`. fjahr: ACK `faa92a2297` jnewbery: Code review ACK `faa92a2297` Tree-SHA512: c44027b5d2217a724791166f3f3112c45110ac1dbb37bdae27148a0657e0d1a1d043b0d24e49fd45465ec014224d1b7eb15c92a33069ad883fa8ffeadc24735b	2020-01-02 17:50:56 -05:00
MarcoFalke	aaaaad6ac9	scripted-diff: Bump copyright of files changed in 2019 -BEGIN VERIFY SCRIPT- ./contrib/devtools/copyright_header.py update ./ -END VERIFY SCRIPT-	2019-12-30 10:42:20 +13:00
MarcoFalke	faa92a2297	rpc: Remove mempool global from miner	2019-12-23 06:12:10 +07:00
practicalswift	084e17cebd	Remove unused includes	2019-10-15 22:56:43 +00:00
DrahtBot	eb7daf4d60	Update copyright headers to 2018	2018-07-27 07:15:02 -04:00
Daniel Kraft	60ebc7da4c	trivial: Mark overrides as such. This trivial change adds the "override" keyword to some methods of subclasses meant to override interface methods. This ensures that any future change to the interface's method signatures that is not correctly mirrored in the subclass will break at compile time with a clear error message, rather than fail at runtime (which is harder to debug).	2018-05-20 09:15:39 +02:00
Akira Takizawa	595a7bab23	Increment MIT Licence copyright header year on files modified in 2017	2018-01-03 02:26:56 +09:00
Martin Ankerl	00721e69f8	Improved microbenchmarking with multiple features. * inline performance critical code * Average runtime is specified and used to calculate iterations. * Console: show median of multiple runs * plot: show box plot * filter benchmarks * specify scaling factor * ignore src/test and src/bench in command line check script * number of iterations instead of time * Replaced runtime in BENCHMARK macro with number of iterations. * Added -? to bench_bitcoin * Benchmark plotly.js URL, width, height can be customized * Fixed incorrect precision warning	2017-12-23 11:03:17 +01:00
practicalswift	069215ebe2	Initialize recently introduced non-static class member lastCycles to zero in constructor lastCycles was introduced in `3532818746` which was merged into master yesterday. Also initialize beginCycles to zero for consistency and completeness.	2017-11-13 22:37:13 +01:00
Matt Corallo	620bae34cf	Require a steady clock for bench with at least micro precision	2017-11-09 14:36:11 -05:00
Cory Fields	24a0bddf4a	bench: prefer a steady clock if the resolution is no worse	2017-11-07 17:17:34 -05:00
Cory Fields	c515d266ec	bench: switch to std::chrono for time measurements std::chrono removes portability issues. Rather than storing doubles, store the untouched time_points. Then convert to nanoseconds for display. This allows for maximum precision, while keeping results comparable between differing hardware/operating systems. Also, display full nanosecond counts rather than sub-second floats.	2017-11-07 17:15:58 -05:00
Matt Corallo	0b1b9148cd	Remove countMaskInv caching in bench framework We were saving a div by caching the inverse as a float, but this ended up requiring a int -> float -> int conversion, which takes almost as much time as the difference between float mul and div. There are lots of other more pressing issues with the bench framework which probably require simply removing the adaptive iteration count stuff anyway.	2017-09-11 15:51:36 -04:00
practicalswift	1b936f5926	Replace boost::function with std::function (C++11)	2017-05-13 17:59:09 +02:00
practicalswift	dbf30ff10f	[trivial] Fix typos in comments	2017-03-21 19:49:08 +01:00
Wladimir J. van der Laan	29c53289a9	bench: Fix initialization order in registration The initialization order of global data structures in different implementation units is undefined. Making use of this is essentially gambling on what the linker does, the so-called [Static initialization order fiasco](https://isocpp.org/wiki/faq/ctors#static-init-order). In this case it apparently worked on Linux but failed on OpenBSD and FreeBSD. To create it on first use, make the registration structure local to a function. Fixes #8910.	2017-02-07 19:07:29 +01:00
isle2983	27765b6403	Increment MIT Licence copyright header year on files modified in 2016 Edited via: $ contrib/devtools/copyright_header.py update .	2016-12-31 11:01:21 -07:00
Wladimir J. van der Laan	3532818746	bench: Add support for measuring CPU cycles This adds cycle min/max/avg to the statistics. Supported on x86 and x86_64 (natively through rdtsc), as well as Linux (perf syscall).	2016-11-22 12:20:57 +01:00
Gregory Maxwell	63ff57db4b	Avoid integer division in the benchmark inner-most loop. Previously the benchmark code used an integer division (%) with a non-constant in the inner-loop. This is quite slow on many processors, especially ones like ARM that lack a hardware divide. Even on fairly recent x86_64 like Haswell an integer division can take something like 100 cycles, making it comparable to the runtime of siphash. This change avoids the division by using bitmasking instead. This was especially easy since the count was only increased by doubling. This change also restarts the timing when the execution time was very low; this avoids mintimes of zero in cases where one execution ends up below the timer resolution. It also reduces the impact of the overhead on the final result. The formatting of the prints is changed to not use scientific notation, to make it more machine readable (in particular, gnuplot croaks on the non-fixedpoint, and it doesn't sort correctly). This also hoists all the floating point divisions out of the semi-hot path because it was easy to do so. It might be prudent to break out the critical test into a macro just to guarantee that it gets inlined. It might also make sense to just save out the intermediate counts and times and get the floating point completely out of the timing loop (because e.g. on hardware without a fast hardware FPU like some ARM it will still be slow enough to distort the results). I haven't done either of these in this commit.	2016-05-30 22:07:56 +00:00
Philip Kaufmann	214de7e54c	[Trivial] ensure minimal header conventions - ensure header namespaces and end comments are correct - add missing header end comments - ensure minimal formatting (add newlines etc.)	2015-10-27 17:44:13 +01:00
Gavin Andresen	7072c544b5	Support very-fast-running benchmarks Avoid calling gettimeofday every time through the benchmarking loop, by keeping track of how long each loop takes and doubling the number of iterations done between time checks when they take less than 1/16'th of the total elapsed time.	2015-09-30 09:24:42 -04:00
Gavin Andresen	535ed9223d	Simple benchmarking framework Benchmarking framework, loosely based on google's micro-benchmarking library (https://github.com/google/benchmark) Why not use the Google Benchmark framework? Because adding Even More Dependencies isn't worth it. If we get a dozen or three benchmarks and need nanosecond-accurate timings of threaded code then switching to the full-blown Google Benchmark library should be considered. The benchmark framework is hard-coded to run each benchmark for one wall-clock second, and then spits out .csv-format timing information to stdout. It is left as an exercise for later (or maybe never) to add command-line arguments to specify which benchmark(s) to run, how long to run them for, how to format results, etc etc etc. Again, see the Google Benchmark framework for where that might end up. See src/bench/MilliSleep.cpp for a sanity-test benchmark that just benchmarks 'sleep 100 milliseconds.' To compile and run benchmarks: cd src; make bench Sample output: Benchmark,count,min,max,average Sleep100ms,10,0.101854,0.105059,0.103881	2015-09-30 09:24:42 -04:00

32 commits
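
Commit `63ff57db4b` in the log above replaces an integer modulo in the inner benchmark loop with a bitmask, which works because the check interval only ever grows by doubling and so stays a power of two. The following is a minimal sketch of that idea, not the code from that commit: the names `IsCheckpoint` and `CountClockChecks` are hypothetical, and the loop is simplified to double the interval at every check, whereas commit `7072c544b5` describes doubling only when a check takes less than 1/16th of the elapsed time.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the Bitcoin Core source): returns true when
// `count` is a multiple of `interval`. `interval` must be a power of two.
inline bool IsCheckpoint(std::uint64_t count, std::uint64_t interval)
{
    assert(interval != 0 && (interval & (interval - 1)) == 0);
    // For a power-of-two interval, count % interval == count & (interval - 1),
    // so the division can be replaced by a single AND.
    return (count & (interval - 1)) == 0;
}

// Hypothetical driver: counts how often the expensive clock check would run
// over `iterations` loop passes when the interval doubles after every check.
inline std::uint64_t CountClockChecks(std::uint64_t iterations)
{
    std::uint64_t interval = 1;
    std::uint64_t checks = 0;
    for (std::uint64_t count = 1; count <= iterations; ++count) {
        if (IsCheckpoint(count, interval)) {
            ++checks;
            interval *= 2; // doubling keeps the interval a power of two
        }
    }
    return checks;
}
```

Under this simplification the clock is consulted at iterations 1, 2, 4, 8 and so on, so the number of clock checks grows only logarithmically with the number of iterations.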