7379a54ec4 bench: Remove incorrect LoadWallet call in WalletBalance (Andrew Chow)
846b2fe67e tests: Move ADDRESS_BCRT1_UNSPENDABLE to wallet/test/util.h (Andrew Chow)
c61d3f02f5 tests, bench: Consolidate {Test,Bench}Un/LoadWallet helper (Andrew Chow)
Pull request description:
I have a few PRs and branches that use these two commits, probably makes sense to split them into a separate PR to be merged sooner.
The first commit contains some things that end up being commonly used in new wallet benchmarks. These are moved into `wallet_common.{h/cpp}`.
The second commit contains a bugfix for the wallet_balance benchmark where it calls `LoadWallet` in the wrong place. It's unnecessary to call that function in this benchmark. Although this does not cause any issues currently, it ends up causing issues in some PRs and branches that I'm working on.
ACKs for top commit:
Sjors:
utACK 7379a54ec4
furszy:
ACK 7379a54
Tree-SHA512: 47773887a16c69ac7121c699d3446a8c399bd792a6a31714998b7b7a19fea179c6d3b29cb898b04397b2962c1b4120d57009352b8460b8283e188d4cb480c9ba
The WalletBalance benchmarks would incorrectly call LoadWallet after the
wallet has been setup. LoadWallet expects to be the first thing that is
called and for the CWallet to be in a fresh state. When it is not, it
results in bogus pointers which can cause segfaults during this
benchmark.
The wallet tests and benchmarks both had helper functions for loading
and unloading the wallet for the test that were almost identical.
These functions are consolidated and reused.
To avoid wasting processing power, we can skip blocks that occurred
before the wallet's creation time, since these blocks are guaranteed
not to contain any relevant wallet data.
This has direct implications (an speed improvement) on the underlying
blockchain synchronization process as well.
The reason is that the validation interface queue is limited to
10 tasks per time. This means that no more than 10 blocks can be
waiting for the wallet(s) to be processed while we are synchronizing
the chain (activating the best chain to be more precise).
Which can be a bottleneck if blocks arrive and are processed faster
from the network than what they are processed by the wallet(s).
Since we have a mockable wallet database, we don't really need to be
using BDB or SQLite's in-memory database capabilities. It doesn't really
help us to be using those as we aren't doing anything that requires one
type of db over the other, and will just prefer SQLite if it's
available.
MockableDatabase is suitable for these uses, so use
CreateMockableWalletDatabase to use that.
-BEGIN VERIFY SCRIPT-
sed -i "s/CreateMockWalletDatabase(options)/CreateMockableWalletDatabase()/" $(git grep -l "CreateMockWalletDatabase(options)" -- ":(exclude)src/wallet/walletdb.*")
sed -i "s/CreateMockWalletDatabase/CreateMockableWalletDatabase/" $(git grep -l "CreateMockWalletDatabase" -- ":(exclude)src/wallet/walletdb.*")
-END VERIFY SCRIPT-
files share the same purpose, and we shouldn't have wallet code
inside the test directory.
This later is needed to use wallet util functions in the bench
and test binaries without be forced to duplicate them.
Followup to commit "MOVEONLY: CWallet transaction code out of
wallet.cpp/.h" that detaches and renames some CWalletTx methods, making
into them into standalone functions or CWallet methods instead.
There are no changes in behavior and no code changes that aren't purely
mechanical. It just gives spend and receive functions more consistent
names and removes the circular dependencies added by the earlier
MOVEONLY commit.
There are also no comment or documentation changes. Removed comments
from transaction.h are just migrated to spend.h, receive.h, and
wallet.h.
This removes a source of complexity and indirection that makes it harder to
understand path checking code. Path checks will be simplified in upcoming
commits.
There is no change in behavior in this commit other than a slightly more
descriptive error message in `loadwallet` if the default "" wallet can't be
found. (The error message is improved more in upcoming commit "wallet: Remove
path checking code from loadwallet RPC".)
78c312c983 Replace current benchmarking framework with nanobench (Martin Ankerl)
Pull request description:
Replace current benchmarking framework with nanobench
This replaces the current benchmarking framework with nanobench [1], an
MIT licensed single-header benchmarking library, of which I am the
autor. This has in my opinion several advantages, especially on Linux:
* fast: Running all benchmarks takes ~6 seconds instead of 4m13s on
an Intel i7-8700 CPU @ 3.20GHz.
* accurate: I ran e.g. the benchmark for SipHash_32b 10 times and
calculate standard deviation / mean = coefficient of variation:
* 0.57% CV for old benchmarking framework
* 0.20% CV for nanobench
So the benchmark results with nanobench seem to vary less than with
the old framework.
* It automatically determines runtime based on clock precision, no need
to specify number of evaluations.
* measure instructions, cycles, branches, instructions per cycle,
branch misses (only Linux, when performance counters are available)
* output in markdown table format.
* Warn about unstable environment (frequency scaling, turbo, ...)
* For better profiling, it is possible to set the environment variable
NANOBENCH_ENDLESS to force endless running of a particular benchmark
without the need to recompile. This makes it to e.g. run "perf top"
and look at hotspots.
Here is an example copy & pasted from the terminal output:
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 2.52 | 396,529,415.94 | 0.6% | 25.42 | 8.02 | 3.169 | 0.06 | 0.0% | 0.03 | `bench/crypto_hash.cpp RIPEMD160`
| 1.87 | 535,161,444.83 | 0.3% | 21.36 | 5.95 | 3.589 | 0.06 | 0.0% | 0.02 | `bench/crypto_hash.cpp SHA1`
| 3.22 | 310,344,174.79 | 1.1% | 36.80 | 10.22 | 3.601 | 0.09 | 0.0% | 0.04 | `bench/crypto_hash.cpp SHA256`
| 2.01 | 496,375,796.23 | 0.0% | 18.72 | 6.43 | 2.911 | 0.01 | 1.0% | 0.00 | `bench/crypto_hash.cpp SHA256D64_1024`
| 7.23 | 138,263,519.35 | 0.1% | 82.66 | 23.11 | 3.577 | 1.63 | 0.1% | 0.00 | `bench/crypto_hash.cpp SHA256_32b`
| 3.04 | 328,780,166.40 | 0.3% | 35.82 | 9.69 | 3.696 | 0.03 | 0.0% | 0.03 | `bench/crypto_hash.cpp SHA512`
[1] https://github.com/martinus/nanobench
ACKs for top commit:
laanwj:
ACK 78c312c983
Tree-SHA512: 9e18770b18b6f95a7d0105a4a5497d31cf4eb5efe6574f4482f6f1b4c88d7e0946b9a4a1e9e8e6ecbf41a3f2d7571240677dcb45af29a6f0584e89b25f32e49e
This replaces the current benchmarking framework with nanobench [1], an
MIT licensed single-header benchmarking library, of which I am the
autor. This has in my opinion several advantages, especially on Linux:
* fast: Running all benchmarks takes ~6 seconds instead of 4m13s on
an Intel i7-8700 CPU @ 3.20GHz.
* accurate: I ran e.g. the benchmark for SipHash_32b 10 times and
calculate standard deviation / mean = coefficient of variation:
* 0.57% CV for old benchmarking framework
* 0.20% CV for nanobench
So the benchmark results with nanobench seem to vary less than with
the old framework.
* It automatically determines runtime based on clock precision, no need
to specify number of evaluations.
* measure instructions, cycles, branches, instructions per cycle,
branch misses (only Linux, when performance counters are available)
* output in markdown table format.
* Warn about unstable environment (frequency scaling, turbo, ...)
* For better profiling, it is possible to set the environment variable
NANOBENCH_ENDLESS to force endless running of a particular benchmark
without the need to recompile. This makes it to e.g. run "perf top"
and look at hotspots.
Here is an example copy & pasted from the terminal output:
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 2.52 | 396,529,415.94 | 0.6% | 25.42 | 8.02 | 3.169 | 0.06 | 0.0% | 0.03 | `bench/crypto_hash.cpp RIPEMD160`
| 1.87 | 535,161,444.83 | 0.3% | 21.36 | 5.95 | 3.589 | 0.06 | 0.0% | 0.02 | `bench/crypto_hash.cpp SHA1`
| 3.22 | 310,344,174.79 | 1.1% | 36.80 | 10.22 | 3.601 | 0.09 | 0.0% | 0.04 | `bench/crypto_hash.cpp SHA256`
| 2.01 | 496,375,796.23 | 0.0% | 18.72 | 6.43 | 2.911 | 0.01 | 1.0% | 0.00 | `bench/crypto_hash.cpp SHA256D64_1024`
| 7.23 | 138,263,519.35 | 0.1% | 82.66 | 23.11 | 3.577 | 1.63 | 0.1% | 0.00 | `bench/crypto_hash.cpp SHA256_32b`
| 3.04 | 328,780,166.40 | 0.3% | 35.82 | 9.69 | 3.696 | 0.03 | 0.0% | 0.03 | `bench/crypto_hash.cpp SHA512`
[1] https://github.com/martinus/nanobench
* Adds support for asymptotes
This adds support to calculate asymptotic complexity of a benchmark.
This is similar to #17375, but currently only one asymptote is
supported, and I have added support in the benchmark `ComplexMemPool`
as an example.
Usage is e.g. like this:
```
./bench_bitcoin -filter=ComplexMemPool -asymptote=25,50,100,200,400,600,800
```
This runs the benchmark `ComplexMemPool` several times but with
different complexityN settings. The benchmark can extract that number
and use it accordingly. Here, it's used for `childTxs`. The output is
this:
| complexityN | ns/op | op/s | err% | ins/op | cyc/op | IPC | total | benchmark
|------------:|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|----------:|:----------
| 25 | 1,064,241.00 | 939.64 | 1.4% | 3,960,279.00 | 2,829,708.00 | 1.400 | 0.01 | `ComplexMemPool`
| 50 | 1,579,530.00 | 633.10 | 1.0% | 6,231,810.00 | 4,412,674.00 | 1.412 | 0.02 | `ComplexMemPool`
| 100 | 4,022,774.00 | 248.58 | 0.6% | 16,544,406.00 | 11,889,535.00 | 1.392 | 0.04 | `ComplexMemPool`
| 200 | 15,390,986.00 | 64.97 | 0.2% | 63,904,254.00 | 47,731,705.00 | 1.339 | 0.17 | `ComplexMemPool`
| 400 | 69,394,711.00 | 14.41 | 0.1% | 272,602,461.00 | 219,014,691.00 | 1.245 | 0.76 | `ComplexMemPool`
| 600 | 168,977,165.00 | 5.92 | 0.1% | 639,108,082.00 | 535,316,887.00 | 1.194 | 1.86 | `ComplexMemPool`
| 800 | 310,109,077.00 | 3.22 | 0.1% |1,149,134,246.00 | 984,620,812.00 | 1.167 | 3.41 | `ComplexMemPool`
| coefficient | err% | complexity
|--------------:|-------:|------------
| 4.78486e-07 | 4.5% | O(n^2)
| 6.38557e-10 | 21.7% | O(n^3)
| 3.42338e-05 | 38.0% | O(n log n)
| 0.000313914 | 46.9% | O(n)
| 0.0129823 | 114.4% | O(log n)
| 0.0815055 | 133.8% | O(1)
The best fitting curve is O(n^2), so the algorithm seems to scale
quadratic with `childTxs` in the range 25 to 800.
Currently it's possible for ReleaseWallet to delete the CWallet pointer while
it is processing BlockConnected, etc chain notifications.
To fix this, unregister from notifications earlier in UnloadWallet instead of
ReleaseWallet, and use a new RegisterSharedValidationInterface function to
prevent the CValidationInterface shared_ptr from being deleted until the last
notification is actually finished.