0582932260 test: add test for fast rescan using block filters (top-up detection) (Sebastian Falbesoner)
ca48a4694f rpc: doc: mention rescan speedup using `blockfilterindex=1` in affected wallet RPCs (Sebastian Falbesoner)
3449880b49 wallet: fast rescan: show log message for every non-skipped block (Sebastian Falbesoner)
935c6c4b23 wallet: take use of `FastWalletRescanFilter` (Sebastian Falbesoner)
70b3513904 wallet: add `FastWalletRescanFilter` class for speeding up rescans (Sebastian Falbesoner)
c051026586 wallet: add method for retrieving the end range for a ScriptPubKeyMan (Sebastian Falbesoner)
845279132b wallet: support fetching scriptPubKeys with minimum descriptor range index (Sebastian Falbesoner)
088e38d3bb add chain interface methods for using BIP 157 block filters (Sebastian Falbesoner)
Pull request description:
## Description
This PR is another take of using BIP 157 block filters (enabled by `-blockfilterindex=1`) for faster wallet rescans and is a modern revival of #15845. For reviewers new to this topic I can highly recommend to read the corresponding PR review club (https://bitcoincore.reviews/15845).
The basic idea is to skip blocks for deeper inspection (i.e. looking at every single tx for matches) if our block filter doesn't match any of the block's spent or created UTXOs are relevant for our wallet. Note that there can be false-positives (see https://bitcoincore.reviews/15845#l-199 for a PR review club discussion about false-positive rates), but no false-negatives, i.e. it is safe to skip blocks if the filter doesn't match; if the filter *does* match even though there are no wallet-relevant txs in the block, no harm is done, only a little more time is spent extra.
In contrast to #15845, this solution only supports descriptor wallets, which are way more widespread now than back in the time >3 years ago. With that approach, we don't have to ever derive the relevant scriptPubKeys ourselves from keys before populating the filter, and can instead shift the full responsibility to that to the `DescriptorScriptPubKeyMan` which already takes care of that automatically. Compared to legacy wallets, the `IsMine` logic for descriptor wallets is as trivial as checking if a scriptPubKey is included in the ScriptPubKeyMan's set of scriptPubKeys (`m_map_script_pub_keys`): e191fac4f3/src/wallet/scriptpubkeyman.cpp (L1703-L1710)
One of the unaddressed issues of #15845 was that [the filter was only created once outside the loop](https://github.com/bitcoin/bitcoin/pull/15845#discussion_r343265997) and as such didn't take into account possible top-ups that have happened. This is solved here by keeping a state of ranged `DescriptorScriptPubKeyMan`'s descriptor end ranges and check at each iteration whether that range has increased since last time. If yes, we update the filter with all scriptPubKeys that have been added since the last filter update with a range index equal or higher than the last end range. Note that finding new scriptPubKeys could be made more efficient than linearly iterating through the whole `m_script_pub_keys` map (e.g. by introducing a bidirectional map), but this would mean introducing additional complexity and state and it's probably not worth it at this time, considering that the performance gain is already significant.
Output scripts from non-ranged `DescriptorScriptPubKeyMan`s (i.e. ones with a fixed set of output scripts that is never extended) are added only once when the filter is created first.
## Benchmark results
Obviously, the speed-up indirectly correlates with the wallet tx frequency in the scanned range: the more blocks contain wallet-related transactions, the less blocks can be skipped due to block filter detection.
In a [simple benchmark](https://github.com/theStack/bitcoin/blob/fast_rescan_functional_test_benchmark/test/functional/pr25957_benchmark.py), a regtest chain with 1008 blocks (corresponding to 1 week) is mined with 20000 scriptPubKeys contained (25 txs * 800 outputs) each. The blocks each have a weight of ~2500000 WUs and hence are about 62.5% full. A global constant `WALLET_TX_BLOCK_FREQUENCY` defines how often wallet-related txs are included in a block. The created descriptor wallet (default setting of `keypool=1000`, we have 8*1000 = 8000 scriptPubKeys at the start) is backuped via the `backupwallet` RPC before the mining starts and imported via `restorewallet` RPC after. The measured time for taking this import process (which involves a rescan) once with block filters (`-blockfilterindex=1`) and once without block filters (`-blockfilterindex=0`) yield the relevant result numbers for the benchmark.
The following table lists the results, sorted from worst-case (all blocks contain wallte-relevant txs, 0% can be skipped) to best-case (no blocks contain walltet-relevant txs, 100% can be skipped) where the frequencies have been picked arbitrarily:
wallet-related tx frequency; 1 tx per... | ratio of irrelevant blocks | w/o filters | with filters | speed gain
--------------------------------------------|-----------------------------|-------------|--------------|-------------
~ 10 minutes (every block) | 0% | 56.806s | 63.554s | ~0.9x
~ 20 minutes (every 2nd block) | 50% (1/2) | 58.896s | 36.076s | ~1.6x
~ 30 minutes (every 3rd block) | 66.67% (2/3) | 56.781s | 25.430s | ~2.2x
~ 1 hour (every 6th block) | 83.33% (5/6) | 58.193s | 15.786s | ~3.7x
~ 6 hours (every 36th block) | 97.22% (35/36) | 57.500s | 6.935s | ~8.3x
~ 1 day (every 144th block) | 99.31% (143/144) | 68.881s | 6.107s | ~11.3x
(no txs) | 100% | 58.529s | 5.630s | ~10.4x
Since even the (rather unrealistic) worst-case scenario of having wallet-related txs in _every_ block of the rescan range obviously doesn't take significantly longer, I'd argue it's reasonable to always take advantage of block filters if they are available and there's no need to provide an option for the user.
Feedback about the general approach (but also about details like naming, where I struggled a lot) would be greatly appreciated. Thanks fly out to furszy for discussing this subject and patiently answering basic question about descriptor wallets!
ACKs for top commit:
achow101:
ACK 0582932260
Sjors:
re-utACK 0582932260
aureleoules:
ACK 0582932260 - minor changes, documentation and updated test since last review
w0xlt:
re-ACK 0582932260
Tree-SHA512: 3289ba6e4572726e915d19f3e8b251d12a4cec8c96d041589956c484b5575e3708b14f6e1e121b05fe98aff1c8724de4564a5a9123f876967d33343cbef242e1