The block index (CBlockTreeDB) is required to write and read blocks, so
move it to blockstorage. This allows to drop the txdb.h include from
`node/blockstorage.h`.
Can be reviewed with:
--color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
bee2d57a65 script: update flake8 to 6.1.0 (Jon Atack)
38c3fd846b test: python E721 updates (Jon Atack)
Pull request description:
Update our functional tests per [E721](https://www.flake8rules.com/rules/E721.html) enforced by [flake8 6.1.0](https://flake8.pycqa.org/en/latest/release-notes/6.1.0.html), and update our CI lint task to use that release. This makes the following linter output on current master with flake8 6.1.0 green.
```
$ ./test/lint/lint-python.py ; ./test/lint/lint-spelling.py
test/functional/p2p_invalid_locator.py:35:16: E721 do not compare types, for exact checks use `is` / `is not`, for instance checks use `isinstance()`
test/functional/test_framework/siphash.py:34:12: E721 do not compare types, for exact checks use `is` / `is not`, for instance checks use `isinstance()`
test/functional/test_framework/siphash.py:64:12: E721 do not compare types, for exact checks use `is` / `is not`, for instance checks use `isinstance()`
src/test/fuzz/descriptor_parse.cpp:88: occurences ==> occurrences
^ Warning: codespell identified likely spelling errors. Any false positives? Add them to the list of ignored words in test/lint/spelling.ignore-words.txt
```
ACKs for top commit:
MarcoFalke:
lgtm ACK bee2d57a65
Tree-SHA512: f3788a543ca98e44eeeba1d06c32f1b11eec95d4aef068aa1b6b5c401261adfa3fb6c6d6c769f3fe6839d78e74a310d5c926867e7c367d6513a53d580fd376f3
a733dd79e2 Remove unused function `reliesOnAssumedValid` (Suhas Daftuar)
d4a11abb19 Cache block index entry corresponding to assumeutxo snapshot base blockhash (Suhas Daftuar)
3556b85022 Move CheckBlockIndex() from Chainstate to ChainstateManager (Suhas Daftuar)
0ce805b632 Documentation improvements for assumeutxo (Ryan Ofsky)
768690b7ce Fix initialization of setBlockIndexCandidates when working with multiple chainstates (Suhas Daftuar)
d43a1f1a2f Tighten requirements for adding elements to setBlockIndexCandidates (Suhas Daftuar)
d0d40ea9a6 Move block-storage-related logic to ChainstateManager (Suhas Daftuar)
3cfc75366e test: Clear block index flags when testing snapshots (Suhas Daftuar)
272fbc370c Update CheckBlockIndex invariants for chains based on an assumeutxo snapshot (Suhas Daftuar)
10c05710ce Add wrapper for adding entries to a chainstate's block index candidates (Suhas Daftuar)
471da5f6e7 Move block-arrival information / preciousblock counters to ChainstateManager (Suhas Daftuar)
1cfc887d00 Remove CChain dependency in node/blockstorage (Suhas Daftuar)
fe86a7cd48 Explicitly track maximum block height stored in undo files (Suhas Daftuar)
Pull request description:
This PR proposes a clean up of the relationship between block storage and the chainstate objects, by moving the decision of whether to store a block on disk to something that is not chainstate-specific. Philosophically, the decision of whether to store a block on disk is related to validation rules that do not require any UTXO state; for anti-DoS reasons we were using some chainstate-specific heuristics, and those have been reworked here to achieve the proposed separation.
This PR also fixes a bug in how a chainstate's `setBlockIndexCandidates` was being initialized; it should always have all the HAVE_DATA block index entries that have more work than the chain tip. During startup, we were not fully populating `setBlockIndexCandidates` in some scenarios involving multiple chainstates.
Further, this PR establishes a concept that whenever we have 2 chainstates, that we always know the snapshotted chain's base block and the base block's hash must be an element of our block index. Given that, we can establish a new invariant that the background validation chainstate only needs to consider blocks leading to that snapshotted block entry as potential candidates for its tip. As a followup I would imagine that when writing net_processing logic to download blocks for the background chainstate, that we would use this concept to only download blocks towards the snapshotted entry as well.
ACKs for top commit:
achow101:
ACK a733dd79e2
jamesob:
reACK a733dd79e2 ([`jamesob/ackr/27746.5.sdaftuar.rework_validation_logic`](https://github.com/jamesob/bitcoin/tree/ackr/27746.5.sdaftuar.rework_validation_logic))
Sjors:
Code review ACK a733dd79e2.
ryanofsky:
Code review ACK a733dd79e2. Just suggested changes since the last review. There are various small things that could be followed up on, but I think this is ready for merge.
Tree-SHA512: 9ec17746f22b9c27082743ee581b8adceb2bd322fceafa507b428bdcc3ffb8b4c6601fc61cc7bb1161f890c3d38503e8b49474da7b5ab1b1f38bda7aa8668675
ecfe507e07 fuzz: use `ConnmanTestMsg` in `connman` (brunoerg)
Pull request description:
Fixes #27980
Using `ConnmanTestMsg` we can add nodes and be
more effective fuzzing functions like `DisconnectNode`,
`FindNode`, `GetNodeStats` and other ones.
ACKs for top commit:
MarcoFalke:
review ACK ecfe507e07
dergoegge:
utACK ecfe507e07
Tree-SHA512: 97c363b422809f2e9755c082d1102237347abfab72c7baca417bd8975f8a595ddf3a085f8353dbdb9f17fb98fbfe830792bfc0b83451168458018faf6c239efa
fabef121b0 refactor: Use EnsureAnyNodeContext (MarcoFalke)
fa1640617e test: Add SyncWithValidationInterfaceQueue to mockscheduler RPC (MarcoFalke)
Pull request description:
There should be no risk or downside in adding a call to `SyncWithValidationInterfaceQueue` here. In fact, it will make tests less brittle. For example,
* If one sets the timeouts in `test/functional/feature_fee_estimation.py` to `0`, on `master` the test will fail and here it will pass.
* It may avoid a rare (theoretic) intermittent issue in https://github.com/bitcoin/bitcoin/pull/28108/files#r1268966663
ACKs for top commit:
TheCharlatan:
ACK fabef121b0
furszy:
Code review ACK fabef121. Convinced by checking all current tests usages.
Tree-SHA512: c9e9a536a8721d1b3f267a66b40578b34948892301affdcad121ef8e02bf17037305d0dd53aa94b1b064753e66f9cfb31823b916b707a9d812627f502b818003
06199a995f refactor: Revert addition of univalue sighash string check (TheCharlatan)
0b47c16215 doc: Correct release-notes for sighashtype exceptions (TheCharlatan)
Pull request description:
This is a follow up for #28113.
The string type check is already done by the rpc parser / RPCHelpMan. Re-doing it is adding dead code. Instead, throwing an exception when the assumption does not hold is the already correct behavior. Pointed out in this [comment](https://github.com/bitcoin/bitcoin/pull/28113/files#r1274568557).
Also correct the release note for the correct sighashtype exception change. There is no change in the handling of non-string sighashtype arugments. Pointed out in this [comment](https://github.com/bitcoin/bitcoin/pull/28113/files#r1274567555).
ACKs for top commit:
MarcoFalke:
lgtm ACK 06199a995f
jonatack:
Tested ACK 06199a995f
stickies-v:
ACK 06199a995f
Tree-SHA512: 3faa6b3d2247624c0973df8d79c09fbf1f90ffb99f1be484e359b528f485c31affea45976759bd206e4c81cbb54ebba5ad0ef4127d1deacbfe2a58153fcc94ee
fa940f41ea Remove unused raw-pointer read helper from univalue (MarcoFalke)
Pull request description:
The helpers are unused outside of tests and redundant with the existing `bool read(std::string_view raw);`.
Fix both issues by removing them.
Also, simplify the tests code by removing a `std::string` constructor where possible.
ACKs for top commit:
stickies-v:
utACK fa940f41ea
TheCharlatan:
tACK fa940f41ea
Tree-SHA512: 60c154c1046f01551335af79bf820a6104844f63e89977271b4336b3cd06ac3bab1379e18b7bc61d12bef7446029e91c16541ddecf9e88bc8bc897fc1f6ee2c8
131314b62e fuzz: increase coverage of the descriptor targets (Antoine Poinsot)
90a24741e7 fuzz: add a new, more efficient, descriptor parsing target (Antoine Poinsot)
d60229ede5 fuzz: make the parsed descriptor testing into a function (Antoine Poinsot)
Pull request description:
The current descriptor parsing fuzz target requires valid public or private keys to be provided. This is unnecessary as we are only interested in fuzzing the descriptor parsing logic here (other targets are focused on fuzzing keys serializations). And it's pretty inefficient, especially for formats that need a checksum (`xpub`, `xprv`, WIF).
This introduces a new target that mocks the keys as an index in a list of precomputed keys. Keys are represented as 2 hex characters in the descriptor. The key type (private, public, extended, ..) is deterministically based on this one-byte value. Keys are deterministically generated at target initialization. This is much more efficient and also largely reduces the size of the seeds.
TL;DR: for instance instead of requiring the fuzzer to generate a `pk(xpub6DdBu7pBoyf7RjnUVhg8y6LFCfca2QAGJ39FcsgXM52Pg7eejUHLBJn4gNMey5dacyt4AjvKzdTQiuLfRdK8rSzyqZPJmNAcYZ9kVVEz4kj)` to parse a valid descriptor, it just needs to generate a `pk(03)`.
Note we only mock the keys themselves, not the entire descriptor key expression. As we want to fuzz the real code that parses the rest of the key expression (origin, derivation paths, ..).
This is a target i used for reviewing #17190 and #27255, and figured it was worth PR'ing on its own since the added complexity for mocking the keys is minimal and it could help prevent introducing bugs to the descriptor parsing logic much more efficiently.
ACKs for top commit:
MarcoFalke:
re-ACK 131314b62e🐓
achow101:
ACK 131314b62e
Tree-SHA512: 485a8d6a0f31a3a132df94dc57f97bdd81583d63507510debaac6a41dbbb42fa83c704ff3f2bd0b78c8673c583157c9a3efd79410e5e79511859e1470e629118
This check is already done by the rpc parser. Re-doing it is adding dead
code. Instead, throwing an exception when the assumption does not hold
is the already correct behavior.
To make the fuzz test more accurate and not swallow all runtime errors,
add a check that the passed in UniValue sighash argument is either a
string or null.
Co-authored-by: stickies-v <stickies-v@protonmail.com>
fa9108f85a refactor: Use reinterpret_cast where appropriate (MarcoFalke)
3333f950d4 refactor: Avoid casting away constness (MarcoFalke)
fa6394dd10 refactor: Remove unused C-style casts (MarcoFalke)
Pull request description:
Using a C-style cast to convert pointer types to a byte-like pointer type has many issues:
* It may accidentally and silently throw away `const`.
* It forces reviewers to check that it doesn't accidentally throw away `const`.
For example, on current master a `const char*` is cast to `unsigned char*` (without `const`), see d23fda0584/src/span.h (L273) . This can lead to UB, and the only reason why it didn't lead to UB is because the return type added back the `const`. (Obviously this would break if the return type was deduced via `auto`)
Fix all issues by adding back the `const` and using `reinterpret_cast` where appropriate.
ACKs for top commit:
darosior:
re-utACK fa9108f85a
hebasto:
re-ACK fa9108f85a.
john-moffett:
ACK fa9108f85a
Tree-SHA512: 87f6e4b574f9bd96d4e0f2a0631fd0a9dc6096e5d4f1b95042fe9f197afc2fe9a24e333aeb34fed11feefcdb184a238fe1ea5aff10d580bb18d76bfe48b76a10
faca9a3d5a test: Avoid intermittent issues due to async events in validationinterface_tests (MarcoFalke)
Pull request description:
Currently the tests have many issues:
* They setup the genesis block, even though it is not needed
* They queue an async `UpdatedBlockTip` even, which causes intermittent issues: https://github.com/bitcoin/bitcoin/issues/28146#issuecomment-1650064645
Fix all issues by trimming down the setup to just `ChainTestingSetup`.
ACKs for top commit:
Crypt-iQ:
tACK faca9a3d5a
Tree-SHA512: 4449040330f89bbaf5ce5b2052417c160b451c373987fdf1069596c07834ed81f0aea1506d53c7d2cd21062b27332d30679285dae194b272fd0cb9ce5ded32cf
07c59eda00 Don't derive secure_allocator from std::allocator (Casey Carter)
Pull request description:
Giving the C++ Standard Committee control of the public interface of your type means they will break it. C++23 adds a new `allocate_at_least` member to `std::allocator`. Very bad things happen when, say, `std::vector` uses `allocate_at_least` from `secure_allocator`'s base to allocate memory which it then tries to free with `secure_allocator::deallocate`.
(Discovered by microsoft/STL#3712, which will be reverted by microsoft/STL#3819 before it ships.)
ACKs for top commit:
jonatack:
re-ACK 07c59eda00 no change since my previous ACK apart from squashing the commits
achow101:
ACK 07c59eda00
john-moffett:
ACK 07c59eda00 Reviewed and tested. Performance appears unaffected in my environment.
Tree-SHA512: 23606c40414d325f5605a9244d4dd50907fdf5f2fbf70f336accb3a2cb98baa8acd2972f46eab1b7fdec1d28a843a96b06083cd2d09791cda7c90ee218e5bbd5
6960c81cbf kernel: Remove Univalue from kernel library (TheCharlatan)
10eb3a9faa kernel: Split ParseSighashString (TheCharlatan)
Pull request description:
Besides the build system changes, this is a mostly move-only change for moving the few UniValue-related functions out of kernel files.
UniValue is not required by any of the kernel components and a JSON library should not need to be part of a consensus library.
ACKs for top commit:
achow101:
ACK 6960c81cbf
theuni:
Re-ACK 6960c81cbf
stickies-v:
re-ACK 6960c81cbf
Tree-SHA512: d92e4cb4e12134c94b517751bd746d39f9b8da528ec3a1c94aaedcce93274a3bae9277832e8a7c0243c13df0397ca70ae7bbb24ede200018c569f8d81103c1da
Affects both secure_allocator and zero_after_free_allocator.
Giving the C++ Standard Committee control of the public interface of your type means they will break it. C++23 adds a new `allocate_at_least` member to `std::allocator`. Very bad things happen when, say, `std::vector` uses `allocate_at_least` from `secure_allocator`'s base to allocate memory which it then tries to free with `secure_allocator::deallocate`.
Drive-by: Aggressively remove facilities unnecessary since C++11 from both allocators to keep things simple.
This is to (a) avoid repeated lookups into the block index for an entry that
should never change and (b) emphasize that the snapshot base should always
exist when set and not change during the runtime of the program.
Thanks to Russ Yanofsky for suggesting this approach.
Also rewrite CheckBlockIndex() to perform tests on all chainstates.
This increases sanity-check coverage, as any place in our code where we were
invoke CheckBlockIndex() on a single chainstate will now invoke the sanity
checks on all chainstates.
This change also tightens up the checks on setBlockIndexCandidates and
mapBlocksUnlinked, to more precisely match what we aim for even in the presence
of assumed-valid blocks.
When using assumeutxo and multiple chainstates are active, the background
chainstate should consider all HAVE_DATA blocks that are ancestors of the
snapshotted block and that have more work than the tip as potential candidates.
When using assumeutxo, we only need the background chainstate to consider
blocks that are on the chain leading to the snapshotted block.
Note that this introduces the new invariant that we can only have an assumeutxo
snapshot where the snapshotted blockhash is in our block index. Unknown block
hashes that are somehow passed in will cause assertion failures when processing
new blocks.
Includes test fixes and improvements by Andrew Chow and Fabian Jahr.
Once a descriptor is successfully parsed, execute more of its methods.
There is probably still room for improvements by checking for some
invariants, but this is a low hanging fruit that significantly increases
the code coverage of these targets.
This new target focuses on fuzzing the actual descriptor parsing logic
by not requiring the fuzzer to produce valid keys (nor a valid checksum
for that matter).
This should make it much more efficient to find bugs we could introduce
moving forward.
Using a character as a marker (here '%') to be able to search and
replace in the string without having to mock the actual descriptor
parsing logic was an insight from Pieter Wuille.
Separate the notion of which blocks are stored on disk, and what data is in our
block index, from what tip a chainstate might be able to get to. We can use
chainstate-agnostic data to determine when to store a block on disk (primarily,
an anti-DoS set of criteria) and let the chainstates figure out for themselves
when a block is of interest for being a candidate tip.
Note: some of the invariants in CheckBlockIndex are modified, but more work is
needed (ie to move CheckBlockIndex to ChainstateManager, as most of what
CheckBlockIndex is doing is checking the consistency of the block index, which
is outside of Chainstate).
node_context is never null, but if it was, it would lead to a nullptr
dereference in node_context->scheduler. Just use EnsureAnyNodeContext
everywhere for more robust, consistent, and correct code.
dd9633b516 test: wallet, add coverage for watch-only raw sh script migration (furszy)
cc781a2180 descriptor: InferScript, do not return top-level only func as sub descriptor (furszy)
286e0c7d5e wallet: loading, log descriptor parsing error details (furszy)
Pull request description:
Linked to #28057.
Currently, the `InferScript` function returns an invalid descriptor when it tries to infer a p2sh-p2pkh script whose pubkey is not known by the wallet.
This behavior occurs because the inference process bypasses the `pkh` subscript when the pubkey is not contained by the wallet (no pubkey provider), interpreting it as a `sh(addr(ADDR))` descriptor. Then, the failure arises because the `addr()` function is restricted to being used only at the top level.
For reviewers, would recommend to start by examining the functional test to understand the context and the circumstances on which this can result in a fatal error (e.g. during the migration process).
ACKs for top commit:
achow101:
ACK dd9633b516
darosior:
utACK dd9633b516
Tree-SHA512: 61e763206c604c372019d2c36e31684f3dddf81f8b154eb9aba5cd66d8d61bda457ed4e591613eb6ce6c76cf7c3f11764abc6cd727a7c2b6414f1065783be032
e.g. sh(addr(ADDR)) or sh(raw(HEX)) are invalid descriptors.
Making sh and wsh top level functions to return addr/raw descriptors when
the subscript inference fails.