d8b12a75db rpc: Allow named and positional arguments to be used together (Ryan Ofsky)
Pull request description:
It's nice to be able to use named options and positional arguments together.
Most shell tools accept both, and python functions combine options and arguments allowing them to be passed with even more flexibility. This change adds support for python's approach so as a motivating example:
```sh
bitcoin-cli -named createwallet wallet_name=mywallet load_on_startup=1
```
Can be shortened to:
```sh
bitcoin-cli -named createwallet mywallet load_on_startup=1
```
JSON-RPC standard doesn't have a convention for passing named and positional parameters together, so this implementation makes one up and interprets any unused `"args"` named parameter as a positional parameter array.
This change is backwards compatible. It doesn't change the interpretation of any previously valid calls, just treats some previously invalid calls as valid.
Another use case even if you only occasionally use named arguments is that you can define an alias:
```
alias bcli='bitcoin-cli -named'
```
And now use both named named and unnamed arguments from the same alias without having to manually add `-named` option for named arguments or see annoying error "No '=' in named argument... this needs to be present for every argument (even if it is empty)`" for unnamed arguments
ACKs for top commit:
achow101:
ACK d8b12a75db
stickies-v:
re-ACK d8b12a75d
aureleoules:
re-ACK d8b12a75db
Tree-SHA512: 0cff8b50f584bcbbd376624adccf40536566ed8d1bcd6c88ad565dbc208f19d5e7a48c994efd6329d42b560149340d330397278f08a2912af5f3418d8c8837a9
7082ce3e88 scripted-diff: rename and de-globalise g_cs_orphans (Anthony Towns)
733d85f79c Move all g_cs_orphans locking to txorphanage (Anthony Towns)
a936f41a5d txorphanage: make m_peer_work_set private (Anthony Towns)
3614819864 txorphange: move orphan workset to txorphanage (Anthony Towns)
6f8e442ba6 net_processing: Localise orphan_work_set handling to ProcessOrphanTx (Anthony Towns)
0027174b39 net_processing: move ProcessOrphanTx docs to declaration (Anthony Towns)
9910ed755c net_processing: Pass a Peer& to ProcessOrphanTx (Anthony Towns)
89e2e0da0b net_processing: move extra transactions to msgproc mutex (Anthony Towns)
ff8d44d196 Remove unnecessary includes of txorphange.h (Anthony Towns)
Pull request description:
Moves extra transactions to be under the `m_msgproc_mutex` lock rather than `g_cs_orphans` and refactors orphan handling so that the lock can be internal to the `TxOrphange` class.
ACKs for top commit:
dergoegge:
Code review ACK 7082ce3e88
glozow:
ACK 7082ce3e88 via code review and some [basic testing](https://github.com/glozow/bitcoin/blob/review-26295/src/test/orphanage_tests.cpp#L150). I think putting txorphanage in charge of handling peer work sets is the right direction.
Tree-SHA512: 1ec454c3a69ebd45ff652770d6a55c6b183db71aba4d12639ed70f525f0035e069a81d06e9b65b66e87929c607080a1c5e5dcd2ca91eaa2cf202dc6c02aa6818
b89530483d util: move threadinterrupt into util (fanquake)
Pull request description:
Alongside thread and threadnames. It's part of libbitcoin_util.
ACKs for top commit:
ryanofsky:
Code review ACK b89530483d. No changes since last review other than rebase
theuni:
ACK b89530483d.
Tree-SHA512: 0421f4d1881ec295272446804b27d16bf63e6b62b272f8bb52bfecde9ae6605e8109ed16294690d3e3ce4b15cc5e7c4046f99442df73adb10bdf069d3fb165aa
fa2d01470a test: Use type-safe NodeSeconds for TestMemPoolEntryHelper (MacroFake)
Pull request description:
test-only refactor to drop the deprecated `GetTime` in favour of the type-safe alternative
ACKs for top commit:
aureleoules:
ACK fa2d01470a - verified that there is no behavior change
Tree-SHA512: 5b64dae19c7bba9e8d90377c85891bc86f60ffbe67ea28d5ed3bd38f6dc30d3fbfba00bf49a16792922bddf83a52c632b6e5e5d8ffe1619fd9bf63effc60d59a
files share the same purpose, and we shouldn't have wallet code
inside the test directory.
This later is needed to use wallet util functions in the bench
and test binaries without be forced to duplicate them.
0eeb9b0442 [fuzz] Move ConsumeNetAddr to fuzz/util/net.h (dergoegge)
291c8697d4 [fuzz] Make ConsumeNetAddr produce valid onion addresses (dergoegge)
c9ba3f836e [netaddress] Make OnionToString public (dergoegge)
Pull request description:
The chance that the fuzzer is able to guess a valid onion address is probably slim, as they are Base32 encoded and include a checksum. Right now, any target using `ConsumeNetAddr` would have a hard time uncovering bugs that require valid onion addresses as input.
This PR makes `ConsumeNetAddr` produce valid onion addresses by using the 32 bytes given by the fuzzer as the pubkey for the onion address and forming a valid address according to the torv3 spec.
ACKs for top commit:
vasild:
ACK 0eeb9b0442
brunoerg:
ACK 0eeb9b0442
Tree-SHA512: 7c687a4d12f9659559be8f0c3cd4265167d1261d419cfd3d503fd7c7f207cc0db745220f02fb1737e4a5700ea7429311cfc0b42e6c15968ce6a85f8813c7e1d8
c8dc0e3eaa refactor: Inline `CTxMemPoolEntry` class's functions (Hennadii Stepanov)
75bbe594e5 refactor: Move `CTxMemPoolEntry` class to its own module (Hennadii Stepanov)
Pull request description:
This PR:
- gets rid of the `policy/fees` -> `txmempool` -> `policy/fees` circular dependency
- is an alternative to #13949, which nukes only one circular dependency
ACKs for top commit:
ryanofsky:
Code review ACK c8dc0e3eaa. Just include and whitespace changes since last review, and there's a moveonly commit now so it's very easy to review
theStack:
Code-review ACK c8dc0e3eaa
glozow:
utACK c8dc0e3eaa, agree these changes are an improvement.
Tree-SHA512: 36ece824e6ed3ab1a1e198b30a906c8ac12de24545f840eb046958a17315ac9260c7de26e11e2fbab7208adc3d74918db7a7e389444130f8810548ca2e81af41
db929893ef Faster -reindex by initially deserializing only headers (Larry Ruane)
c72de9990a util: add CBufferedFile::SkipTo() to move ahead in the stream (Larry Ruane)
48a68908ba Add LoadExternalBlockFile() benchmark (Larry Ruane)
Pull request description:
### Background
During the first part of reindexing, `LoadExternalBlockFile()` sequentially reads raw blocks from the `blocks/blk00nnn.dat` files (rather than receiving them from peers, as with initial block download) and eventually adds all of them to the block index. When an individual block is initially read, it can't be immediately added unless all its ancestors have been added, which is rare (only about 8% of the time), because the blocks are not sorted by height. When the block can't be immediately added to the block index, its disk location is saved in a map so it can be added later. When its parent is later added to the block index, `LoadExternalBlockFile()` reads and deserializes the block from disk a second time and adds it to the block index. Most blocks (92%) get deserialized twice.
### This PR
During the initial read, it's rarely useful to deserialize the entire block; only the header is needed to determine if the block can be added to the block index immediately. This change to `LoadExternalBlockFile()` initially deserializes only a block's header, then deserializes the entire block only if it can be added immediately. This reduces reindex time on mainnet by 7 hours on a Raspberry Pi, which translates to around a 25% reduction in the first part of reindexing (adding blocks to the index), and about a 6% reduction in overall reindex time.
Summary: The performance gain is the result of deserializing each block only once, except its header which is deserialized twice, but the header is only 80 bytes.
ACKs for top commit:
andrewtoth:
ACK db929893ef
achow101:
ACK db929893ef
aureleoules:
ACK db929893ef - minor changes and new benchmark since last review
theStack:
re-ACK db929893ef
stickies-v:
re-ACK db929893e
Tree-SHA512: 5a5377192c11edb5b662e18f511c9beb8f250bc88aeadf2f404c92c3232a7617bade50477ebf16c0602b9bd3b68306d3ee7615de58acfd8cae664d28bb7b0136
It's nice to be able to use named options and positional arguments together.
Most shell tools accept both, and python functions combine options and
arguments allowing them to be passed with even more flexibility. This change
adds support for python's approach so as a motivating example:
bitcoin-cli -named createwallet wallet_name=mywallet load_on_startup=1
Can be shortened to:
bitcoin-cli -named createwallet mywallet load_on_startup=1
JSON-RPC standard doesn't have a convention for passing named and positional
parameters together, so this implementation makes one up and interprets any
unused "args" named parameter as a positional parameter array.
fa3ea81c3e refactor: Add LIFETIMEBOUND / -Wdangling-gsl to Assert() (MacroFake)
Pull request description:
Currently compiles clean, but I think it may still be useful.
Can be tested by adding an `&`:
```diff
diff --git a/src/test/util_tests.cpp b/src/test/util_tests.cpp
index 5766fff92d..300c1ec60f 100644
--- a/src/test/util_tests.cpp
+++ b/src/test/util_tests.cpp
@@ -125,7 +125,7 @@ BOOST_AUTO_TEST_CASE(util_check)
// Check -Wdangling-gsl does not trigger when copying the int. (It would
// trigger on "const int&")
- const int nine{*Assert(std::optional<int>{9})};
+ const int& nine{*Assert(std::optional<int>{9})};
BOOST_CHECK_EQUAL(9, nine);
}
```
Output:
```
test/util_tests.cpp:128:29: warning: object backing the pointer will be destroyed at the end of the full-expression [-Wdangling-gsl]
const int& nine{*Assert(std::optional<int>{9})};
^~~~~~~~~~~~~~~~~~~~~
./util/check.h:75:50: note: expanded from macro 'Assert'
#define Assert(val) inline_assertion_check<true>(val, __FILE__, __LINE__, __func__, #val)
^~~
1 warning generated.
ACKs for top commit:
jonatack:
ACK fa3ea81c3e
theuni:
ACK fa3ea81c3e
Tree-SHA512: 17dea4d75f2ee2bf6e1b6a6f6d8f439711c777df0390574e8d8edb6ac9ee807a135341e4439050bd6a15ecc4097a1ba9a7ab15d27541ebf70a4e081fa6871877
3fcb545ab2 bench: benchmark transaction creation process (furszy)
a8a75346d7 wallet: SelectCoins, return early if target is covered by preset-inputs (furszy)
f41712a734 wallet: simplify preset inputs selection target check (furszy)
5baedc3351 wallet: remove fetch pre-selected-inputs responsibility from SelectCoins (furszy)
295852f619 wallet: encapsulate pre-selected-inputs lookup into its own function (furszy)
37e7887cb4 wallet: skip manually selected coins from 'AvailableCoins' result (furszy)
94c0766b0c wallet: skip available coins fetch if "other inputs" are disallowed (furszy)
Pull request description:
#### # Context (Current Flow on Master)
In the transaction creation process, in order to select which coins the new transaction will spend,
we first obtain all the available coins known by the wallet, which means walking-through the
wallet txes map, gathering the ones that fulfill certain spendability requirements in a vector.
This coins vector is then provided to the Coin Selection process, which first checks if the user
has manually selected any input (which could be internal, aka known by the wallet, or external),
and if it does, it fetches them by searching each of them inside the wallet and/or inside the
Coin Control external tx data.
Then, after finding the pre-selected-inputs and gathering them in a vector, the Coin Selection
process walks-through the entire available coins vector once more just to erase coins that are
in both vectors. So the Coin Selection process doesn’t pick them twice (duplicate inputs inside
the same transaction).
#### # Process Workflow Changes
Now, a new method, `FetchCoins` will be responsible for:
1) Lookup the user pre-selected-inputs (which can be internal or external).
2) And, fetch the available coins in the wallet (excluding the already fetched ones).
Which will occur prior to the Coin Selection process. Which allows us to never include the
pre-selected-inputs inside the available coins vector in the first place, as well as doing other
nice improvements (written below).
So, Coin Selection can perform its main responsibility without mixing it with having to fetch
internal/external coins nor any slow and unneeded duplicate coins verification.
#### # Summarizing the Improvements:
1) If any pre-selected-input lookup fail, the process will return the error right away.
(before, the wallet was fetching all the wallet available coins, walking through the
entire txes map, and then failing for an invalid pre-selected-input inside SelectCoins)
2) The pre-selected-inputs lookup failure causes are properly described on the return error.
(before, we were returning an "Insufficient Funds" error for everything, even if the failure
was due a not solvable external input)
3) **Faster Coin Selection**: no longer need to "remove the pre-set inputs from the available coins
vector so that Coin Selection doesn't pick them" (which meant to loop-over the entire
available coins vector at Coin Selection time, erasing duplicate coins that were pre-selected).
Now, the available coins vector, which is built after the pre-selected-inputs fetching,
doesn’t include the already selected inputs in the first place.
4) **Faster transaction creation** for transactions that only use manually selected inputs.
We now will return early, as soon as we finish fetching the pre-selected-inputs and
not perform the resources expensive calculation of walking-through the entire wallet
txes map to obtain the available coins (coins that we will not use).
---------------------------
Added a new bench (f6d0bb2) measuring the transaction creation process, for a wallet with ~250k UTXO, only using the pre-selected-inputs inside coin control. Setting `m_allow_other_inputs=false` to disallow the wallet to include coins automatically.
#### Result on this PR (tip f6d0bb2d):
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 1,048,675.00 | 953.58 | 0.3% | 0.06 | `WalletCreateTransaction`
vs
#### Result on master (tip 4a4289e2):
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 96,373,458.20 | 10.38 | 0.2% | 5.30 | `WalletCreateTransaction`
The benchmark took to run in master: **96.37 milliseconds**, while in this PR: **1 millisecond** 🚀 .
ACKs for top commit:
S3RK:
Code Review ACK 3fcb545ab2
achow101:
ACK 3fcb545ab2
aureleoules:
reACK 3fcb545ab2
Tree-SHA512: 42f833e92f40c348007ca565a4c98039e6f1ff25d8322bc2b27115824744779baf0b0a38452e4e2cdcba45076473f1028079bbd0f670020481ec5d3db42e4731
Goal 1:
Benchmark the transaction creation process for pre-selected-inputs only.
Setting `m_allow_other_inputs=false` to disallow the wallet to include coins automatically.
Goal 2:
Benchmark the transaction creation process for pre-selected-inputs and coin selection.
-----------------------
Benchmark Setup:
1) Generates a 5k blockchain, loading the wallet with 5k transactions with two outputs each.
2) Fetch 4 random UTXO from the wallet's available coins and pre-select them as inputs inside CoinControl.
Benchmark (Goal 1):
Call `CreateTransaction` providing the coin control, who has set `m_allow_other_inputs=false` and
the manually selected coins.
Benchmark (Goal 2):
Call `CreateTransaction` providing the coin control, who has set `m_allow_other_inputs=true` and
the manually selected coins.
aaaa7bd0ba iwyu: Add missing includes (MacroFake)
fa9ebec096 Remove g_parallel_script_checks (MacroFake)
fa7c834b9f Move ::fCheckBlockIndex into ChainstateManager (MacroFake)
fa43188d86 Move ::fCheckpointsEnabled into ChainstateManager (MacroFake)
cccca83099 Move ::nMinimumChainWork into ChainstateManager (MacroFake)
fa29d0b57c Move ::hashAssumeValid into ChainstateManager (MacroFake)
faf44876db Move ::nMaxTipAge into ChainstateManager (MacroFake)
Pull request description:
It seems preferable to assign globals to a class (in this case `ChainstateManager`), than to leave them dangling. This should clarify scope for code-readers, as well as clarifying unit test behaviour.
ACKs for top commit:
dergoegge:
Code review ACK aaaa7bd0ba
ryanofsky:
Code review ACK aaaa7bd0ba. No changes since last review, other than rebase
aureleoules:
reACK aaaa7bd0ba
Tree-SHA512: 83ec3ba0fb4f1dad95810d4bd4e578454e0718dc1bdd3a794cc4e48aa819b6f5dad4ac4edab3719bdfd5f89cbe23c2740a50fd56c1ff81c99e521c5f6d4e898d
SkipTo() reads data from the file into the CBufferedFile object
(memory), but, unlike this object's read() method, SkipTo() doesn't
transfer data into a caller's memory buffer. This is useful because
after skipping forward in the stream in this way, the user can, if
needed, rewind the stream (SetPos()) and access the object's memory
buffer including ranges that were skipped over (without needing to
read from the disk file).
This changes the flag for the bitcoin-chainstate executable. Previously
it was false, now it is the chain's default value (still false for the
main chain).
If we're connecting to the peer which might support
transaction reconciliation, we announce we want to reconcile
with them.
We store the reconciliation salt so that when the peer
responds with their salt, we are able to compute the
full reconciliation salt.
This behavior is enabled with a CLI flag.
626b7c8493 fuzz: add scanblocks as safe for fuzzing (James O'Beirne)
94fe5453c7 test: rpc: add scanblocks functional test (Jonas Schnelli)
6ef2566b68 rpc: add scanblocks - scan for relevant blocks with descriptors (Jonas Schnelli)
a4258f6e81 rpc: move-only: consolidate blockchain scan args (James O'Beirne)
Pull request description:
Revives #20664. All feedback from the previous PR has either been responded to inline or incorporated here.
---
Major changes from Jonas' PR:
- consolidated arguments for scantxoutset/scanblocks
- substantial cleanup of the functional test
Here's the range-diff (`git range-diff master jonasschnelli/2020/12/filterblocks_rpc jamesob/2021-11-scanblocks`): https://gist.github.com/jamesob/aa4a975344209f0316444b8de2ec1d18
### Original PR description
> The `scanblocks` RPC call allows one to get relevant blockhashes from a set of descriptors by scanning all blockfilters in a given range.
>
> **Example:**
>
> `scanblocks start '["addr(<bitcoin_address>)"]' 661000` (returns relevant blockhashes for `<bitcoin_address>` from blockrange 661000->tip)
>
> ## Why is this useful?
> **Fast wallet rescans**: get the relevant blocks and only rescan those via `rescanblockchain getblockheader(<hash>)[height] getblockheader(<hash>)[height])`. A future PR may add an option to allow to provide an array of blockhashes to `rescanblockchain`.
>
> **prune wallet rescans**: (_needs additional changes_): together with a call to fetch blocks from the p2p network if they have been pruned, it would allow to rescan wallets back to the genesis block in pruned mode (relevant #15946).
>
> **SPV mode** (_needs additional changes_): it would be possible to build the blockfilterindex from the p2p network (rather then deriving them from the blocks) and thus allow some sort of hybrid-SPV mode with moderate bandwidth consumption (related #9483)
ACKs for top commit:
furszy:
diff re-ACK 626b7c8
Tree-SHA512: f84e4dcb851b122b39e9700c58fbc31e899cdcf9b587df9505eaf1f45578cc4253e89ce2a45d1ff21bd213e31ddeedbbcad2c80810f46755b30acc17b07e2873
bf95976061 doc: add note about snapshot chainstate init (James O'Beirne)
e4d7995286 test: add testcases for snapshot initialization (James O'Beirne)
cced4e7336 test: move-only-ish: factor out LoadVerifyActivateChainstate() (James O'Beirne)
51fc9241c0 test: allow on-disk coins and block tree dbs in tests (James O'Beirne)
3c361391b8 test: add reset_chainstate parameter for snapshot unittests (James O'Beirne)
00b357c215 validation: add ResetChainstates() (James O'Beirne)
3a29dfbfb2 move-only: test: make snapshot chainstate setup reusable (James O'Beirne)
8153bd9247 blockmanager: avoid undefined behavior during FlushBlockFile (James O'Beirne)
ad67ff377c validation: remove snapshot datadirs upon validation failure (James O'Beirne)
34d1590331 add utilities for deleting on-disk leveldb data (James O'Beirne)
252abd1e8b init: add utxo snapshot detection (James O'Beirne)
f9f1735f13 validation: rename snapshot chainstate dir (James O'Beirne)
d14bebf100 db: add StoragePath to CDBWrapper/CCoinsViewDB (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: https://github.com/bitcoin/bitcoin/pull/15606)
---
Half of the replacement for #24232. The original PR grew larger than expected throughout the review process.
This change adds the ability to initialize a snapshot-based chainstate during init if one is detected on disk. This is of course unused as of now (aside from in unittests) given that we haven't yet enabled actually loading snapshots.
Don't be scared! There are some big move-only commits in here.
Accompanying changes include:
- moving the snapshot coinsdb directory from being called `chainstate_[base blockhash]` to `chainstate_snapshot`, since we only support one snapshot in use at a time. This simplifies some logic, but it necessitates writing that base blockhash out to a file within the coinsdb dir. See [discussion here](https://github.com/bitcoin/bitcoin/pull/24232#discussion_r832762880).
- adding a simple fix in `FlushBlockFile()` that avoids a crash when attemping to flush to disk before `LoadBlockIndexDB()` is called, which happens when calling `MaybeRebalanceCaches()` during multiple chainstate init.
- improving the unittest to allow testing with on-disk chainstates - necessary to test a simulated restart and re-initialization.
ACKs for top commit:
naumenkogs:
utACK bf95976061
ariard:
Code Review ACK bf9597606
ryanofsky:
Code review ACK bf95976061. Changes since last review: rebasing, switching from CAutoFile to AutoFile, adding comments, switching from BOOST_CHECK to Assert in test util, using chainman.GetMutex() in tests, destroying one ChainstateManager before creating a new one in tests
fjahr:
utACK bf95976061
aureleoules:
ACK bf95976061
Tree-SHA512: 15ae75caf19f8d12a12d2647c52897904d27b265a7af6b4ae7b858592eeadb8f9da6c2394b6baebec90adc28742c053e3eb506119577dae7c1e722ebb3b7bcc0
861cb3fadc test: move SyncWithValidationInterfaceQueue() before Stop() in txindex_tests (Vasil Dimov)
6526dc3b78 test: silence TSAN false positive in coinstatsindex_initial_sync (Vasil Dimov)
Pull request description:
Silence false positives from TSAN about unsynchronized calls to `BaseIndex::~BaseIndex()` and `BaseIndex::SetBestBlockIndex()`. They are synchronized, but beyond the comprehension of TSAN - by `SyncWithValidationInterfaceQueue()`, called from `BaseIndex::BlockUntilSyncedToCurrentChain()`.
Fixes https://github.com/bitcoin/bitcoin/issues/25365
ACKs for top commit:
MarcoFalke:
review ACK 861cb3fadc
ryanofsky:
Code review ACK 861cb3fadc. Just comment change since last review.
Tree-SHA512: 8c30fdf2fd11d54e9adfa68a67185ab820bd7bd9f7f3ad6456e7e6d219fa9cf6d34b41e98e723eae86cb0c1baef7f3fc57b1b011a13dc3fe3d78334b9b5596de
b527b54950 net: convert standalone SetSocketNonBlocking() to Sock::SetNonBlocking() (Vasil Dimov)
29f66f7682 moveonly: move SetSocketNonBlocking() from netbase to util/sock (Vasil Dimov)
b4bac55679 net: convert standalone IsSelectableSocket() to Sock::IsSelectable() (Vasil Dimov)
5db7d2ca0a moveonly: move IsSelectableSocket() from compat.h to sock.{h,cpp} (Vasil Dimov)
Pull request description:
_This is a piece of #21878, chopped off to ease review._
* convert standalone `IsSelectableSocket()` to `Sock::IsSelectable()`
* convert standalone `SetSocketNonBlocking()` to `Sock::SetNonBlocking()`
This further encapsulates syscalls inside the `Sock` class and makes the callers mockable.
ACKs for top commit:
jonatack:
ACK b527b54950 review/debug build/unit tests at each commit, cross-referenced the changes with `man select` and `man errno`, ran a signet node on the last commit with ip4/ip6//tor/i2p/cjdns and network connections were nominal
dergoegge:
Code review ACK b527b54950
Tree-SHA512: af783ce558c7a89e173f7ab323fb3517103d765c19b5d14de29f64706b4e1fea3653492e8ea73ae972699986aaddf2ae72c7cfaa7dad7614254283083b7d2632
bcb0cacac2 reindex, log, test: fixes #21379 (mruddy)
Pull request description:
Fixes #21379.
The blocks/blk?????.dat files are mutated and become increasingly malformed, or corrupt, as a result of running the re-indexing process.
The mutations occur after the re-indexing process has finished, as new blocks are appended, but are a result of a re-indexing process miscalculation that lingers in the block manager's `m_blockfile_info` `nSize` data until node restart.
These additions to the blk files are non-fatal, but also not desirable.
That is, this is a form of data corruption that the reading code is lenient enough to process (it skips the extra bytes), but it adds some scary looking log messages as it encounters them.
The summary of the problem is that the re-index process double counts the size of the serialization header (magic message start bytes [4 bytes] + length [4 bytes] = 8 bytes) while calculating the blk data file size (both values already account for the serialization header's size, hence why it is over accounted).
This bug manifests itself in a few different ways, after re-indexing, when a new block from a peer is processed:
1. If the new block will not fit into the last blk file processed while re-indexing, while remaining under the 128MiB limit, then the blk file is flushed to disk and truncated to a size that is 8 greater than it should be. The truncation adds zero bytes (see `FlatFileSeq::Flush` and `TruncateFile`).
1. If the last blk file processed while re-indexing has logical space for the new block under the 128 MiB limit:
1. If the blk file was not already large enough to hold the new block, then the zeros are, in effect, added by `fseek` when the file is opened for writing. Eight zero bytes are added to the end of the last blk file just before the new block is written. This happens because the write offset is 8 too great due to the miscalculation. The result is 8 zero bytes between the end of the last block and the beginning of the next block's magic + length + block.
1. If the blk file was already large enough to hold the new block, then the current existing file contents remain in the 8 byte gap between the end of the last block and the beginning of the next block's magic + length + block. Commonly, when this occcurs, it is due to the blk file containing blocks that are not connected to the block tree during reindex and are thus left behind by the reindex process and later overwritten when new blocks are added. The orphaned blocks can be valid blocks, but due to the nature of concurrent block download, the parent may not have been retrieved and written by the time the node was previously shutdown.
ACKs for top commit:
LarryRuane:
tested code-review ACK bcb0cacac2
ryanofsky:
Code review ACK bcb0cacac2. This is a disturbing bug with an easy fix which seems well-worth merging.
mzumsande:
ACK bcb0cacac2 (reviewed code and did some testing, I agree that it fixes the bug).
w0xlt:
tACK bcb0cacac2
Tree-SHA512: acc97927ea712916506772550451136b0f1e5404e92df24cc05e405bb09eb6fe7c3011af3dd34a7723c3db17fda657ae85fa314387e43833791e9169c0febe51
04526787b5 Validate `port` options (amadeuszpawlik)
f8387c4234 Validate port value in `SplitHostPort` (amadeuszpawlik)
Pull request description:
Validate `port`-options, so that invalid values are rejected early in the startup.
Ports are `uint16_t`s, which effectively limits a port's value to <=65535. As discussed in https://github.com/bitcoin/bitcoin/pull/24116 and https://github.com/bitcoin/bitcoin/pull/24344, port "0" is considered invalid too.
Proposed in https://github.com/bitcoin/bitcoin/issues/21893#issuecomment-835784223
The `SplitHostPort(std::string in, uint16_t& portOut, std::string& hostOut)` now returns a bool that indicates whether the port value was set and within the allowed range. This is an improvement that can be used not only for port validation of options at startup, but also in rpc calls, etc,
ACKs for top commit:
luke-jr:
utACK 04526787b5
ryanofsky:
Code review ACK 04526787b5. Just suggested changes since last review: reverting some SplitHostPort changes, adding release notes, avoiding 'GetArgs[0]` problem.
Tree-SHA512: f1ac80bf98520b287a6413ceadb41bc3a93c491955de9b9319ee1298ac0ab982751905762a287e748997ead6198a8bb7a3bc8817ac9e3d2468e11ab4a0f8496d