0
0
Fork 0
mirror of https://github.com/bitcoin/bitcoin.git synced 2025-02-25 12:51:55 -05:00
Commit graph

118 commits

Author SHA1 Message Date
fanquake
ecab855838
Merge bitcoin/bitcoin#28195: blockstorage: Drop legacy -txindex check
fae405556d scripted-diff: Rename CBlockTreeDB -> BlockTreeDB (MarcoFalke)
faf63039cc Fixup style of moved code (MarcoFalke)
fa65111b99 move-only: Move CBlockTreeDB to node/blockstorage (MarcoFalke)
fa8685597e index: Drop legacy -txindex check (MarcoFalke)
fa69148a0a scripted-diff: Use blocks_path where possible (MarcoFalke)

Pull request description:

  The only reason for the check was to print a warning about an increase in storage use. Now that 22.x is EOL and everyone should have migrated (or decided to not care about storage use), remove the check.

  Also, a move-only commit is included. (Rebased from https://github.com/bitcoin/bitcoin/pull/22242)

ACKs for top commit:
  TheCharlatan:
    ACK fae405556d, though I lack historical context to really judge the second commit fa8685597e.
  stickies-v:
    ACK fae405556d

Tree-SHA512: 9da8f48767ae52d8e8e21c09a40c949cc0838794f1856cc5f58a91acd3f00a3bca818c8082242b3fdc9ca5badb09059570bb3870850d3807b75a8e23b5222da1
2023-09-05 11:37:35 +01:00
fanquake
be44332803
Merge bitcoin/bitcoin#28191: refactor: Remove unused MessageStartChars parameters from BlockManager methods
fa69e3a95c Remove unused MessageStartChars parameters from BlockManager methods (MarcoFalke)

Pull request description:

  Seems odd to expose these for mocking, when it is not needed.

  Fix this by removing the the unused parameters and use the already existing member field instead.

ACKs for top commit:
  Empact:
    utACK fa69e3a95c
  dergoegge:
    utACK fa69e3a95c

Tree-SHA512: 7814e9560abba8d9c0926bcffc70f92e502d22f543af43671248f6fcd1433f35238553c0f05123fde6d8e0f80261af0ab0500927548115153bd68d57fe2da746
2023-08-07 10:57:39 +02:00
MarcoFalke
fae405556d
scripted-diff: Rename CBlockTreeDB -> BlockTreeDB
-BEGIN VERIFY SCRIPT-
 sed -i 's|CBlockTreeDB|BlockTreeDB|g' $( git grep -l CBlockTreeDB )
-END VERIFY SCRIPT-
2023-08-02 07:49:32 +02:00
MarcoFalke
faf63039cc
Fixup style of moved code
Can be reviewed with --word-diff-regex=.
2023-08-01 15:27:51 +02:00
MarcoFalke
fa65111b99
move-only: Move CBlockTreeDB to node/blockstorage
The block index (CBlockTreeDB) is required to write and read blocks, so
move it to blockstorage. This allows to drop the txdb.h include from
`node/blockstorage.h`.

Can be reviewed with:

--color-moved=dimmed-zebra  --color-moved-ws=ignore-all-space
2023-08-01 15:27:33 +02:00
MarcoFalke
fa69e3a95c
Remove unused MessageStartChars parameters from BlockManager methods 2023-07-31 14:32:57 +02:00
Suhas Daftuar
d0d40ea9a6 Move block-storage-related logic to ChainstateManager
Separate the notion of which blocks are stored on disk, and what data is in our
block index, from what tip a chainstate might be able to get to. We can use
chainstate-agnostic data to determine when to store a block on disk (primarily,
an anti-DoS set of criteria) and let the chainstates figure out for themselves
when a block is of interest for being a candidate tip.

Note: some of the invariants in CheckBlockIndex are modified, but more work is
needed (ie to move CheckBlockIndex to ChainstateManager, as most of what
CheckBlockIndex is doing is checking the consistency of the block index, which
is outside of Chainstate).
2023-07-21 10:09:44 -04:00
Suhas Daftuar
1cfc887d00 Remove CChain dependency in node/blockstorage 2023-07-14 14:54:57 -04:00
Suhas Daftuar
fe86a7cd48 Explicitly track maximum block height stored in undo files
When writing a new block to disk, if we have filled up the current block file,
then we flush and truncate that block file (to free allocated but unused
space) before advancing to the next one. When this happens, we have to
determine whether to also flush and truncate the corresponding undo file.

Undo data is only written when blocks are connected, not when blocks are
received. Thus it's possible that the corresponding undo file already has all
the data it will ever have, and we should flush/truncate it as we advance
files; or it's possible that there is more data we expect to write, and should
therefore defer flush/truncation until undo data is later written.

Prior to this commit, we made the determination of whether the undo file was
full of all requisite data by comparing against the chain tip. This patch
replaces that dependence on validation data structures by instead just tracking
the highest height of any block written in the undo file as we go.
2023-07-14 14:47:00 -04:00
TheCharlatan
462390c85f
refactor: Move stopafterblockimport handling out of blockstorage
This has the benefit of moving the StartShutdown call out of the
blockstorage file and thus out of the kernel's responsibility. The user
can now decide if he wants to start shutdown / interrupt after a block
import or not.
2023-07-11 12:00:57 +02:00
furszy
ca91c244ef
index: verify blocks data existence only once
At present, during init, we traverse the chain (once per index)
to confirm that all necessary blocks to sync each index up to
the current tip are present.

To make the process more efficient, we can fetch the oldest block
from the indexers and perform the chain data existence check from
that point only once.

This also moves the pruning violation check to the end of the
'loadinit' thread, which is where the reindex, block loading and
chain activation processes happen.

Making the node's startup process faster, allowing us to remove
the global g_indexes_ready_to_sync flag, and enabling the
execution of the pruning violation verification even when the
reindex or reindex-chainstate flags are enabled (which has being
skipped so far).
2023-07-10 10:50:50 -03:00
furszy
2ec89f1970
refactor: simplify pruning violation check
By generalizing 'GetFirstStoredBlock' and implementing
'CheckBlockDataAvailability' we can dedup code and
avoid repeating work when multiple indexes are enabled.
E.g. get the oldest block across all indexes and
perform the pruning violation check from that point
up to the tip only once (this feature is being introduced
in a follow-up commit).

This commit shouldn't change behavior in any way.

Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
2023-07-10 10:50:50 -03:00
furszy
c82ef91eae
make GetFirstStoredBlock assert that 'start_block' always has data
And transfer the responsibility of verifying whether 'start_block'
has data or not to the caller.

This is because the 'GetFirstStoredBlock' function responsibility
is to return the first block containing data. And the current
implementation can return 'start_block' when it has no data!. Which
is misleading at least.

Edge case behavior change:
Previously, if the block tip lacked data but all preceding blocks
contained data, there was no prune violation. And now, such
scenario will result in a prune violation.
2023-07-10 10:47:17 -03:00
furszy
04575106b2
scripted-diff: rename 'loadblk' thread name to 'initload'
The thread does not only load blocks, it loads the mempool and,
in a future commit, will start the indexes as well.

Also, renamed the 'ThreadImport' function to 'ImportBlocks'
And the 'm_load_block' class member to 'm_thread_load'.

-BEGIN VERIFY SCRIPT-

sed -i "s/ThreadImport/ImportBlocks/g" $(git grep -l ThreadImport -- ':!/doc/')
sed -i "s/loadblk/initload/g" $(git grep -l loadblk -- ':!/doc/release-notes/')
sed -i "s/m_load_block/m_thread_load/g" $(git grep -l m_load_block)

-END VERIFY SCRIPT-
2023-07-07 19:31:27 -03:00
furszy
ed4462cc78
init: start indexes sync earlier
The mempool load can take a while, and it is not
needed for the indexes' synchronization.

Also, having the mempool load function call
inside 'blockstorage.cpp' wasn't structurally
correct.
2023-07-07 19:31:26 -03:00
TheCharlatan
6eb33bd0c2
kernel: Add fatalError method to notifications
FatalError replaces what previously was the AbortNode function in
shutdown.cpp.

This commit is part of the libbitcoinkernel project and further removes
the shutdown's and, more generally, the kernel library's dependency on
interface_ui with a kernel notification method. By removing interface_ui
from the kernel library, its dependency on boost is reduced to just
boost::multi_index. At the same time it also takes a step towards
de-globalising the interrupt infrastructure.

Co-authored-by: Russell Yanofsky <russ@yanofsky.org>
Co-authored-by: TheCharlatan <seb.kung@gmail.com>
2023-06-28 09:52:33 +02:00
TheCharlatan
7320db96f8
kernel: Add flushError method to notifications
This is done in addition with the following commit. Both have the goal
of getting rid of direct calls to AbortNode from kernel code. This extra
flushError method is added to notify specifically about errors that
arrise when flushing (syncing) block data to disk. Unlike other
instances, the current calls to AbortNode in the blockstorage flush
functions do not report an error to their callers.

This commit is part of the libbitcoinkernel project and further removes
the shutdown's and, more generally, the kernel library's dependency on
interface_ui with a kernel notification method. By removing interface_ui
from the kernel library, its dependency on boost is reduced to just
boost::multi_index. At the same time it also takes a step towards
de-globalising the interrupt infrastructure.
2023-06-28 09:52:32 +02:00
TheCharlatan
edb55e2777
kernel: Pass interrupt reference to chainman
This and the following commit seek to decouple the libbitcoinkernel
library from the shutdown code. As a library, it should it should have
its own flexible interrupt infrastructure without relying on node-wide
globals.

The commit takes the first step towards this goal by de-globalising
`ShutdownRequested` calls in kernel code.

Co-authored-by: Russell Yanofsky <russ@yanofsky.org>
Co-authored-by: TheCharlatan <seb.kung@gmail.com>
2023-06-28 09:52:27 +02:00
Andrew Chow
caff95a023
Merge bitcoin/bitcoin#27896: Remove the syscall sandbox
32e2ffc393 Remove the syscall sandbox (fanquake)

Pull request description:

  After initially being merged in #20487, it's no-longer clear that an internal syscall sandboxing mechanism is something that Bitcoin Core should have/maintain, especially when compared to better maintained/supported alterantives, i.e [firejail](https://github.com/netblue30/firejail).

  There is more related discussion in #24771.

  Note that given where it's used, the sandbox also gets dragged into the kernel.

  If it's removed, this should not require any sort of deprecation, as this was only ever an opt-in, experimental feature.

  Closes #24771.

ACKs for top commit:
  davidgumberg:
     crACK 32e2ffc393
  achow101:
    ACK 32e2ffc393
  dergoegge:
    ACK 32e2ffc393

Tree-SHA512: 8cf71c5623bb642cb515531d4a2545d806e503b9d57bfc15a996597632b06103d60d985fd7f843a3c1da6528bc38d0298d6b8bcf0be6f851795a8040d71faf16
2023-06-27 18:19:21 -04:00
fanquake
32e2ffc393
Remove the syscall sandbox
After initially being merged in #20487, it's no-longer clear that an
internal syscall sandboxing mechanism is something that Bitcoin Core
should have/maintain, especially when compared to better
maintained/supported alterantives, i.e firejail.

Note that given where it's used, the sandbox also gets dragged into the
kernel.

There is some related discussion in #24771.

This should not require any sort of deprecation, as this was only ever
an opt-in, experimental feature.

Closes #24771.
2023-06-16 10:38:19 +01:00
Jon Atack
daa5a658c0 refactor: rename BCLog::BLOCKSTORE to BLOCKSTORAGE
so the enum name is the same as its value, like the other BCLog enums.
2023-06-15 10:27:56 -06:00
furszy
9ddf7e03a3
move ThreadImport ABC error to use AbortNode
'StartShutdown' should only be used for user requested
shutdowns. Internal errors that cause a shutdown should
use 'AbortNode'.
2023-06-08 16:38:36 -03:00
TheCharlatan
7d3b35004b
refactor: Move system from util to common library
Since the kernel library no longer depends on the system file, move it
to the common library instead in accordance to the diagram in
doc/design/libraries.md.
2023-05-20 12:08:13 +02:00
TheCharlatan
9ec5da36b6
refactor: Move ScheduleBatchPriority to its own file
With the previous move of AlertNotify out of the validation file, and
thus out of the kernel library, ScheduleBatchPriority is the last
remaining function used by the kernel library from util/system. Move it
to its own file, such that util/system can be moved out of the util
library in the following few commits.

Moving util/system out of the kernel library removes further networking
as well as shell related code from it.
2023-05-20 12:03:30 +02:00
Martin Zumsande
97844d9268 index: Enable reindex-chainstate with active indexes
This is achieved by letting the index sync thread wait until
reindex-chainstate is finished.

This also disables the pruning check when reindexing the chainstate (which is
incompatible with prune mode) because there would be no chain at this point
in init.
2023-05-17 11:14:28 -04:00
TheCharlatan
5ff63a09a9
refactor, blockstorage: Replace stopafterblockimport arg
Add a stop_after_block_import field to the BlockManager options. Use
this field instead of the global gArgs.

This should allow users of the BlockManager to not rely on the global
Args.
2023-05-10 19:07:46 +02:00
TheCharlatan
18e5ba7c80
refactor, blockstorage: Replace blocksdir arg
Add a blocks_dir field to the BlockManager options. Move functions
relying on the global gArgs to get the blocks_dir into the BlockManager
class.

This should eventually allow users of the BlockManager to not rely on
the global Args and instead pass in their own options.
2023-05-10 19:07:44 +02:00
TheCharlatan
02a0899527
refactor, BlockManager: Replace fastprune from arg with options
Remove access to the global gArgs for the fastprune argument and
replace it by adding a field to the existing BlockManager Options
struct.

When running `clang-tidy-diff` on this commit, there is a diagnostic
error: `unknown type name 'uint64_t' [clang-diagnostic-error] uint64_t
prune_target{0};`, which is fixed by including cstdint.

This should eventually allow users of the BlockManager to not rely on
the global gArgs and instead pass in their own options.
2023-05-10 19:07:42 +02:00
TheCharlatan
f0bb1021f0
refactor: Move functions to BlockManager methods
This is a commit in preparation for the next few commits. The functions
are moved to methods to avoid their re-declaration for the purpose of
passing in BlockManager options.

The functions that were now moved into the BlockManager should no longer
use the params as an argument, but instead use the member variable.

In the moved ReadBlockFromDisk and UndoReadFromDisk, change
the function signature to accept a reference to a CBlockIndex instead of
a raw pointer. The pointer is expected to be non-null, so reflect that
in the type.

To allow for the move of functions to BlockManager methods all call
sites require an instantiated BlockManager, or a callback to one.
2023-05-10 19:06:53 +02:00
MarcoFalke
fa5d7c39eb
Remove unused chainparams from BlockManager methods
Also, replace pointer with reference while touching the signature.
2023-05-04 19:27:23 +02:00
MarcoFalke
fa3f74a40e
Replace pindex pointer with block reference
pindex can not be nullptr, so document that, and clear it up in the next
commit.
2023-05-04 19:26:48 +02:00
MarcoFalke
facdb8b331
Add BlockManagerOpts::chainparams reference
and use it in blockstorage.cpp
2023-05-04 19:26:43 +02:00
fanquake
8a373a5c7f
Merge bitcoin/bitcoin#27191: blockstorage: Adjust fastprune limit if block exceeds blockfile size
8f14fc8622 test: cover fastprune with excessive block size (Matthew Zipkin)
271c23e87f blockstorage: Adjust fastprune limit if block exceeds blockfile size (Martin Zumsande)

Pull request description:

  The debug-only `-fastprune` option used in several tests is not always safe to use:
  If a `-fastprune` node receives a block larger than the maximum blockfile size of `64kb` bad things happen: The while loop in `BlockManager::FindBlockPos` never terminates, and the node runs oom because memory for `m_blockfile_info` is allocated in each iteration of the loop.
  The same would happen if a naive user used `-fastprune` on anything other than regtest (so this can be tested by syncing on signet for example, the first block that crashes the node is at height 2232).

  Change the approach by raising the blockfile size to the size of the block, if that block otherwise wouldn't fit (idea by TheCharlatan).

ACKs for top commit:
  ryanofsky:
    Code review ACK 8f14fc8622. Added new assert, test, and comment since last review
  TheCharlatan:
    ACK 8f14fc8622
  pinheadmz:
    ACK 8f14fc8622

Tree-SHA512: df2fea30613ef9d40ebbc2416eacb574f6d7d96847db5c33dda22a29a2c61a8db831aa9552734ea4477e097f253dbcb6dcb1395d43d2a090cc0588c9ce66eac3
2023-05-02 10:04:34 +01:00
Martin Zumsande
271c23e87f blockstorage: Adjust fastprune limit if block exceeds blockfile size
If the added block exceeds the blockfile size in test-only
-fastprune mode, the node would get stuck in an infinite loop and
run out of memory.

Avoid this by raising the blockfile size to the size of the added block
in this situation.

Co-authored-by: TheCharlatan <seb.kung@gmail.com>
2023-04-19 11:25:07 -04:00
TheCharlatan
be55f545d5
move-only: Extract common/args and common/config.cpp from util/system
This is an extraction of ArgsManager related functions from util/system
into their own common file.

Config file related functions are moved to common/config.cpp.

The background of this commit is an ongoing effort to decouple the
libbitcoinkernel library from the ArgsManager. The ArgsManager belongs
into the common library, since the kernel library should not depend on
it. See doc/design/libraries.md for more information on this rationale.
2023-04-19 10:48:30 +02:00
TheCharlatan
00e9b97f37
refactor: Move fs.* to util/fs.*
The fs.* files are already part of the libbitcoin_util library. With the
introduction of the fs_helpers.* it makes sense to move fs.* into the
util/ directory as well.
2023-03-23 12:55:18 +01:00
fanquake
e695d8536e
Merge bitcoin/bitcoin#26177: refactor / kernel: Move non-gArgs chainparams functionality to kernel
b3e78dc91d refactor: Don't use global chainparams in chainstatemanager method (TheCharlatan)
382b692a50 Split non/kernel chainparams (Carl Dong)
edabbc78a3 Add factory functions for Main/Test/Sig/Reg chainparams (Carl Dong)
d938098398 Remove UpdateVersionBitsParameters (Carl Dong)
84b85786f0 Decouple RegTestChainParams from ArgsManager (Carl Dong)
76cd4e7c96 Decouple SigNetChainParams from ArgsManager (Carl Dong)

Pull request description:

  This pull request is part of the `libbitcoinkernel` project https://github.com/bitcoin/bitcoin/issues/24303 https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". dongcarl is the original author of this patchset, these commits were taken from https://github.com/dongcarl/bitcoin/tree/2022-03-libbitcoinkernel-chainparams-args-only.

  #### Context

  The bitcoin kernel library currently relies on code containing user configurations through the `ArgsManager`. This is not optimal, since as a stand-alone library it should not rely on bitcoind's argument parsing logic. Instead, its interfaces should accept control and options structs that control the kernel library's desired configuration.

  Similar work towards decoupling the `ArgsManager` from the kernel has been done in
  https://github.com/bitcoin/bitcoin/pull/25290, https://github.com/bitcoin/bitcoin/pull/25487, https://github.com/bitcoin/bitcoin/pull/25527 and https://github.com/bitcoin/bitcoin/pull/25862.

  #### Changes

  By moving the `CChainParams` class definition into the kernel and giving it new factory functions `CChainParams::{RegTest,SigNet,Main,TestNet}`it can be constructed without an `ArgsManager` reference, unlike the current factory function `CreateChainParams`.

  The first few commits remove uses of `ArgsManager` within `CChainParams`. Then the `CChainParams` definition is moved to a new file in the `kernel/` subdirectory.

ACKs for top commit:
  MarcoFalke:
    re-ACK b3e78dc91d 🛁
  ryanofsky:
    Code review ACK b3e78dc91d. Only changes since last review were recent review suggestions.
  ajtowns:
    ACK b3e78dc91d

Tree-SHA512: 3835aca1d3e3c75cc3303dd584bab3a77e58f6c678724a5e359fe4b0e17e0763a00931ee6191f516b9fde50496f59cc691f0709c0254206db3863bbf7ab2cacd
2023-03-16 13:56:35 +00:00
Carl Dong
382b692a50
Split non/kernel chainparams
Moves chainparams code not using the ArgsManager to the kernel.

Subsequently use the kernel chainparams header now where possible in
order to further decouple chainparams call sites from gArgs.
2023-03-15 16:43:31 +01:00
MarcoFalke
fa9bd7be47
Move ::fImporting to BlockManager 2023-03-15 15:48:44 +01:00
MarcoFalke
fa442b1377
Pass fImporting to ImportingNow helper class 2023-03-15 15:47:48 +01:00
MarcoFalke
fa177d7b6b
Move ::fPruneMode into BlockManager 2023-03-15 15:47:42 +01:00
MarcoFalke
fa721f1cab
Move ::nPruneTarget into BlockManager 2023-03-15 15:33:12 +01:00
Ben Woosley
aaced5633b
refactor: Move error() from util/system.h to logging.h
error is a low-level function with a sole dependency on LogPrintf, which
is defined in logging.h

The background of this commit is an ongoing effort to decouple the
libbitcoinkernel library from the ArgsManager defined in system.h.
Moving the function out of system.h allows including it from a separate
source file without including the ArgsManager definitions from system.h.
2023-03-13 17:09:54 +01:00
Andrew Chow
bb136aaf2c
Merge bitcoin/bitcoin#26533: prune: scan and unlink already pruned block files on startup
3141eab9c6 test: add functional test for ScanAndUnlinkAlreadyPrunedFiles (Andrew Toth)
e252909e56 test: add unit test for ScanAndUnlinkAlreadyPrunedFiles (Andrew Toth)
77557dda4a prune: scan and unlink already pruned block files on startup (Andrew Toth)

Pull request description:

  There are a few cases where we can mark a block and undo file as pruned in our block index, but not actually remove the files from disk.
  1. If we call `FindFilesToPrune` or `FindFilesToPruneManual` and crash before `UnlinkPrunedFiles`.
  2. If on Windows there is an open file handle to the file somewhere else when calling `fs::remove` in `UnlinkPrunedFiles` (https://en.cppreference.com/w/cpp/filesystem/remove, https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-deletefilew#remarks). This could be from another process, or if we are calling `ReadBlockFromDisk`/`ReadRawBlockFromDisk` without having a lock on `cs_main` (which has been allowed since ccd8ef65f9).

  This PR mitigates this by scanning all pruned block files on startup after `LoadBlockIndexDB` and unlinking them again.

ACKs for top commit:
  achow101:
    ACK 3141eab9c6
  pablomartin4btc:
    re-ACK with added functional test 3141eab9c6.
  furszy:
    Code review ACK 3141eab9
  theStack:
    Code-review ACK 3141eab9c6

Tree-SHA512: 6c73bc57838ad1b7e5d441af3c4d6bf4c61c4382e2b86485e57fbb74a61240710c0ceeceb8b4834e610ecfa3175c6955c81ea4b2285fee11ca6383f472979d8d
2023-02-28 09:54:10 -05:00
MarcoFalke
eeee61065f
Use AutoFile and HashVerifier where possible 2023-01-03 12:55:29 +01:00
Hennadii Stepanov
306ccd4927
scripted-diff: Bump copyright headers
-BEGIN VERIFY SCRIPT-
./contrib/devtools/copyright_header.py update ./
-END VERIFY SCRIPT-

Commits of previous years:
- 2021: f47dda2c58
- 2020: fa0074e2d8
- 2019: aaaaad6ac9
2022-12-24 23:49:50 +00:00
Andrew Toth
77557dda4a prune: scan and unlink already pruned block files on startup 2022-12-20 12:25:36 -05:00
Andrew Chow
6912a28f08
Merge bitcoin/bitcoin#25667: assumeutxo: snapshot initialization
bf95976061 doc: add note about snapshot chainstate init (James O'Beirne)
e4d7995286 test: add testcases for snapshot initialization (James O'Beirne)
cced4e7336 test: move-only-ish: factor out LoadVerifyActivateChainstate() (James O'Beirne)
51fc9241c0 test: allow on-disk coins and block tree dbs in tests (James O'Beirne)
3c361391b8 test: add reset_chainstate parameter for snapshot unittests (James O'Beirne)
00b357c215 validation: add ResetChainstates() (James O'Beirne)
3a29dfbfb2 move-only: test: make snapshot chainstate setup reusable (James O'Beirne)
8153bd9247 blockmanager: avoid undefined behavior during FlushBlockFile (James O'Beirne)
ad67ff377c validation: remove snapshot datadirs upon validation failure (James O'Beirne)
34d1590331 add utilities for deleting on-disk leveldb data (James O'Beirne)
252abd1e8b init: add utxo snapshot detection (James O'Beirne)
f9f1735f13 validation: rename snapshot chainstate dir (James O'Beirne)
d14bebf100 db: add StoragePath to CDBWrapper/CCoinsViewDB (James O'Beirne)

Pull request description:

  This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: https://github.com/bitcoin/bitcoin/pull/15606)

  ---

  Half of the replacement for #24232. The original PR grew larger than expected throughout the review process.

  This change adds the ability to initialize a snapshot-based chainstate during init if one is detected on disk. This is of course unused as of now (aside from in unittests) given that we haven't yet enabled actually loading snapshots.

  Don't be scared! There are some big move-only commits in here.

  Accompanying changes include:

  - moving the snapshot coinsdb directory from being called `chainstate_[base blockhash]` to `chainstate_snapshot`, since we only support one snapshot in use at a time. This simplifies some logic, but it necessitates writing that base blockhash out to a file within the coinsdb dir. See [discussion here](https://github.com/bitcoin/bitcoin/pull/24232#discussion_r832762880).
  - adding a simple fix in `FlushBlockFile()` that avoids a crash when attemping to flush to disk before `LoadBlockIndexDB()` is called, which happens when calling `MaybeRebalanceCaches()` during multiple chainstate init.
  - improving the unittest to allow testing with on-disk chainstates - necessary to test a simulated restart and re-initialization.

ACKs for top commit:
  naumenkogs:
    utACK bf95976061
  ariard:
    Code Review ACK bf9597606
  ryanofsky:
    Code review ACK bf95976061. Changes since last review: rebasing, switching from CAutoFile to AutoFile, adding comments, switching from BOOST_CHECK to Assert in test util, using chainman.GetMutex() in tests, destroying one ChainstateManager before creating a new one in tests
  fjahr:
    utACK bf95976061
  aureleoules:
    ACK bf95976061

Tree-SHA512: 15ae75caf19f8d12a12d2647c52897904d27b265a7af6b4ae7b858592eeadb8f9da6c2394b6baebec90adc28742c053e3eb506119577dae7c1e722ebb3b7bcc0
2022-10-13 10:19:27 -04:00
glozow
cc12b8947b
Merge bitcoin/bitcoin#24858: incorrect blk file size calculation during reindex results in recoverable blk file corruption
bcb0cacac2 reindex, log, test: fixes #21379 (mruddy)

Pull request description:

  Fixes #21379.

  The blocks/blk?????.dat files are mutated and become increasingly malformed, or corrupt, as a result of running the re-indexing process.
  The mutations occur after the re-indexing process has finished, as new blocks are appended, but are a result of a re-indexing process miscalculation that lingers in the block manager's `m_blockfile_info` `nSize` data until node restart.
  These additions to the blk files are non-fatal, but also not desirable.
  That is, this is a form of data corruption that the reading code is lenient enough to process (it skips the extra bytes), but it adds some scary looking log messages as it encounters them.

  The summary of the problem is that the re-index process double counts the size of the serialization header (magic message start bytes [4 bytes] + length [4 bytes] = 8 bytes) while calculating the blk data file size (both values already account for the serialization header's size, hence why it is over accounted).

  This bug manifests itself in a few different ways, after re-indexing, when a new block from a peer is processed:
  1. If the new block will not fit into the last blk file processed while re-indexing, while remaining under the 128MiB limit, then the blk file is flushed to disk and truncated to a size that is 8 greater than it should be. The truncation adds zero bytes (see `FlatFileSeq::Flush` and `TruncateFile`).
  1. If the last blk file processed while re-indexing has logical space for the new block under the 128 MiB limit:
      1. If the blk file was not already large enough to hold the new block, then the zeros are, in effect, added by `fseek` when the file is opened for writing. Eight zero bytes are added to the end of the last blk file just before the new block is written. This happens because the write offset is 8 too great due to the miscalculation. The result is 8 zero bytes between the end of the last block and the beginning of the next block's magic + length + block.
      1. If the blk file was already large enough to hold the new block, then the current existing file contents remain in the 8 byte gap between the end of the last block and the beginning of the next block's magic + length + block. Commonly, when this occcurs, it is due to the blk file containing blocks that are not connected to the block tree during reindex and are thus left behind by the reindex process and later overwritten when new blocks are added. The orphaned blocks can be valid blocks, but due to the nature of concurrent block download, the parent may not have been retrieved and written by the time the node was previously shutdown.

ACKs for top commit:
  LarryRuane:
    tested code-review ACK bcb0cacac2
  ryanofsky:
    Code review ACK bcb0cacac2. This is a disturbing bug with an easy fix which seems well-worth merging.
  mzumsande:
    ACK bcb0cacac2 (reviewed code and did some testing, I agree that it fixes the bug).
  w0xlt:
    tACK bcb0cacac2

Tree-SHA512: acc97927ea712916506772550451136b0f1e5404e92df24cc05e405bb09eb6fe7c3011af3dd34a7723c3db17fda657ae85fa314387e43833791e9169c0febe51
2022-10-12 14:13:54 -04:00
James O'Beirne
8153bd9247 blockmanager: avoid undefined behavior during FlushBlockFile
If we call FlushBlockFile() without having intitialized the block index
with LoadBlockIndexDB(), we may be indexing into an empty vector.

Specifically this is an issue when we call MaybeRebalanceCaches() during
chainstate init before the block index has been loaded, which calls
FlushBlockFile().

Also add an assert to avoid undefined behavior.

Co-authored-by: Russell Yanofsky <russ@yanofsky.org>
2022-09-13 13:30:25 -04:00