In order to prevent memory DoS, we must ensure that we don't accept a new
header into memory until we've performed anti-DoS checks, such as verifying
that the header is part of a sufficiently high work chain. This commit adds a
new argument to AcceptBlockHeader() so that we can ensure that all call-sites
which might cause a new header to be accepted into memory have to grapple with
the question of whether the header is safe to accept, or needs further
validation.
This patch also fixes two places where low-difficulty-headers could have been
processed without such validation (processing an unrequested block from the
network, and processing a compact block).
Credit to Niklas Gögge for noticing this issue, and thanks to Sjors Provoost
for test code.
Avoid permanently storing headers from a peer, unless the headers are part of a
chain with sufficiently high work. This prevents memory attacks using low-work
headers.
Designed and co-authored with Pieter Wuille.
1dc03dda05 [doc] remove non-signaling mentions of BIP125 (glozow)
32024d40f0 scripted-diff: remove mention of BIP125 from non-signaling var names (glozow)
Pull request description:
We have pretty thorough documentation of our RBF policy in doc/policy/mempool-replacements.md. It enumerates each rule with several sentences of rationale. Also, each rule pretty much has its own function (3 and 4 share one), with extensive comments. The doc states explicitly that our rules are similar but differ from BIP125, and contains a record of historical changes to RBF policy.
We should not use "BIP125" as synonymous with our RBF policy because:
- Our RBF policy is different from what is specified in BIP125, for example:
- the BIP does not mention our rule about the replacement feerate being higher (our Rule 6)
- the BIP uses minimum relay feerate for Rule 4, while we have used incremental relay feerate since #9380
- the "inherited signaling" question (CVE-2021-31876). Call it discrepancy, ambiguous wording, doc misinterpretation, or implementation details, I would recommend users refer to doc/policy/mempool-replacements.md
- the signaling policy is configurable, see #25353
- Our RBF policy may change further
- We have already marked BIP125 as only "partially implemented" in docs/bips.md since 1fd49eb498
- See comments from people who are not me recently:
- https://github.com/bitcoin/bitcoin/pull/25038#discussion_r909507429
- https://github.com/bitcoin/bitcoin/pull/25575#issuecomment-1179519204
This PR removes all non-signaling mentions of BIP125 (if people feel strongly, we can remove all mentions of BIP125 period). It may be useful to refer to the concept of "tx opts in to RBF if it has at least one nSequence less than (0xffffffff - 1)" as "BIP125 signaling" because:
- It is succint.
- It has already been widely marketed as BIP125 opt-in signaling.
- Our API uses it when referring to signaling (e.g. getmempoolentry["bip125-replaceable"] and wallet error message "not BIP 125 replaceable"). Changing those is more invasive.
- If/when we have other ways to signal in the future, we can disambiguate them this way. See #25038 which proposes another way of signaling, and where I pulled these commits from.
Alternatives:
- Changing our policy to match BIP125. This doesn't make sense as, for example, we would have to remove the requirement that a replacement tx has a higher feerate (Rule 6).
- Changing BIP125 to match what we have. This doesn't make sense as it would be a significant change to a BIP years after it was finalized and already used as a spec to implement RBF in other places.
- Document our policy as a new BIP and give it a number. This might make sense if we don't expect things to change a lot, and can be done as a next step.
ACKs for top commit:
darosior:
ACK 1dc03dda05
ariard:
ACK 1dc03dda
t-bast:
ACK 1dc03dda05
Tree-SHA512: a3adc2039ec5785892d230ec442e50f47f7062717392728152bbbe27ce1c564141f85253143f53cb44e1331cf47476d74f5d2f4b3cd873fc3433d7a0aa783e02
eeee5ada23 Make adjusted time type safe (MacroFake)
fa3be799fe Add time helpers (MacroFake)
Pull request description:
This makes follow-ups easier to review. Also, it makes sense by itself.
ACKs for top commit:
ryanofsky:
Code review ACK eeee5ada23. Confirmed type changes and equivalent code changes only.
Tree-SHA512: 51bf1ae5428552177286113babdd49e82459d6c710a07b6e80a0a045d373cf51045ee010461aba98e0151d8d71b9b3b5f8f73e302d46ba4558e0b55201f99e9f
This is a refactor, putting the burden to think about thread safety to
the caller. Otherwise, there is a risk that the caller will assume
thread safety where none exists, as is evident in the previous two
commits.
9376a6dae4 refactor: make active_chain_tip a reference (Aurèle Oulès)
Pull request description:
This PR fixes a TODO introduced in #21055.
Makes `active_chain_tip` argument in `CheckFinalTxAtTip` function a reference instead of a pointer.
ACKs for top commit:
dongcarl:
ACK 9376a6dae4
Tree-SHA512: c36d1769e0b9598b7f79334704b26b73e958d54caa3bd7e4eff954f3964fcf3f5e3a44a5a760497afad51b76e1614c86314fe035e4083c855e3574a620de7f4d
fa86710187 Clarify that CheckSequenceLocksAtTip is a validation function (MarcoFalke)
Pull request description:
It has been pointed out that a bug in this function can prevent block template creation. ( https://github.com/bitcoin/bitcoin/pull/24080#issuecomment-1065148776 ) So it seems that the scope of this function is more than "policy". Rename it back to "validation", to partially revert commit fa4e30b0f3.
ACKs for top commit:
ajtowns:
ACK fa86710187 - looks fine to me
glozow:
ACK fa86710187
Tree-SHA512: 2e0df8c70df4cbea857977f140a8616cfa7505e74df66c9c9fbcf184670ce3ce7567183c3f76e6f3fe8ca6de0e065b9babde6352d6cb495e71ea077ddedbc3f4
Our RBF policy is different from the rules specified in BIP125. For
example, the BIP does not mention Rule 6, and our Rule 4 uses the
(configurable) incremental relay feerate (distinct from the
minimum relay feerate). Those interested in our policy should refer to
doc/policy/mempool-replacements.md instead. These rules may also
continue to diverge with package RBF and other RBF improvements. Keep
references to the BIP125 signaling wrt sequence numbers, since that is
still correct and widely used. It is helpful to refer to this as "BIP125
signaling" since it is unambiguous and succint, especially if we have
multiple ways to signal replaceability in the future.
The rule numbers in doc/policy/mempool-replacements.md correspond
largely to those of BIP 125, so we can still refer to them like "Rule 5."
Also:
- Make DEFAULT_MAX_SIG_CACHE_SIZE into constexpr
DEFAULT_MAX_SIG_CACHE_BYTES to utilize the compile-time integer
arithmetic overflow checking available to constexpr.
- Fix comment (MiB instead of MB) for DEFAULT_MAX_SIG_CACHE_BYTES.
- Pass in max_size_bytes parameter to InitS*Cache(), modify log line to
no longer allude to maxsigcachesize being split evenly between the two
validation caches.
- Fix possible integer truncation and add a comment.
[META] I've kept the integer types as int64_t in order to not introduce
unintended behaviour changes, in the next commit we will make
them size_t.
Returning the approximate total size eliminates the need for
InitS*Cache() to do nElems*sizeof(uint256). The cuckoocache has a better
idea of this information.
dd065dae9f refactor: Make mapBlocksUnknownParent local, and rename it (Hennadii Stepanov)
Pull request description:
This PR is a second attempt at #19594. This PR has two motivations:
- Improve code hygiene by eliminating a global variable, `mapBlocksUnknownParent`
- Fix fuzz test OOM when running too long ([see #19594 comment](https://github.com/bitcoin/bitcoin/pull/19594#issuecomment-958801638))
A minor added advantage is to release `mapBlocksUnknownParent` memory when the reindexing phase is done. The current situation is somewhat similar to a memory leak because this map exists unused for the remaining lifetime of the process. It's true that this map should be empty of data elements after use, but its internal metadata (indexing structures, etc.) can have non-trivial size because there can be many thousands of simultaneous elements in this map.
This PR helps our efforts to reduce the use of global variables. This variable isn't just global, it's hidden inside a function (it looks like a local variable but has the `static` attribute).
This global variable exists because the `-reindex` processing code calls `LoadExternalBlockFile()` multiple times (once for each block file), but that function must preserve some state between calls (the `mapBlocksUnknownParent` map). This PR fixes this by allocating this map as a local variable in the caller's scope and passing it in on each call. When reindexing completes, the map goes out of scope and is deallocated.
I tested this manually by reindexing on mainnet and signet. Also, the existing `feature_reindex.py` functional test passes.
ACKs for top commit:
mzumsande:
re-ACK dd065dae9f
theStack:
re-ACK dd065dae9f
shaavan:
reACK dd065dae9f
Tree-SHA512: 9cd20e44d2fa1096dd405bc107bc065ea8f904f5b3f63080341b08d8cf57b790df565f58815c2f331377d044d5306708b4bf6bdfc5ef8d0ed85d8e97d744732c
facc2fa7b8 Use AutoFile where possible (MacroFake)
6666803c89 streams: Add AutoFile without ser-type and ser-version (MacroFake)
Pull request description:
This was done in the context of https://github.com/bitcoin/bitcoin/pull/25284 , but I think it also makes sense standalone.
The basic idea is that serialization type should not be initialized when it is not needed. Same for the serialization version.
So do this here for `AutoFile`. `CAutoFile` remains in places where it is not yet possible.
ACKs for top commit:
laanwj:
Code review ACK facc2fa7b8
fanquake:
ACK facc2fa7b8
Tree-SHA512: d82d024d55af57565ac53d9d1517afafc12b46964effba0332de62a6c77869356fa77f89e6d4834438fff44c45b64fccdf5a1358bfea03e28dfe55013b3c099d
cb3e9a1e3f Move {Load,Dump}Mempool to kernel namespace (Carl Dong)
aa30676541 Move DEFAULT_PERSIST_MEMPOOL out of libbitcoinkernel (Carl Dong)
06b88ffb8a LoadMempool: Pass in load_path, stop using gArgs (Carl Dong)
b857ac60d9 test/fuzz: Invoke LoadMempool via CChainState (Carl Dong)
b3267258b0 Move FopenFn to fsbridge namespace (Carl Dong)
ae1e8e3756 mempool: Use NodeClock+friends for LoadMempool (Carl Dong)
f9e8e5719f mempool: Improve comments for [GS]etLoadTried (Carl Dong)
813962da0b scripted-diff: Rename m_is_loaded -> m_load_tried (Carl Dong)
413f4bb52b DumpMempool: Pass in dump_path, stop using gArgs (Carl Dong)
bd4407817e DumpMempool: Use std::chrono instead of weird int64_t arthmetics (Carl Dong)
c84390b741 test/mempool_persist: Test manual savemempool when -persistmempool=0 (Carl Dong)
Pull request description:
This is part of the `libbitcoinkernel` project: #24303, https://github.com/bitcoin/bitcoin/projects/18
-----
This PR moves `{Dump,Load}Mempool` into its own `kernel/mempool_persist` module and introduces `ArgsManager` `node::` helpers in `node/mempool_persist_args`to remove the scattered calls to `GetBoolArg("-persistmempool", DEFAULT_PERSIST_MEMPOOL)`.
More context can be gleaned from the commit messages.
-----
One thing I was reflecting on as I wrote this was that in the long run, I think we should probably invert the validation <-> mempool relationship. Instead of mempool not depending on validation, it might make more sense to have validation not depend on mempool. Not super urgent since `libbitcoinkernel` will include both validation and mempool, but perhaps something for the future.
ACKs for top commit:
glozow:
re ACK cb3e9a1e3f via `git range-diff 7ae032e...cb3e9a1`
MarcoFalke:
ACK cb3e9a1e3f🔒
ryanofsky:
Code review ACK cb3e9a1e3f
Tree-SHA512: 979d7237c3abb5a1dd9b5ad3dbf3b954f906a6d8320ed7b923557f41a4472deccae3e8a6bca0018c8e7a3c4a93afecc502acd1e26756f2054f157f1c0edd939d
Also:
1. Have CChainState::LoadMempool and ::ThreadImport take in paths and
pass it through untouched to LoadMempool.
2. Make LoadMempool exit early if the load_path is empty.
3. Adjust the call to ::ThreadImport in ::AppInitMain to correctly pass
in an empty path if mempool persistence is disabled.
Not only does this increase coverage, it is also more correct in that
when ::LoadMempool is called with a mempool and chainstate, it calls
AcceptToMemoryPool with just the chainstate.
AcceptToMemoryPool will then act on the chainstate's mempool via
CChainState::GetMempool, which may be different from the mempool
originally passed to ::LoadMempool. (In this fuzz test's case, it
definitely is different)
Also, move DummyChainstate to its own file since it's now used by the
validation_load_mempool fuzz test to replace CChainState's m_mempool.
Also introduce node::{ShouldPersistMempool,MempoolPath} helper functions
in node/mempool_persist_args.{h,cpp} which are used by non-kernel
DumpMempool callers to determine whether or not to automatically dump
the mempool and where to dump it to.
- Store the mempool expiry (-mempoolexpiry) in CTxMemPool as a
std::chrono::seconds member.
- Remove the requirement to explicitly specify a mempool expiry for
LimitMempoolSize(...), just use the newly-introduced member.
- Remove all now-unnecessary instances of:
std::chrono::hours{gArgs.GetIntArg("-mempoolexpiry", DEFAULT_MEMPOOL_EXPIRY)}
ce893c0497 doc: Update developer notes (Anthony Towns)
d2852917ee sync.h: Imply negative assertions when calling LOCK (Anthony Towns)
bba87c0553 scripted-diff: Convert global Mutexes to GlobalMutexes (Anthony Towns)
a559509a0b sync.h: Add GlobalMutex type (Anthony Towns)
be6aa72f9f qt/clientmodel: thread safety annotation for m_cached_tip_mutex (Anthony Towns)
f24bd45b37 net_processing: thread safety annotation for m_tx_relay_mutex (Anthony Towns)
Pull request description:
This changes `LOCK(mutex)` for non-global, non-recursive mutexes to be annotated with the negative capability for the mutex it refers to, to prevent . clang applies negative capabilities recursively, so this helps avoid forgetting to annotate functions.
This can't reasonably be used for globals, because clang would require every function to be annotated with `EXCLUSIVE_LOCKS_REQUIRED(!g_mutex)` for each global mutex; so this introduces a trivial `GlobalMutex` subclass of `Mutex`, and reduces the annotations for both `GlobalMutex` to `LOCKS_EXCLUDED` which only catches trivial errors (eg (`LOCK(x); LOCK(x);`).
ACKs for top commit:
MarcoFalke:
review ACK ce893c0497🐦
hebasto:
ACK ce893c0497
Tree-SHA512: 5c35e8c7677ce3d994a7e3774f4344adad496223a51b3a1d1d3b5f20684b2e1d5cff688eb3fbc8d33e1b9940dfa76e515f9434e21de6f3ce3c935e29a319f529
We want m_chainparams to be alive for the duration of
ChainstateManager's lifetime since ChainstateManager's behaviour depends
on m_chainparams.
We could allow for a std::shared_ptr to be passed in as m_chainparams,
but that complicates things further. Given that CChainParams is not an
entity class or struct, we can just copy it and have ChainstateManager
own it.
This decouples validation.cpp from netaddress.cpp (transitively,
timedata.cpp, and asmap.cpp).
This is important for libbitcoinkernel as:
- There is no reason for the consensus engine to be coupled with
netaddress, timedata, and asmap
- Users of libbitcoinkernel can now easily supply their own
std::function that provides the adjusted time.
See the src/Makefile.am changes for some satisfying removals.
[META] Although it seems like we don't need it for just one option,
we're going to introduce another member to this struct *in the
next commit*. In future patchsets for libbitcoinkernel decoupling
it from ArgsManager, even more members will be added here.
f64aa9c411 Disallow more unsafe string->path conversions allowed by path append operators (Ryan Ofsky)
Pull request description:
Add more `fs::path` `operator/` and `operator+` overloads to prevent unsafe string->path conversions on Windows that would cause strings to be decoded according to the current Windows locale & code page instead of the correct string encoding.
Update application code to deal with loss of implicit string->path conversions by calling `fs::u8path` or `fs::PathFromString` explicitly, or by just changing variable types from `std::string` to `fs::path` to avoid conversions altogether, or make them happen earlier.
In all cases, there's no change in behavior either (1) because strings only contained ASCII characters and would be decoded the same regardless of what encoding was used, or (2) because of the 1:1 mapping between paths and strings using the `PathToString` and `PathFromString` functions.
Motivation for this PR was just that I was experimenting with #24469 and noticed that operations like `fs::path / std::string` were allowed, and I thought it would be better not to allow them.
ACKs for top commit:
hebasto:
ACK f64aa9c411
Tree-SHA512: 944cce49ed51537ee7a35ea4ea7f5feaf0c8fff2fa67ee81ec5adebfd3dcbaf41b73eb35e49973d5f852620367f13506fd12a7a9b5ae3a7a0007414d5c9df50f
In previous commits in this patchset, we've made sure that every
Unload/UnloadBlockIndex member function resets its own members, and does
not reach out to globals.
This means that their corresponding classes' default destructors can now
replace them, and do an even more thorough job without the need to be
updated for every new member variable.
Therefore, we can remove them, and also remove UnloadBlockIndex since
that's not used anymore.
Unfortunately, chainstatemanager_loadblockindex relies on
CChainState::UnloadBlockIndex, so that needs to stay for now.