be55f545d5 move-only: Extract common/args and common/config.cpp from util/system (TheCharlatan)
Pull request description:
This pull request is part of the `libbitcoinkernel` project https://github.com/bitcoin/bitcoin/issues/24303https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". It is part of a series of patches splitting up the `util/system` files. Its preceding pull request is https://github.com/bitcoin/bitcoin/pull/27254.
The pull request contains an extraction of ArgsManager related functions from util/system into their own common/ file.
The background of this commit is an ongoing effort to decouple the libbitcoinkernel library from the ArgsManager. The ArgsManager belongs into the common library, since the kernel library should not depend on it. See [doc/design/libraries.md](https://github.com/bitcoin/bitcoin/blob/master/doc/design/libraries.md) for more information on this rationale.
ACKs for top commit:
MarcoFalke:
re-ACK be55f545d5🚲
ryanofsky:
Code review ACK be55f545d5. Just small cleanups since the last review.
hebasto:
ACK be55f545d5, I have reviewed the code and it looks OK, I agree it can be merged.
Tree-SHA512: 90eb03334af0155b823030b4f2ecf286d35058d700ee2ddbbaa445be19e31eb0fe982656f35bd14ecee3ad2c3d0db3746855cb8f3777eff7253713e42873e111
0076bed45e logging: log ASN when using `-asmap` (brunoerg)
9836c76ae0 net: add `GetMappedAS` in `CConnman` (brunoerg)
Pull request description:
When using `-asmap`, you can check the ASN assigned to the peers only with the RPC command `getpeerinfo` (check `mapped_as` field), however, it's not possible to check it in logs (e.g. see in logs the ASN of the peers when a new outbound peer has been connected). This PR includes the peers' ASN in debug output when using `-asmap`.
Obs: Open this primarily to chase some Concept ACK, I've been using this on my node to facilitate to track the peers' ASN especially when reading the logs.
ACKs for top commit:
Sjors:
tACK 0076bed45e
jamesob:
ACK 0076bed45e ([`jamesob/ackr/27412.1.brunoerg.logging_net_add_asn_from`](https://github.com/jamesob/bitcoin/tree/ackr/27412.1.brunoerg.logging_net_add_asn_from))
achow101:
ACK 0076bed45e
Tree-SHA512: c19cd11e8ab49962021f390459aadf6d33d221ae9a2c3df331a25d6865a8df470e2c8828f6e5219b8a887d6ab5b3450d34be9e26c00cca4d223b4ca64d51111b
This is an extraction of ArgsManager related functions from util/system
into their own common file.
Config file related functions are moved to common/config.cpp.
The background of this commit is an ongoing effort to decouple the
libbitcoinkernel library from the ArgsManager. The ArgsManager belongs
into the common library, since the kernel library should not depend on
it. See doc/design/libraries.md for more information on this rationale.
00e9b97f37 refactor: Move fs.* to util/fs.* (TheCharlatan)
106b46d9d2 Add missing fs.h includes (TheCharlatan)
b202b3dd63 Add missing cstddef include in assumptions.h (TheCharlatan)
18fb36367a refactor: Extract util/fs_helpers from util/system (Ben Woosley)
Pull request description:
This pull request is part of the `libbitcoinkernel` project https://github.com/bitcoin/bitcoin/issues/24303https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". This commit was originally authored by empact and is taken from its parent PR #25152.
#### Context
There is an ongoing effort to decouple the `ArgsManager` used for command line parsing user-provided arguments from the libbitcoinkernel library (https://github.com/bitcoin/bitcoin/pull/25290, https://github.com/bitcoin/bitcoin/pull/25487, https://github.com/bitcoin/bitcoin/pull/25527, https://github.com/bitcoin/bitcoin/pull/25862, https://github.com/bitcoin/bitcoin/pull/26177, and https://github.com/bitcoin/bitcoin/pull/27125). The `ArgsManager` is defined in `system.h`. A similar pull request extracting functionality from `system.h` has been merged in https://github.com/bitcoin/bitcoin/pull/27238.
#### Changes
Next to providing better code organization, this PR removes some reliance of the tree of libbitcoinkernel header includes on `system.h` (and thus the `ArgsManager` definition) by moving filesystem related functions out of the `system.*` files.
There is already a pair of `fs.h` / `fs.cpp` in the top-level `src/` directory. They were not combined with the files introduced here, to keep the patch cleaner and more importantly because they are often included without the utility functions. The new files are therefore named `fs_helpers` and the existing `fs` files are moved into the util directory.
Further commits splitting more functionality out of `system.h` are still in #25152 and will be submitted in separate PRs once this PR has been processed.
ACKs for top commit:
hebasto:
ACK 00e9b97f37
Tree-SHA512: 31422f148d14ba3c843b99b1550a6fd77c77f350905ca324f93d4f97b652246bc58fa9696c64d1201979cf88733e40be02d262739bb7d417cf22bf506fdb7666
The fs.* files are already part of the libbitcoin_util library. With the
introduction of the fs_helpers.* it makes sense to move fs.* into the
util/ directory as well.
Previously, we would make connections to peer from the netgroups to which
our MANUAL outbound connections belong.
However, they should be seen as regular connections from Addrman when it comes to netgroup diversity check, since the same rationale can be applied.
Note, this has nothing to do with how we connect to MANUAL connections:
we connect to them unconditionally.
error is a low-level function with a sole dependency on LogPrintf, which
is defined in logging.h
The background of this commit is an ongoing effort to decouple the
libbitcoinkernel library from the ArgsManager defined in system.h.
Moving the function out of system.h allows including it from a separate
source file without including the ArgsManager definitions from system.h.
3c1de032de i2p: use consistent number of tunnels with i2pd and Java I2P (Vasil Dimov)
801b405f85 i2p: lower the number of tunnels for transient sessions (Vasil Dimov)
b906b64eb7 i2p: reuse created I2P sessions if not used (Vasil Dimov)
Pull request description:
* Reuse an I2P transient session instead of discarding it if we failed to connect to the desired peer. This means we never used the generated address (destination), whose creation is not cheap. This does not mean that we will use the same address for more than one peer.
* Lower the number of tunnels for transient sessions.
* Explicitly specify the number of tunnels for persistent sessions instead of relying on the defaults which differ between I2P routers. This way we get consistent behavior with all routers.
Alleviates: https://github.com/bitcoin/bitcoin/issues/26754
(I have not tested this with i2pd, yet)
ACKs for top commit:
jonatack:
ACK 3c1de032de
mzumsande:
Light ACK 3c1de032de
Tree-SHA512: 477b4b9a5755e6a9a46bc0f7b268fa419dff4414e25445c750ae913f7552d9e2313f2aca4e3b70067b8390c2d0c2d68ec459f331765e939fc84139e454031cd4
2555a3950f p2p: ProcessAddrFetch(-seednode) is unnecessary if -connect is specified (Dhruv Mehta)
Pull request description:
If the user runs: `bitcoind -connect=X -seednode=Y`, I _think_ it is safe to ignore `-seednode`. A more populated `addrman` (via `getaddr` calls to peers in `-seednode`) is not useful in this configuration: `addrman` entries are used to initiate new outbound connections when slots are open, or to open feeler connections and keep `addrman` from getting stale. This is all done in a part of `ThreadOpenConnections` (below [this line](https://github.com/bitcoin/bitcoin/blob/master/src/net.cpp#L1803)) which is never executed when `-connect` is supplied. With `-connect`, `ThreadOpenConnections` will run [this loop](https://github.com/bitcoin/bitcoin/blob/master/src/net.cpp#L1785) and exit thread execution when interrupted.
Reviewers may also find it relevant that when `-connect` is used, we [soft disable](https://github.com/bitcoin/bitcoin/blob/master/src/init.cpp#L800) `-dnsseed` in init.cpp perhaps for the same reason i.e. seeding is not useful with `-connect`.
Running `ProcessAddrFetch` does not seem to have downside except developer confusion AFAICT. I was confused by this and felt it might affect other new bitcoiners too. If there is strong preference to not remove the line, I'd also be happy to just leave a comment there mentioning `ADDR_FETCH`/`-seednode` is irrelevant when used with `-connect`.
If this change is accepted, the node will still make `getaddr` calls to peers in `-connect` and expand `addrman`. However, disabling those `getaddr` calls would leak information about the node's configuration.
ACKs for top commit:
mzumsande:
Code Review ACK 2555a3950f
achow101:
ACK 2555a3950f
vasild:
ACK 2555a3950f
Tree-SHA512: 9187a0cff58db8edeca7e15379b1c121e7ebe8c38fb82f69e3dae8846ee94c92a329d79025e0f023c7579b2d86e7dbf756e4e30e90a72236bfcd2c00714180b3
c9d548c91f net: remove CService::ToStringPort() (Vasil Dimov)
fd4f0f41e9 gui: simplify OptionsDialog::updateDefaultProxyNets() (Vasil Dimov)
96c791dd20 net: remove CService::ToString() use ToStringAddrPort() instead (Vasil Dimov)
944a9de08a net: remove CNetAddr::ToString() and use ToStringAddr() instead (Vasil Dimov)
043b9de59a scripted-diff: rename ToStringIP[Port]() to ToStringAddr[Port]() (Vasil Dimov)
Pull request description:
Before this PR we had the somewhat confusing combination of methods:
`CNetAddr::ToStringIP()`
`CNetAddr::ToString()` (duplicate of the above)
`CService::ToStringIPPort()`
`CService::ToString()` (duplicate of the above, overrides a non-virtual method from `CNetAddr`)
`CService::ToStringPort()`
Avoid [overriding non-virtual methods](https://github.com/bitcoin/bitcoin/pull/25349/#issuecomment-1185226396).
"IP" stands for "Internet Protocol" and while sometimes "IP addresses" are called just "IPs", it is incorrect to call Tor or I2P addresses "IPs". Thus use "Addr" instead of "IP".
Change the above to:
`CNetAddr::ToStringAddr()`
`CService::ToStringAddrPort()`
The changes touch a lot of files, but are mostly mechanical.
ACKs for top commit:
sipa:
utACK c9d548c91f
achow101:
ACK c9d548c91f
jonatack:
re-ACK c9d548c91f only change since my previous reviews is rebase, but as a sanity check rebased to current master and at each commit quickly re-reviewed and re-verified clean build and green unit tests
LarryRuane:
ACK c9d548c91f
Tree-SHA512: 633fb044bdecf9f551b5e3314c385bf10e2b78e8027dc51ec324b66b018da35e5b01f3fbe6295bbc455ea1bcd1a3629de1918d28de510693afaf6a52693f2157
that was resolved in PR27036 "test: Remove last uses of snprintf and simplify"
and while here, fix up 2 words in docs to make the spelling linter green again.
691eaf8873 Pass MSG_MORE flag when sending non-final network messages (Matt Whitlock)
Pull request description:
**N.B.:** This is my second attempt at introducing this optimization. #12519 (2018) was closed in deference to switching to doing gathering socket writes using `sendmsg(2)`, which I agree would have superior performance due to fewer syscalls, but that work was apparently abandoned in late 2018. Ever since, Bitcoin Core has continued writing tons of runt packets to the wire. Can we proceed with my halfway solution for now?
----
Since Nagle's algorithm is disabled, each and every call to `send(2)` can potentially generate a separate TCP segment on the wire. This is especially inefficient when sending the tiny header preceding each message payload.
Linux implements a `MSG_MORE` flag that tells the kernel not to push the passed data immediately to the connected peer but rather to collect it in the socket's internal transmit buffer where it can be combined with data from successive calls to `send(2)`. Where available, specify this flag when calling `send(2)` in `CConnman::SocketSendData(CNode &)` if the data buffer being sent is not the last one in `node.vSendMsg`.
ACKs for top commit:
sipa:
ACK 691eaf8873
vasild:
ACK 691eaf8873
Tree-SHA512: 9a7f46bc12edbf78d488f05d1c46760110a24c95af74b627d2604fcd198fa3f511c5956bac36d0034e88c632d432f7d394147e667a11b027af0a30f70a546d70
dfc01ccd73 net: simplify the call to vProcessMsg.splice() (Vasil Dimov)
Pull request description:
At the time when
```cpp
pnode->vProcessMsg.splice(pnode->vProcessMsg.end(), pnode->vRecvMsg, pnode->vRecvMsg.begin(), it);
```
is called, `it` is certainly `pnode->vRecvMsg.end()` which makes the call equivalent to:
```cpp
pnode->vProcessMsg.splice(pnode->vProcessMsg.end(), pnode->vRecvMsg, pnode->vRecvMsg.begin(), pnode->vRecvMsg.end());
```
which is equivalent to:
```cpp
pnode->vProcessMsg.splice(pnode->vProcessMsg.end(), pnode->vRecvMsg);
```
Thus, use the latter. Further, maybe irrelevant, but the latter has constant complexity while the original code is `O(length of vRecvMsg)`.
ACKs for top commit:
theStack:
Code-review ACK dfc01ccd73
MarcoFalke:
review ACK dfc01ccd73🐑
jonatack:
Light review ACK dfc01ccd73
Tree-SHA512: 9f4eb61d1caf4af9a61ba2f54b915fcfe406db62c58ab1ec42f736505b6792e9379a83d0458d6cc04f289edcec070b7c962f94a920ab51701c3cab103152866f
At the time when
```cpp
pnode->vProcessMsg.splice(pnode->vProcessMsg.end(), pnode->vRecvMsg, pnode->vRecvMsg.begin(), it);
```
is called, `it` is certainly `pnode->vRecvMsg.end()` which makes the
call equivalent to:
```cpp
pnode->vProcessMsg.splice(pnode->vProcessMsg.end(), pnode->vRecvMsg, pnode->vRecvMsg.begin(), pnode->vRecvMsg.end());
```
which is equivalent to:
```cpp
pnode->vProcessMsg.splice(pnode->vProcessMsg.end(), pnode->vRecvMsg);
```
Thus, use the latter. Further, maybe irrelevant, but the latter has
constant complexity while the original code is `O(length of vRecvMsg)`.
The functionality of the old size() is covered by the new Size()
when no arguments are specified, so this does not change behavior.
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
Previously, we'd only load fixed seeds if we'd not
know any addresses at all. This change makes it possible
to change -onlynet abruptly, e.g. from -onlynet=onion to
-onlynet=i2p and still find peers.
In the case of `i2pacceptincoming=0` we use transient addresses
(destinations) for ourselves for each outbound connection. It may
happen that we
* create the session (and thus our address/destination too)
* fail to connect to the particular peer (e.g. if they are offline)
* dispose the unused session.
This puts unnecessary load on the I2P network because session creation
is not cheap. Is exaggerated if `onlynet=i2p` is used in which case we
will be trying to connect to I2P peers more often.
To help with this, save the created but unused sessions and pick them
later instead of creating new ones.
Alleviates: https://github.com/bitcoin/bitcoin/issues/26754
Since Nagle's algorithm is disabled, each and every call to send(2) can potentially generate a separate TCP segment on the wire. This is especially inefficient when sending the tiny header preceding each message payload.
Linux implements a MSG_MORE flag that tells the kernel not to push the passed data immediately to the connected peer but rather to collect it in the socket's internal transmit buffer where it can be combined with data from successive calls to send(2). Where available, specify this flag when calling send(2) in CConnman::SocketSendData(CNode &) if the data buffer being sent is not the last one in node.vSendMsg.
Both methods do the same thing, so simplify to having just one.
`ToString()` is too generic in this case and it is unclear what it does,
given that there are similar methods:
`ToStringAddr()` (inherited from `CNetAddr`),
`ToStringPort()` and
`ToStringAddrPort()`.
Both methods do the same thing, so simplify to having just one.
Further, `CService` inherits `CNetAddr` and `CService::ToString()`
overrides `CNetAddr::ToString()` but the latter is not virtual which
may be confusing. Avoid such a confusion by not having non-virtual
methods with the same names in inheritance.
"IP" stands for "Internet Protocol".
"IP address" is sometimes shortened to just "IP" or "address".
However, Tor or I2P addresses are not "IP addresses", nor "IPs".
Thus, use "Addr" instead of "IP" for addresses that could be IP, Tor or
I2P addresses:
`CService::ToStringIPPort()` -> `CService::ToStringAddrPort()`
`CNetAddr::ToStringIP()` -> `CNetAddr::ToStringAddr()`
-BEGIN VERIFY SCRIPT-
sed -i 's/ToStringIPPort/ToStringAddrPort/g' -- $(git grep -l ToStringIPPort src)
sed -i 's/ToStringIP/ToStringAddr/g' -- $(git grep -l ToStringIP src)
-END VERIFY SCRIPT-
b527b54950 net: convert standalone SetSocketNonBlocking() to Sock::SetNonBlocking() (Vasil Dimov)
29f66f7682 moveonly: move SetSocketNonBlocking() from netbase to util/sock (Vasil Dimov)
b4bac55679 net: convert standalone IsSelectableSocket() to Sock::IsSelectable() (Vasil Dimov)
5db7d2ca0a moveonly: move IsSelectableSocket() from compat.h to sock.{h,cpp} (Vasil Dimov)
Pull request description:
_This is a piece of #21878, chopped off to ease review._
* convert standalone `IsSelectableSocket()` to `Sock::IsSelectable()`
* convert standalone `SetSocketNonBlocking()` to `Sock::SetNonBlocking()`
This further encapsulates syscalls inside the `Sock` class and makes the callers mockable.
ACKs for top commit:
jonatack:
ACK b527b54950 review/debug build/unit tests at each commit, cross-referenced the changes with `man select` and `man errno`, ran a signet node on the last commit with ip4/ip6//tor/i2p/cjdns and network connections were nominal
dergoegge:
Code review ACK b527b54950
Tree-SHA512: af783ce558c7a89e173f7ab323fb3517103d765c19b5d14de29f64706b4e1fea3653492e8ea73ae972699986aaddf2ae72c7cfaa7dad7614254283083b7d2632
9cbfe40d8a net: remove useless call to IsReachable() from CConnman::Bind() (Vasil Dimov)
Pull request description:
`CConnman::Bind()` is called without `BF_EXPLICIT` only when passed
either `0.0.0.0` or `::`. For those addresses `IsReachable()` is always
true (regardless of the `-onlynet=` setting!), meaning that the `if`
condition never evaluates to true.
`IsReachable()` is always true for the "any" IPv4 and IPv6 addresses
because `CNetAddr::GetNetwork()` returns `NET_UNROUTABLE` instead of
`NET_IPV4` or `NET_IPV6` and the network `NET_UNROUTABLE` is always
considered reachable.
It follows that `BF_EXPLICIT` is unnecessary, remove it too.
ACKs for top commit:
naumenkogs:
ACK 9cbfe40d8a
aureleoules:
ACK 9cbfe40d8a
mzumsande:
ACK 9cbfe40d8a
Tree-SHA512: 4e53ee8a73ddd133fd4ff25635135b65e5c19d1fc56fe5c30337406560664616c0adff414dca47602948919f34c81073aae6bfc2871509f3912663d86750928e
d575a675cc net_processing: add thread safety annotation for m_highest_fast_announce (Anthony Towns)
0ae7987f68 net_processing: add thread safety annotations for PeerManagerImpl members accessed only via the msgproc thread (Anthony Towns)
a66a7ccb82 net_processing: add thread safety annotations for Peer members accessed only via the msgproc thread (Anthony Towns)
bf12abe454 net: drop cs_sendProcessing (Anthony Towns)
1e78f566d5 net: add NetEventsInterface::g_msgproc_mutex (Anthony Towns)
Pull request description:
There are many cases where we assume message processing is single-threaded in order for how we access node-related memory to be safe. Add an explicit mutex that we can use to document this, which allows the compiler to catch any cases where we try to access that memory from other threads and break that assumption.
ACKs for top commit:
MarcoFalke:
review ACK d575a675cc 📽
dergoegge:
Code review ACK d575a675cc
w0xlt:
ACK d575a675cc
vasild:
ACK d575a675cc modulo the missing runtime checks
Tree-SHA512: b886d1aa4adf318ae64e32ccaf3d508dbb79d6eed3f1fa9d8b2ed96f3c72a3d38cd0f12e05826c9832a2a1302988adfd2b43ea9691aa844f37d8f5c37ff20e05
fa521c9603 Use steady clock for all millis bench logging (MacroFake)
Pull request description:
Currently `GetTimeMillis` is used for bench logging in milliseconds integral precision. Replace it to use a steady clock that is type-safe and steady.
Microsecond or float precision can be done in a follow-up.
ACKs for top commit:
fanquake:
ACK fa521c9603 - started making the same change.
Tree-SHA512: 86a810e496fc663f815acb8771a6c770331593715cde85370226685bc50c13e8e987e3c5efd0b4e48b36ebd2372255357b709204bac750d41e94a9f7d9897fa6
There are many cases where we assume message processing is
single-threaded in order for how we access node-related memory to be
safe. Add an explicit mutex that we can use to document this, which allows
the compiler to catch any cases where we try to access that memory from
other threads and break that assumption.
385f5a4c3f p2p: Don't query DNS seeds when both IPv4 and IPv6 are unreachable (Martin Zumsande)
91f0a7fbb7 p2p: add only reachable addresses to addrman (Martin Zumsande)
Pull request description:
Currently, `-onlynet` does not work well in connection with initial peer discovery, because DNS seeds only resolve to IPv6 and IPv4 adresses:
With `-onlynet=i2p`, we would load clearnet addresses from DNS seeds into addrman, be content our addrman isn't empty so we don't try to query hardcoded seeds (although these exist for i2p!), and never attempt to make an automatic outbound connection.
With `-onlynet=onion` and `-proxy` set, we wouldn't load addresses via DNS, but will make AddrFetch connections (through a tor exit node) to a random clearnet peer the DNS seed resolves to (see https://github.com/bitcoin/bitcoin/issues/6808#issuecomment-147652505), thus breaching the `-onlynet` preference of the user - this has been reported in the two issues listed below.
This PR proposes two changes:
1.) Don't load addresses that are unreachable (so that we wouldn't connect to them) into addrman. This is already the case for addresses received via p2p addr messages, this PR implements the same for addresses received from DNS seeds and fixed seeds. This means that in the case of `-onlynet=onion`, we wouldn't load fixed seed IPv4 addresses into addrman, only the onion ones.
2.) Skip trying the DNS seeds if neither IPv4 nor IPv6 are reachable and move directly to adding the hardcoded seeds from networks we can connect to. This is done by soft-setting `-dnsseed` to 0 in this case, unless `-dnsseed=1` was explicitly specified, in which case we abort with an `InitError`.
Fixes #6808
Fixes #12344
ACKs for top commit:
naumenkogs:
utACK 385f5a4c3f
vasild:
ACK 385f5a4c3f
Tree-SHA512: 33a8c29faccb2d9b937b017dba4ef72c10e05e458ccf258f1aed3893bcc37c2e984ec8de998d2ecfa54282abbf44a132e97d98bbcc24a0dcf1871566016a9b91
This happens, for example, if the user specified -onlynet=onion or
-onlynet=i2p. DNS seeds only resolve to IPv4 / IPv6 addresses,
making their answers useless to us, since we don't want to make
connections to these.
If, within the DNS seed thread, we'd instead do fallback AddrFetch
connections to one of the clearnet addresses the DNS seed resolves to,
we might get usable addresses from other networks
if lucky, but would be violating our -onlynet user preference
in doing so.
Therefore, in this case it is better to rely on fixed seeds for networks we
want to connect to.
Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
We will not make outgoing connection to peers that are unreachable
(e.g. because of -onlynet configuration).
Therefore, it makes no sense to add them to addrman in the first place.
While this is already the case for addresses received via p2p addr
messages, this commit does the same for addresses received
from fixed seeds.
Dereferencing a unique_ptr is not necessarily thread safe. The reason
these are safe is because their values are set at construction and do
not change later; so mark them as const and set them via the initializer
list to guarantee that.
The (V1)TransportSerializer instance CNode::m_serializer is used from
multiple threads via PushMessage without protection by a mutex. This
is only thread safe because the class does not have any mutable state,
so document that by marking the methods and the object as "const".