bc9283c441 [test] Add functional test to test early key response behaviour in BIP 324 (stratospher)
ffe6a56d75 [test] Check whether v2 TestNode performs downgrading (stratospher)
ba737358a3 [test] Add functional tests to test v2 P2P behaviour (stratospher)
4115cf9956 [test] Ignore BIP324 decoy messages (stratospher)
8c054aa04d [test] Allow inbound and outbound connections supporting v2 P2P protocol (stratospher)
382894c3ac [test] Reconnect using v1 P2P when v2 P2P terminates due to magic byte mismatch (stratospher)
a94e350ac0 [test] Build v2 P2P messages (stratospher)
bb7bffed79 [test] Use lock for sending P2P messages in test framework (stratospher)
5b91fb14ab [test] Read v2 P2P messages (stratospher)
05bddb20f5 [test] Perform initial v2 handshake (stratospher)
a049d1bd08 [test] Introduce EncryptedP2PState object in P2PConnection (stratospher)
b89fa59e71 [test] Construct class to handle v2 P2P protocol functions (stratospher)
8d6c848a48 [test] Move MAGIC_BYTES to messages.py (stratospher)
595ad4b168 [test/crypto] Add ECDH (stratospher)
4487b80517 [rpc/net] Allow v2 p2p support in addconnection (stratospher)
Pull request description:
This PR introduces support for v2 P2P encryption(BIP 324) in the existing functional test framework and adds functional tests for the same.
### commits overview
1. introduces a new class `EncryptedP2PState` to store the keys, functions for performing the initial v2 handshake and encryption/decryption.
3. this class is used by `P2PConnection` in inbound/outbound connections to perform the initial v2 handshake before the v1 version handshake. Only after the initial v2 handshake is performed do application layer P2P messages(version, verack etc..) get exchanged. (in a v2 connection)
- `v2_state` is the object of class `EncryptedP2PState` in `P2PConnection` used to store its keys, session-id etc.
- a node [advertising](https://github.com/stratospher/blogosphere/blob/main/integration_test_bip324.md#advertising-to-support-v2-p2p) support for v2 P2P is different from a node actually [supporting v2 P2P](https://github.com/stratospher/blogosphere/blob/main/integration_test_bip324.md#supporting-v2-p2p) (differ when false advertisement of services occur)
- introduce a boolean variable `supports_v2_p2p` in `P2PConnection` to denote if it supports v2 P2P.
- introduce a boolean variable `advertises_v2_p2p` to denote whether `P2PConnection` which mimics peer behaviour advertises V2 P2P support. Default option is `False`.
- In the test framework, you can create Inbound and Outbound connections to `TestNode`
1. During **Inbound Connections**, `P2PConnection` is the initiator [`TestNode` <--------- `P2PConnection`]
- Case 1:
- if the `TestNode` advertises/signals v2 P2P support (means `self.nodes[i]` set up with `"-v2transport=1"`), different behaviour will be exhibited based on whether:
1. `P2PConnection` supports v2 P2P
2. `P2PConnection` does not support v2 P2P
- In a real world scenario, the initiator node would intrinsically know if they support v2 P2P based on whatever code they choose to run. However, in the test scenario where we mimic peer behaviour, we have no way of knowing if `P2PConnection` should support v2 P2P or not. So `supports_v2_p2p` boolean variable is used as an option to enable support for v2 P2P in `P2PConnection`.
- Since the `TestNode` advertises v2 P2P support (using "-v2transport=1"), our initiator `P2PConnection` would send:
1. (if the `P2PConnection` supports v2 P2P) ellswift + garbage bytes to initiate the connection
2. (if the `P2PConnection` does not support v2 P2P) version message to initiate the connection
- Case 2:
- if the `TestNode` doesn't signal v2 P2P support; `P2PConnection` being the initiator would send version message to initiate a connection.
2. During **Outbound Connections** [TestNode --------> P2PConnection]
- initiator `TestNode` would send:
- (if the `P2PConnection` advertises v2 P2P) ellswift + garbage bytes to initiate the connection
- (if the `P2PConnection` advertises v2 P2P) version message to initiate the connection
- Suppose `P2PConnection` advertises v2 P2P support when it actually doesn't support v2 P2P (false advertisement scenario)
- `TestNode` sends ellswift + garbage bytes
- `P2PConnection` receives but can't process it and disconnects.
- `TestNode` then tries using v1 P2P and sends version message
- `P2PConnection` receives/processes this successfully and they communicate on v1 P2P
4. the encrypted P2P messages follow a different format - 3 byte length + 1-13 byte message_type + payload + 16 byte MAC
5. includes support for testing decoy messages and v2 connection downgrade(using false advertisement - when a v2 node makes an outbound connection to a node which doesn't support v2 but is advertised as v2 by some malicious
intermediary)
### run the tests
* functional test - `test/functional/p2p_v2_encrypted.py` `test/functional/p2p_v2_earlykeyresponse.py`
I'm also super grateful to @ dhruv for his really valuable feedback on this branch.
Also written a more elaborate explanation here - https://github.com/stratospher/blogosphere/blob/main/integration_test_bip324.md
ACKs for top commit:
naumenkogs:
ACK bc9283c441
mzumsande:
Code Review ACK bc9283c441
theStack:
Code-review ACK bc9283c441
glozow:
ACK bc9283c441
Tree-SHA512: 9b54ed27e925e1775e0e0d35e959cdbf2a9a1aab7bcf5d027e66f8b59780bdd0458a7a4311ddc7dd67657a4a2a2cd5034ead75524420d58a83f642a8304c9811
This test-only RPC is required when a TestNode initiates
an outbound v2 p2p connection. Add a new arg `v2transport`
so that the node can attempt v2 connections.
This affects manual connections made either with -connect, or with
-addnode provided as a bitcoind config arg (the addnode RPC has an
extra option for v2).
We don't necessarily know if our peer supports v2, but will reconnect
with v1 if they don't. In order to do that, improve the reconnection
behavior such that we will reconnect after a sleep of 500ms
(which usually should be enough for our peer to send us their
version message).
3ea54e5db7 net: Add continuous ASMap health check logging (Fabian Jahr)
28d7e55dff test: Add tests for unfiltered GetAddr usage (Fabian Jahr)
b8843d37ae fuzz: Let fuzzers use filter options in GetAddr/GetAddresses (Fabian Jahr)
e16f420547 net: Optionally include terrible addresses in GetAddr results (Fabian Jahr)
Pull request description:
There are certain statistics we can collect by running all our known clearnet addresses against the ASMap file. This could show issues with a maliciously manipulated file or with an old file that has decayed with time.
This is just a proof of concept for now. My idea currently is to run the analysis once per day and print the results to logs if an ASMap file is used.
ACKs for top commit:
achow101:
ACK 3ea54e5db7
mzumsande:
ACK 3ea54e5db7
brunoerg:
crACK 3ea54e5db7
Tree-SHA512: 777acbfac43cc43ce4a0a3612434e4ddbc65f59ae8ffc9e24f21de09011bccb297f0599cbaa82bcf40ef68e5af582c4e98556379db7ceff7d9f97574a1cf8e09
fa79a881ce refactor: P2P transport without serialize version and type (MarcoFalke)
fa9b5f4fe3 refactor: NetMsg::Make() without nVersion (MarcoFalke)
66669da4a5 Remove unused Make() overload in netmessagemaker.h (MarcoFalke)
fa0ed07941 refactor: VectorWriter without nVersion (MarcoFalke)
Pull request description:
Now that the serialize framework ignores the serialize version and serialize type, everything related to it can be removed from the code.
This is the first step, removing dead code from the P2P stack. A different pull will remove it from the wallet and other parts.
ACKs for top commit:
ajtowns:
reACK fa79a881ce
Tree-SHA512: 785b413580d980f51f0d4f70ea5e0a99ce14cd12cb065393de2f5254891be94a14f4266110c8b87bd2dbc37467676655bce13bdb295ab139749fcd8b61bd5110
5e7cc4144b test: add unit test for CConnman::AddedNodesContain() (Jon Atack)
cc62716920 p2p: do not make automatic outbound connections to addnode peers (Jon Atack)
Pull request description:
to allocate our limited outbound slots correctly, and to ensure addnode
connections benefit from their intended protections.
Our addnode logic usually connects the addnode peers before the automatic
outbound logic does, but not always, as a connection race can occur. If an
addnode peer disconnects us and if it was the only one from its network, there
can be a race between reconnecting to it with the addnode thread, and it being
picked as automatic network-specific outbound peer. Or our internet connection
or router or the addnode peer could be temporarily offline, and then return
online during the automatic outbound thread. Or we could add a new manual peer
using the addnode RPC at that time.
The race can be more apparent when our node doesn't know many peers, or with
networks like cjdns that currently have few bitcoin peers.
When an addnode peer is connected as an automatic outbound peer and is the only
connection we have to a network, it can be protected by our new outbound
eviction logic and persist in the "wrong role".
Finally, there does not seem to be a reason to make block-relay or short-lived
feeler connections to addnode peers, as the addnode logic will ensure we connect
to them if they are up, within the addnode connection limit.
Fix these issues by checking if the address is an addnode peer in our automatic
outbound connection logic.
ACKs for top commit:
mzumsande:
Tested ACK 5e7cc4144b
brunoerg:
utACK 5e7cc4144b
vasild:
ACK 5e7cc4144b
guggero:
utACK 5e7cc4144b
Tree-SHA512: 2438c3ec92e98aebca2a0da960534e4655a9c6e1192a24a085fc01326d95cdb1b67d8c44e4ee706bc1d8af8564126d446a21b5579dcbec61bdea5fce2f0115ee
to allocate our limited outbound slots correctly, and to ensure addnode
connections benefit from their intended protections.
Our addnode logic usually connects the addnode peers before the automatic
outbound logic does, but not always, as a connection race can occur. If an
addnode peer disconnects us and if it was the only one from its network, there
can be a race between reconnecting to it with the addnode thread, and it being
picked as automatic network-specific outbound peer. Or our internet connection
or router, or the addnode peer, could be temporarily offline, and then return
online during the automatic outbound thread. Or we could add a new manual peer
using the addnode RPC at that time.
The race can be more apparent when our node doesn't know many peers, or with
networks like cjdns that currently have few bitcoin peers.
When an addnode peer is connected as an automatic outbound peer and is the only
connection we have to a network, it can be protected by our new outbound
eviction logic and persist in the "wrong role".
Examples on mainnet using logging added in the same pull request:
2023-08-12T14:51:05.681743Z [opencon] [net.cpp:1949] [ThreadOpenConnections]
[net:debug] Not making automatic network-specific outbound-full-relay connection
to i2p peer selected for manual (addnode) connection: [geh...odq.b32.i2p]:0
2023-08-13T03:59:28.050853Z [opencon] [net.cpp:1949] [ThreadOpenConnections]
[net:debug] Not making automatic block-relay-only connection to onion peer
selected for manual (addnode) connection: kpg...aid.onion:8333
2023-08-13T16:21:26.979052Z [opencon] [net.cpp:1949] [ThreadOpenConnections]
[net:debug] Not making automatic network-specific outbound-full-relay connection
to cjdns peer selected for manual (addnode) connection: [fcc...8ce]:8333
2023-08-14T20:43:53.401271Z [opencon] [net.cpp:1949] [ThreadOpenConnections]
[net:debug] Not making automatic network-specific outbound-full-relay connection
to cjdns peer selected for manual (addnode) connection: [fc7...59e]:8333
2023-08-15T00:10:01.894147Z [opencon] [net.cpp:1949] [ThreadOpenConnections]
[net:debug] Not making automatic feeler connection to i2p peer selected for
manual (addnode) connection: geh...odq.b32.i2p:8333
Finally, there does not seem to be a reason to make block-relay or short-lived
feeler connections to addnode peers, as the addnode logic will ensure we connect
to them if they are up, within the addnode connection limit.
Fix these issues by checking if the address is an addnode peer in our automatic
outbound connection logic.
0420f99f42 Create net_peer_connection unit tests (Jon Atack)
4b834f6499 Allow unit tests to access additional CConnman members (Jon Atack)
34b9ef443b net/rpc: Makes CConnman::GetAddedNodeInfo able to return only non-connected address on request (Sergi Delgado Segura)
94e8882d82 rpc: Prevents adding the same ip more than once when formatted differently (Sergi Delgado Segura)
2574b7e177 net/rpc: Check all resolved addresses in ConnectNode rather than just one (Sergi Delgado Segura)
Pull request description:
## Rationale
Currently, `addnode` has a couple of corner cases that allow it to either connect to the same peer more than once, hence wasting outbound connection slots, or add redundant information to `m_added_nodes`, hence making Bitcoin iterate through useless data on a regular basis.
### Connecting to the same node more than once
In general, connecting to the same node more than once is something we should try to prevent. Currently, this is possible via `addnode` in two different ways:
1. Calling `addnode` more than once in a short time period, using two equivalent but distinct addresses
2. Calling `addnode add` using an IP, and `addnode onetry` after with an address that resolved to the same IP
For the former, the issue boils down to `CConnman::ThreadOpenAddedConnections` calling `CConnman::GetAddedNodeInfo` once, and iterating over the result to open connections (`CConman::OpenNetworkConnection`) on the same loop for all addresses.`CConnman::ConnectNode` only checks a single address, at random, when resolving from a hostname, and uses it to check whether we are already connected to it.
An example to test this would be calling:
```
bitcoin-cli addnode "127.0.0.1:port" add
bitcoin-cli addnode "localhost:port" add
```
And check how it allows us to perform both connections some times, and some times it fails.
The latter boils down to the same issue, but takes advantage of `onetry` bypassing the `CConnman::ThreadOpenAddedConnections` logic and calling `CConnman::OpenNetworkConnection` straightaway. A way to test this would be:
```
bitcoin-cli addnode "127.0.0.1:port" add
bitcoin-cli addnode "localhost:port" onetry
```
### Adding the same peer with two different, yet equivalent, addresses
The current implementation of `addnode` is pretty naive when checking what data is added to `m_added_nodes`. Given the collection stores strings, the checks at `CConnman::AddNode()` basically check wether the exact provided string is already in the collection. If so, the data is rejected, otherwise, it is accepted. However, ips can be formatted in several ways that would bypass those checks.
Two examples would be `127.0.0.1` being equal to `127.1` and `[::1]` being equal to `[0:0:0:0:0:0:0:1]`. Adding any pair of these will be allowed by the rpc command, and both will be reported as connected by `getaddednodeinfo`, given they map to the same `CService`.
This is less severe than the previous issue, since even tough both nodes are reported as connected by `getaddednodeinfo`, there is only a single connection to them (as properly reported by `getpeerinfo`). However, this adds redundant data to `m_added_nodes`, which is undesirable.
### Parametrize `CConnman::GetAddedNodeInfo`
Finally, this PR also parametrizes `CConnman::GetAddedNodeInfo` so it returns either all added nodes info, or only info about the nodes we are **not** connected to. This method is used both for `rpc`, in `getaddednodeinfo`, in which we are reporting all data to the user, so the former applies, and to check what nodes we are not connected to, in `CConnman::ThreadOpenAddedConnections`, in which we are currently returning more data than needed and then actively filtering using `CService.fConnected()`
ACKs for top commit:
jonatack:
re-ACK 0420f99f42
kashifs:
> > tACK [0420f9](0420f99f42)
sr-gi:
> > > tACK [0420f9](0420f99f42)
mzumsande:
Tested ACK 0420f99f42
Tree-SHA512: a3a10e748c12d98d439dfb193c75bc8d9486717cda5f41560f5c0ace1baef523d001d5e7eabac9fa466a9159a30bb925cc1327c2d6c4efb89dcaf54e176d1752
df69b22f2e doc: improve documentation around connection limit maximums (Amiti Uttarwar)
adc171edf4 scripted-diff: Rename connection limit variables (Amiti Uttarwar)
e9fd9c0225 net: add m_max_inbound to connman (Amiti Uttarwar)
c25e0e0555 net, refactor: move calculations for connection type limits into connman (Amiti Uttarwar)
Pull request description:
This is joint work with amitiuttarwar.
This has the first few commits of #28463. It is not strictly a prerequisite for that, but has changes that in our opinion make sense on their own.
It improves the handling of maximum numbers for different connection types (that are set during init and don’t change after) by:
* moving all calculations into one place, `CConnMan::Init()`. Before, they were dispersed between `Init`, `CConnman::Init` and other parts of `CConnman`, resulting in some duplicated test code.
* removing the possibility of having a negative maximum of inbound connections, which is hard to argue about
* renaming of variables and doc improvements
ACKs for top commit:
amitiuttarwar:
co-author review ACK df69b22f2e
naumenkogs:
ACK df69b22f2e
achow101:
ACK df69b22f2e
Tree-SHA512: 913d56136bc1df739978de50db67302f88bac2a9d34748ae96763288d97093e998fc0f94f9b6eff12867712d7e86225af6128f4170bf2b5b8ab76f024870a22c
`CConnman::GetAddedNodeInfo` is used both to get a list of addresses to manually connect to
in `CConnman::ThreadOpenAddedConnections`, and to report about manually added connections in
`getaddednodeinfo`. In both cases, all addresses added to `m_added_nodes` are returned, however
the nodes we are already connected to are only relevant to the latter, in the former they are
actively discarded.
Parametrizes `CConnman::GetAddedNodeInfo` so we can ask for only addresses we are not connected to,
to avoid passing useless information around.
`vfLimited`, `IsReachable()`, `SetReachable()` need not be in the `net`
module. Move them to `netbase` because they will be needed in
`LookupSubNet()` to possibly flip the result to CJDNS (if that network
is reachable).
In the process, encapsulate them in a class.
`NET_UNROUTABLE` and `NET_INTERNAL` are no longer ignored when adding
or removing reachable networks. This was unnecessary.
A "version" message in the V1 protocol starts with a fixed 16 bytes:
* The 4-byte network magic
* The 12-byte zero-padded command "version" plus 5 0x00 bytes
The current code detects incoming V1 connections by just looking at the
first 12 bytes (matching an earlier version of BIP324), but 16 bytes is
more precise. This isn't an observable difference right now, as a 12 byte
prefix ought to be negligible already, but it may become observable with
future extensions to the protocol, so make the code match the
specification.
Extract the logic for calculating & maintaining inbound connection limits to be
a member within connman for consistency with other maximum connection limits.
Note that we now limit m_max_inbound to 0 and don't call
AttemptToEvictConnection() when we don't have any inbounds.
Previously, nMaxInbound could become negative if the user ran with a low
-maxconnections, which didn't break any logic but didn't make sense.
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
Currently the logic is fragmented between init and connman. Encapsulating this
logic within connman allows for less mental overhead and easier reuse in tests.
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
When an outbound v2 connection is disconnected without receiving anything, but at
least 24 bytes of our pubkey were sent out (enough to constitute an invalid v1
header), add them to a queue of reconnections to be tried.
The reconnections are in a queue rather than performed immediately, because we should
not block the socket handler thread with connection creation (a blocking operation
that can take multiple seconds).
See also https://github.com/bitcoin/bips/pull/1498
The benefit is a simpler implementation:
- The protocol state machine does not need separate states for garbage
authentication and version phases.
- The special case of "ignoring the ignore bit" is removed.
- The freedom to choose the contents of the garbage authentication
packet is removed. This simplifies testing.
Move functions requiring the netaddress.h include out of
libbitcoinkernel source files.
The netaddress.h file contains many non-consensus related definitions
and should thus not be part of the libbitcoinkernel. This commit makes
netaddress.h no longer a required include for users of the
libbitcoinkernel.
This commit is part of the libbitcoinkernel project, namely its stage 1
step 3: Decouple most non-consensus headers from libbitcoinkernel.
The protocol.h file contains many non-consensus related definitions and
should thus not be part of the libbitcoinkernel. This commit makes
protocol.h no longer a required include for users of the
libbitcoinkernel.
This commit is part of the libbitcoinkernel project, namely its stage 1
step 3: Decouple most non-consensus headers from libbitcoinkernel.
Co-Authored-By: Cory Fields <cory-nospam-@coryfields.com>
Before this commit the V2Transport::m_send_buffer is used to store the
garbage:
* During MAYBE_V1 state, it's there despite not being sent.
* During AWAITING_KEY state, while it is being sent.
* At the end of the AWAITING_KEY state it cannot be wiped as it's still
needed to compute the garbage authentication packet.
Change this by introducing a separate m_send_garbage field, taking over
the first and last role listed above. This means the garbage is only in
the send buffer when it's actually being sent, removing a few special
cases related to this.
This introduces a V2Transport with a basic subset of BIP324 functionality:
* no ability to send garbage (but receiving is supported)
* no ability to send decoy packets (but receiving them is supported)
* no support for short message id encoding (neither encoding or decoding)
* no waiting until 12 non-V1 bytes have been received
* (and thus) no detection of V1 connections on the responder side
(on the sender side, detecting V1 is not supported either, but that needs
to be dealt with at a higher layer, by reconnecting)
Before this commit, there are only two possibly outcomes for the "more" prediction
in Transport::GetBytesToSend():
* true: the transport itself has more to send, so the answer is certainly yes.
* false: the transport has nothing further to send, but if vSendMsg has more message(s)
left, that still will result in more wire bytes after the next
SetMessageToSend().
For the BIP324 v2 transport, there will arguably be a third state:
* definitely not: the transport has nothing further to send, but even if vSendMsg has
more messages left, they can't be sent (right now). This happens
before the handshake is complete.
To implement this, we move the entire decision logic to the Transport, by adding a
boolean to GetBytesToSend(), called have_next_message, which informs the transport
whether more messages are available. The return values are still true and false, but
they mean "definitely yes" and "definitely no", rather than "yes" and "maybe".
This furthers transport abstraction by removing the assumption that a message
can always immediately be converted to wire bytes. This assumption does not hold
for the v2 transport proposed by BIP324, as no messages can be sent before the
handshake completes.
This is done by only keeping (complete) CSerializedNetMsg objects in vSendMsg,
rather than the resulting bytes (for header and payload) that need to be sent.
In SocketSendData, these objects are handed to the transport as permitted by it,
and sending out the bytes the transport tells us to send. This also removes the
nSendOffset member variable in CNode, as keeping track of how much has been sent
is now a responsability of the transport.
This is not a pure refactor, and has the following effects even for the current
v1 transport:
* Checksum calculation now happens in SocketSendData rather than PushMessage.
For non-optimistic-send messages, that means this computation now happens in
the network thread rather than the message handler thread (generally a good
thing, as the message handler thread is more of a computational bottleneck).
* Checksum calculation now happens while holding the cs_vSend lock. This is
technically unnecessary for the v1 transport, as messages are encoded
independent from one another, but is untenable for the v2 transport anyway.
* Statistics updates about per-message sent bytes now happen when those bytes
are actually handed to the OS, rather than at PushMessage time.
This more accurately captures the intent of limiting send buffer size, as
many small messages can have a larger overhead that is not counted with the
current approach.
It also means removing the dependency on the header size (which will become
a function of the transport choice) from the send buffer calculations.
The rest of net.cpp already uses Params() to determine chainparams in many
places (and even V1Transport itself does so in some places).
Since the only chainparams dependency is through the message start characters,
just store those directly in the transport.
This makes the sending side of P2P transports mirror the receiver side: caller provides
message (consisting of type and payload) to be sent, and then asks what bytes must be
sent. Once the message has been fully sent, a new message can be provided.
This removes the assumption that P2P serialization of messages follows a strict structure
of header (a function of type and payload), followed by (unmodified) payload, and instead
lets transports decide the structure themselves.
It also removes the assumption that a message must always be sent at once, or that no
bytes are even sent on the wire when there is no message. This opens the door for
supporting traffic shaping mechanisms in the future.
Now that the Transport class deals with both the sending and receiving side
of things, make the receive side have function names that clearly indicate
they're about receiving.
* Transport::Read() -> Transport::ReceivedBytes()
* Transport::Complete() -> Transport::ReceivedMessageComplete()
* Transport::GetMessage() -> Transport::GetReceivedMessage()
* Transport::SetVersion() -> Transport::SetReceiveVersion()
Further, also update the comments on these functions to (among others) remove
the "deserialization" terminology. That term is better reserved for just the
serialization/deserialization between objects and bytes (see serialize.h), and
not the conversion from/to wire bytes as performed by the Transport.