feat: introduce p2p networking stack#420
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the ✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
abailly
left a comment
There was a problem hiding this comment.
This looks surprisingly simple and I think this would fit into pure_stage framework, but we would need a bit of experimenting to see how that would work. Is pallas2 in a workable stage so that we can experiment integrating this code in a real environment?
also, can you point us at other example use of the framework to get some insights on how far it goes and how to handle the subtle cases you highlighted in comments?
| } | ||
|
|
||
| #[derive(Default)] | ||
| pub struct WorkUnit { |
There was a problem hiding this comment.
Instead of stitching together Options would it not make sense to have a composable enum, something like:
pub enum WorkUnit {
Network(InitiatorCommand),
Sync(ChainSyncEvent),
Block(BlockBodyData),
More(Box<WorkUnit>)
}
There was a problem hiding this comment.
The problem is that commands and data dispatch are orthogonal to each other. In any given tick you might want to send a network command AND dispatch data. Eg: after a header, you will want to dispatch the header and continue with the sync.
There was a problem hiding this comment.
I now see that you have a continuation in your enum example, that could work to concatenate both actions.
Personally, I find the other struct more straight forward, but feel free to change it if you opt for the other one.
| #[derive(Stage)] | ||
| // TODO: revisit stage name. I would prefer something like "network", "p2p", etc. I'll leave the | ||
| // decision for maintainers since it might affect tracing. | ||
| #[stage(name = "stage.chain_sync_client", unit = "WorkUnit", worker = "Worker")] |
There was a problem hiding this comment.
stage.network.upstream would seems to be a good fit
| } | ||
|
|
||
| impl Worker { | ||
| fn handle_network(&self, event: Option<InitiatorEvent>) -> WorkSchedule<WorkUnit> { |
There was a problem hiding this comment.
passing an Option<> here feels a bit weird, why not avoid calling the function altogether if we don't have an event to handle?
There was a problem hiding this comment.
Yeah, I kind of agree. I did this to improve readability on the select! branches and emphasize the fact that returning idle for an empty event is a business logic decision (as opposed to glue code between functions).
One could make use of empty network ticks to provide metrics about the absence of network activity (eg: a counter of idle event).
Having said that, feel free to change it if the alternative feels more suitable.
| // continuation of the mini-protocol is unambiguous. If the fetch request involved more | ||
| // blocks, the network machinery will keep receiving and notifying us of new | ||
| // block bodies. | ||
| InitiatorEvent::BlockBodyReceived(pid, body) => { |
There was a problem hiding this comment.
What's pid here? the peerId? This would seem useful to correlate with the request
There was a problem hiding this comment.
Or maybe not if don't care where the block body came from.
There was a problem hiding this comment.
pid is short for peer_id. Saved me thousands of keystroke when building the network crate :)
Almost all network events have a pid tag. Consensus will certainly need pid on header events, but probably not for block fetch. It could be useful if you want to keep peer-level stats (eg: blocks received per peer, latency per peer, etc)
|
|
||
| fn handle_request(&self, request: &DownstreamRequest) -> WorkSchedule<WorkUnit> { | ||
| match request { | ||
| DownstreamRequest::FetchBlock(range) => { |
There was a problem hiding this comment.
How do we know which peer to send the request to? This seems important because there are some security implications in a peer not being able to deliver a block it's supposed to have (eg. they sent a header but fail to deliver block)
There was a problem hiding this comment.
The assumption here is that choosing a particular peer from the hot peers is a low-level detail that could be hidden from the consumer. If this design has implication on consensus security, we can certainly modify the api so that the specific peer is specified by the caller.
|
The Not battle tested, not the best test coverage, but ready for beta. Although dev is still ongoing, my plan is to deploy it on some low-risk environments to start gaining some real-usage feedback. This example contains a working version of a naive node doing chainsync + blockfetch (plus all of the low-level: peer discovery, handshake, keep-alive, etc). I don't have examples for the subtleties (back-pressure, config sweet spot, etc) that I mention in the comments; mainly because they depend on the architecture of the consumer, and we don't have many consumers ATM. |
|
Thanks a lot for the detailed answered @scarmuega, that's an already a great start! Another question: How does this all work on the server side, eg. to serve downstream peers? |
|
Thanks @scarmuega, this is a nice improvement that will make it much easier to integrate with different execution models (like pure-stage). While studying the code of pallas_network2 I found some aspects that I’d like to ask about. Initiator / ResponderCurrently only the initiator role is implemented, with only the initiator sides of the respective miniprotocols. What is your vision for how to offer the responder sides of the protocols and also the responder role of the whole endpoint? Reading the network spec §1.4.1 it seems that the initiator might also run the responder side of the miniprotocols, so would you have a separate ResponderBehavior or extend the InitiatorBehavior such that it serves both parts and should be renamed to N2NBehavior or similar? On this note: we’ll need to have the responder functionality ready before we can switch from pallas_network to pallas_network2. Flow controlIt seems to me that the flow control mechanism described in §2.1.2 and §2.1.3 has not been implemented yet, correct? In particular, the per-channel receive buffers might grow past fixed bounds unless I’m missing something. It seems difficult to add the intended behaviour of disconnecting the node upon TCP back pressure within the chosen implementation approach. Unbounded buffersThe use of FuturesUnordered to collect outstanding work does not prevent unbounded growth of the task queue managed inside. How do you intend to keep all work in progress bounded in size within the network stack? Bounding these queues would require for example that sending a block fetch request might result in a QueueOverflowError, which is currently not foreseen in the API. Peer selection for block fetchAs you stated, the block fetch operation shall be dispatched to the appropriate peers by the p2p network stack, thus simplifying the Amaru consensus codebase. Currently, it seems that a block fetch request will be sent to one node only, namely the one whose visitor happens to be called first during a housekeeping operation. We’ll need to make this more selective and also ask a configurable number of peers instead of one — typically we ask the peer that sent the corresponding header, plus a small number of nodes that have proven to respond quickly in the past. How would you foresee to implement such logic given the current internal design? Performance concernsThe design of the behaviour is such that it will dispatch each event to all sub-behaviours in sequence, while for housekeeping it will invoke all behaviours of all peer connections. This shouldn’t be an issue as long as the number of connections is small, but it may become an issue when managing 500 connections (which we think should be within the operational parameter range of Amaru). Currently, the housekeeping task only runs every 3sec by default (as per this PR), but I also saw that fetching a block is only triggered by the next housekeeping tick, which means that we’ll want to run housekeeping every couple of milliseconds or so — otherwise block fetch latency would impact the Amaru node’s timeliness constraints. What is your intended evolution of this aspect of the implementation? It seems to me that focusing event-driven updates only on those tasks that need polling will become necessary at a certain usage intensity of this network stack. |
|
@scarmuega Have you had a chance to look at @rkuhn's questions above? We really need to start ramping up our network stack sooner rather than later and would love to align on how pallas-network2 can align with Amaru needs. |
|
Closing this as the work on the networking has continued in the meantime and diverged from this initial setup. Thanks for starting the discussion Santi; |
This pull request introduces the new P2P network machinery from pallas. The new network stack lives in the
pallas-network2crate and is independent from the previous network code, meaning, they can co-exist side-by-side in the same crate.I won't go into details here of how the internals of the p2p machinery work, but I'll explain the specific integration strategy:
pull_p2p.rs) now represents an abstraction for pulling data from a whole network of peers instead of a single upstream peer like before. This stage owns the whole network interface and is responsible of mediating between downstream stages and the external peers. Please note that there are no background threads / stack introduced by the network stack, the "tick" of the stage is what progresses the state of the network (by callingnetwork.poll_next()).