Author: freewind
Bytom project repository:
GitHub address: https://github.com/Bytom/bytom
Gitee address: https://gitee.com/BytomBlockc ...
In the previous article, we saw that when BlockKeeper requests block data from another node, it sends that node a BlockRequestMessage carrying the desired height, and the binary data corresponding to that message is put into the sendQueue channel of the channel that ProtocolReactor corresponds to, waiting to be sent. The actual sending details were left out of that article because the logic is fairly involved, so they are covered here.
Since sendQueue is a Go channel, BlockKeeper has no idea who will eventually take the data out of it, or under what circumstances it will be sent. Searching the code, we find that only one type directly monitors sendQueue: the MConnection we met in the previous article. In its OnStart method, MConnection watches sendQueue, and when data shows up it takes the data out and moves it into something called sending.
Things are getting a little complicated:
- As we know from the previous article, one MConnection corresponds to a connection with one peer, and connections between Bytom nodes arise in several ways: this node actively connecting to other nodes, or other nodes actively connecting to this one
- After the data is placed into sending, we also need to know who monitors sending, and under what circumstances they take the data out of it
- After the data is taken out of sending, how is it actually sent to the other node?
As before, when faced with a complex problem, we first break it down into smaller questions following the "mutually exclusive, collectively exhaustive" principle, and then tackle them one by one.
So the first thing we need to figure out is:
Under what circumstances is an MConnection object created, and its OnStart method called?
(That tells us how the data in sendQueue ends up being monitored.)
After analysis, we found that MConnection is started in only one place: Peer's OnStart method. So the question becomes: under what circumstances is a Peer object created, and its OnStart method called?
After some more digging around, it turns out that in Bytom, Peer.OnStart ends up being called in the following four scenarios:
- When the Bytom node starts, it actively connects to the seed nodes specified in the configuration file, as well as to the nodes stored in addrbook.json in the local data directory
- When the Bytom node is listening on the local p2p port and another node connects to it
- When PEXReactor starts and uses its own protocol to communicate with the nodes on the current connections
- In Switch.Connect2Switches, a method that is never used (so it can be ignored)
We can ignore the fourth scenario entirely. In the third, PEXReactor's logic is fairly self-contained: it uses a BitTorrent-like file-sharing protocol to exchange data with other nodes, and we treat it as a supplementary feature and will not cover it here. That leaves only the first two scenarios to analyze.
When the Bytom node starts, how does it actively connect to other nodes and eventually call MConnection.OnStart?
First, let's quickly get to the SyncManager.Start method:
cmd/bytomd/main.go#L54

```go
func main() {
    cmd := cli.PrepareBaseCmd(commands.RootCmd, "TM", os.ExpandEnv(config.DefaultDataDir()))
    cmd.Execute()
}
```
cmd/bytomd/commands/run_node.go#L41

```go
func runNode(cmd *cobra.Command, args []string) error {
    n := node.NewNode(config)
    if _, err := n.Start(); err != nil {
        // ...
    }
    // ...
}
```
node/node.go#L169

```go
func (n *Node) OnStart() error {
    // ...
    n.syncManager.Start()
    // ...
}
```
netsync/handle.go#L141

```go
func (sm *SyncManager) Start() {
    go sm.netStart()
    // ...
}
```
Then we step into the netStart() method. This is where the node actively connects to other nodes:
```go
func (sm *SyncManager) netStart() error {
    // ...
    if sm.config.P2P.Seeds != "" {
        // dial out
        seeds := strings.Split(sm.config.P2P.Seeds, ",")
        if err := sm.DialSeeds(seeds); err != nil {
            return err
        }
    }
    return nil
}
```
Here sm.config.P2P.Seeds corresponds to the p2p.seeds entry in config.toml in the local data directory.
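As a quick illustration (the addresses below are made up, not real Bytom seeds), the seeds value is just a comma-separated list of host:port strings, split exactly the way netStart does before handing them to DialSeeds:

```go
package main

import (
    "fmt"
    "strings"
)

func main() {
    // Hypothetical p2p.seeds value: several "host:port" entries separated by commas.
    seeds := "192.168.0.10:46656,192.168.0.11:46656"

    // Mirrors the strings.Split(...) call in SyncManager.netStart.
    for _, addr := range strings.Split(seeds, ",") {
        fmt.Println("would dial seed:", addr)
    }
}
```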
Then sm.DialSeeds actively dials each seed:
netsync/handle.go#L229-L231

```go
func (sm *SyncManager) DialSeeds(seeds []string) error {
    return sm.sw.DialSeeds(sm.addrBook, seeds)
}
```
p2p/switch.go#L311-L340

```go
func (sw *Switch) DialSeeds(addrBook *AddrBook, seeds []string) error {
    // ...
    for i := 0; i < len(perm)/2; i++ {
        j := perm[i]
        sw.dialSeed(netAddrs[j])
    }
    // ...
}
```
p2p/switch.go#L342-L349

```go
func (sw *Switch) dialSeed(addr *NetAddress) {
    peer, err := sw.DialPeerWithAddress(addr, false)
    // ...
}
```
p2p/switch.go#L351-L392

```go
func (sw *Switch) DialPeerWithAddress(addr *NetAddress, persistent bool) (*Peer, error) {
    // ...
    peer, err := newOutboundPeerWithConfig(addr, sw.reactorsByCh, sw.chDescs, sw.StopPeerForError, sw.nodePrivKey, sw.peerConfig)
    // ...
    err = sw.AddPeer(peer)
    // ...
}
```
The peer is first created with newOutboundPeerWithConfig, and then added to sw (that is, the Switch object):
p2p/switch.go#L226-L275

```go
func (sw *Switch) AddPeer(peer *Peer) error {
    // ...
    // Start peer
    if sw.IsRunning() {
        if err := sw.startInitPeer(peer); err != nil {
            return err
        }
    }
    // ...
}
```
Inside sw.startInitPeer, peer.Start is called:
p2p/switch.go#L300-L308

```go
func (sw *Switch) startInitPeer(peer *Peer) error {
    peer.Start()
    // ...
}
```
And peer.Start corresponds to Peer.OnStart:
p2p/peer.go#L207-L211

```go
func (p *Peer) OnStart() error {
    p.BaseService.OnStart()
    _, err := p.mconn.Start()
    return err
}
```
As you can see, mconn.Start is called here, which is exactly what we were looking for. To summarize:
Node.Start -> SyncManager.Start -> SyncManager.netStart -> Switch.DialSeeds -> Switch.AddPeer -> Switch.startInitPeer -> Peer.OnStart -> MConnection.OnStart
That completes the analysis of the first case, actively connecting to other nodes. Now for the second case:
When another node connects to this node, how do we get to MConnection.OnStart?
When the Bytom node starts, it listens on the local p2p port, waiting for other nodes to connect. What does that process look like?
Since the Bytom node's startup process has already appeared many times in earlier articles, it is not repeated here. We start directly from Switch.OnStart (the Switch is started when SyncManager starts):
p2p/switch.go#L186-L185

```go
func (sw *Switch) OnStart() error {
    // ...
    for _, peer := range sw.peers.List() {
        sw.startInitPeer(peer)
    }

    // Start listeners
    for _, listener := range sw.listeners {
        go sw.listenerRoutine(listener)
    }
    // ...
}
```
After trimming, this method leaves two pieces of code: one is startInitPeer(...), the other is sw.listenerRoutine(listener).
If you have just read the previous section, you will notice that startInitPeer(...) ends up calling Peer.Start. What needs to be pointed out here, though, is that after my analysis I found this piece of code does not actually do anything: at this point in time sw.peers is always empty, since no other code has added peers to it yet. So I think it could be deleted to avoid misleading readers (I have raised an issue for this, see #902).
The second piece of code is listenerRoutine. If you still remember, it is used to listen on the local p2p port; it was explained in detail in the earlier article on how Bytom listens on the p2p port. Today we still need to dig into it and see how it gets to MConnection.OnStart:
p2p/switch.go#L498-L536

```go
func (sw *Switch) listenerRoutine(l Listener) {
    for {
        inConn, ok := <-l.Connections()
        // ...
        err := sw.addPeerWithConnectionAndConfig(inConn, sw.peerConfig)
        // ...
    }
}
```
Here l is the listener bound to the local p2p port. Through the for loop, each connection made by a remote node to that port is taken off l.Connections() and turned into a new peer (the code for that is shown right after the sketch below).
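As an aside, l.Connections() is simply a channel that an accept loop keeps feeding with inbound connections. Here is a minimal, self-contained sketch of that pattern; toyListener and startToyListener are invented names for illustration and are not the actual Bytom Listener implementation:

```go
package main

import (
    "fmt"
    "net"
)

// toyListener exposes accepted connections through a channel,
// which is the shape listenerRoutine consumes.
type toyListener struct {
    connections chan net.Conn
}

func startToyListener(addr string) (*toyListener, error) {
    ln, err := net.Listen("tcp", addr)
    if err != nil {
        return nil, err
    }
    l := &toyListener{connections: make(chan net.Conn, 10)}
    go func() {
        for {
            conn, err := ln.Accept()
            if err != nil {
                close(l.connections)
                return
            }
            l.connections <- conn // hand the inbound connection to whoever reads the channel
        }
    }()
    return l, nil
}

func (l *toyListener) Connections() <-chan net.Conn { return l.connections }

func main() {
    l, err := startToyListener("127.0.0.1:0") // port 0: let the OS pick a free port
    if err != nil {
        panic(err)
    }
    // Same shape as Switch.listenerRoutine: pull inbound connections off the channel.
    for conn := range l.Connections() {
        fmt.Println("new inbound connection from", conn.RemoteAddr())
        conn.Close()
    }
}
```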
```go
func (sw *Switch) addPeerWithConnectionAndConfig(conn net.Conn, config *PeerConfig) error {
    // ...
    peer, err := newInboundPeerWithConfig(conn, sw.reactorsByCh, sw.chDescs, sw.StopPeerForError, sw.nodePrivKey, config)
    // ...
    if err = sw.AddPeer(peer); err != nil {
        // ...
    }
    // ...
}
```
After the new peer is created, Switch's AddPeer method is called. From here on it is the same as in the previous case: AddPeer calls sw.startInitPeer(peer), which calls peer.Start(), which finally calls MConnection.OnStart(). Since the code is identical, it is not repeated here.
To summarize, it is:
Node.Start -> SyncManager.Start -> SyncManager.netStart -> Switch.OnStart -> Switch.listenerRoutine -> Switch.addPeerWithConnectionAndConfig -> Switch.AddPeer -> Switch.startInitPeer -> Peer.OnStart -> MConnection.OnStart
So, we're done with the second case.
But so far we have only solved the first of our small questions: we now know under which circumstances Bytom starts an MConnection, which then monitors the sendQueue channel and transfers the message data waiting there into sending.
So let's move on to the next small question:
After the data is put into sending, who comes to take it away?
After some analysis, it turns out that sendQueue and sending both belong to the Channel type (they are two of its fields), but they play different roles. sendQueue holds complete messages waiting to be sent, while sending sits at a lower level: the data it holds may be sent out in chunks. If there were only the single sendQueue, it would be hard to implement this chunking.
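To make this concrete, here is a trimmed-down sketch of the relevant Channel fields (the real struct in p2p/connection.go has more fields; this is only an approximation for orientation):

```go
// Channel, roughly: the fields that matter for sending.
type Channel struct {
    id           byte
    sendQueue    chan []byte // complete messages waiting to be sent; capacity defaults to 1
    sending      []byte      // the message currently being cut into msgPackets
    priority     int
    recentlySent int64 // used later when deciding which channel gets to send next
}
```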
A Channel's sending is driven by MConnection, and fortunately, when we trace backwards, we end up right back at MConnection.OnStart. In other words, what we need to study for this small question is exactly the part that comes after the two call chains above:
Node.Start -> SyncManager.Start -> SyncManager.netStart -> Switch.DialSeeds -> Switch.AddPeer -> Switch.startInitPeer -> Peer.OnStart -> MConnection.OnStart -> ???

Node.Start -> SyncManager.Start -> SyncManager.netStart -> Switch.OnStart -> Switch.listenerRoutine -> Switch.addPeerWithConnectionAndConfig -> Switch.AddPeer -> Switch.startInitPeer -> Peer.OnStart -> MConnection.OnStart -> ???
That is, the ??? part at the end of the chains above.
So let's start right from MConnection.OnStart:
p2p/connection.go#L152-L159

```go
func (c *MConnection) OnStart() error {
    // ...
    go c.sendRoutine()
    // ...
}
```
The c.sendRoutine() method is the one we need. When the MConnection starts, it kicks off this sending routine, which then waits for data to arrive. Its code is as follows:
p2p/connection.go#L289-L343

```go
func (c *MConnection) sendRoutine() {
    // ...
    case <-c.send:
        // Send some msgPackets
        eof := c.sendSomeMsgPackets()
        if !eof {
            // Keep sendRoutine awake.
            select {
            case c.send <- struct{}{}:
            default:
            }
        }
    // ...
}
```
This method is long, but a lot of irrelevant code has been omitted. The c.sendSomeMsgPackets() inside it is what we are looking for, but suddenly a c.send channel appears. What is it for? It seems that only when something arrives on this channel do we call c.sendSomeMsgPackets(), as if it were a bell reminding us to act.
So when is something put into c.send? Checking the code, it happens in the following three places:
p2p/connection.go#L206-L239

```go
func (c *MConnection) Send(chID byte, msg interface{}) bool {
    // ...
    success := channel.sendBytes(wire.BinaryBytes(msg))
    if success {
        // Wake up sendRoutine if necessary
        select {
        case c.send <- struct{}{}:
        // ...
        }
    }
    // ...
}
```
p2p/connection.go#L243-L271

```go
func (c *MConnection) TrySend(chID byte, msg interface{}) bool {
    // ...
    ok = channel.trySendBytes(wire.BinaryBytes(msg))
    if ok {
        // Wake up sendRoutine if necessary
        select {
        case c.send <- struct{}{}:
        // ...
        }
    }
    // ...
}
```
p2p/connection.go#L289-L343

```go
func (c *MConnection) sendRoutine() {
    // ...
    case <-c.send:
        // Send some msgPackets
        eof := c.sendSomeMsgPackets()
        if !eof {
            // Keep sendRoutine awake.
            select {
            case c.send <- struct{}{}:
            // ...
            }
        }
    // ...
}
```
If you still remember the previous article, channel.trySendBytes is called when we want to send a message to another node; after the call, the binary data corresponding to the message is put into the channel.sendQueue channel (which is what led to this article). We have not run into channel.sendBytes yet, but it should behave similarly. After either of the two is called, a value is put into c.send to notify that this Channel has data ready to be sent.
The third place is the sendRoutine() we just looked at: after c.sendSomeMsgPackets() has sent a portion of sending, if anything is left over, it puts another value into c.send as a reminder that sending can continue.
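The "bell" is just a capacity-1 signal channel combined with a non-blocking send, the same select/default idiom that appears in Send, TrySend and sendRoutine above. Here is a tiny standalone demo of the idiom (not Bytom code):

```go
package main

import (
    "fmt"
    "time"
)

func main() {
    // send carries no data, only the fact "there is something waiting to be sent".
    send := make(chan struct{}, 1)

    // Non-blocking notify: if the bell is already ringing, ringing it again is pointless.
    ring := func() {
        select {
        case send <- struct{}{}:
        default: // a signal is already pending, drop this one
        }
    }

    ring()
    ring() // this second ring is silently dropped

    go func() {
        for range send {
            fmt.Println("woken up: time to send some packets")
        }
    }()

    time.Sleep(100 * time.Millisecond) // only one wake-up is printed
}
```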
So far, sending data involves three players: sendQueue, sending and send. The reason it is this complicated is that the data is sent in chunks.
Why send in chunks? Because Bytom wants to control the transmission rate so that it stays at a reasonable level between nodes. Without a limit, a flood of data could, first, overwhelm the receiver, and second, be exploited by malicious nodes that request large amounts of block data just to saturate the bandwidth.
Worried that the roles of sendQueue, sending and send are still hard to keep apart, I came up with a "roast duck shop" metaphor to understand them (a toy sketch of the whole pipeline follows the list):
- sendQueue is like the hooks for hanging roast ducks. There can be many of them (though for Bytom there is only one by default, because sendQueue has a default capacity of 1); when a duck is freshly roasted, it is hung on a hook
- sending is the chopping board: a duck is taken down from the sendQueue hook, put on the board and cut into pieces, waiting to be plated; one duck may end up on several plates
- send is the bell: when a customer orders, the waiter rings the bell and the chef takes a few pieces of duck from the sending board and puts them on a small plate. Because the chef is very busy, after plating each portion he may wander off to do something else and forget that there is still duck left on the sending board. So to keep himself from forgetting, every time he finishes a plate he glances at the board, and if there is meat left he rings the bell to remind himself to keep plating
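Translated into code, the metaphor looks roughly like the toy program below. It is only a sketch of the pattern, not the Bytom implementation: hooks stands in for sendQueue, board for sending, bell for send, and plateSize is a deliberately tiny stand-in for maxMsgPacketPayloadSize.

```go
package main

import "fmt"

const plateSize = 10 // stand-in for maxMsgPacketPayloadSize, tiny so the output is readable

func main() {
    hooks := make(chan []byte, 1)  // sendQueue: whole roast ducks, capacity 1
    bell := make(chan struct{}, 1) // send: "there is something to plate"
    var board []byte               // sending: the duck currently being cut up

    // Customer side: hang a duck on the hook and ring the bell (non-blocking).
    hooks <- []byte("a-whole-roast-duck-worth-of-bytes")
    select {
    case bell <- struct{}{}:
    default:
    }

    // Chef side: on each ring, cut at most one plate; if meat is left on the
    // board, ring the bell again so the next plate gets cut later.
    for range bell {
        if len(board) == 0 {
            board = <-hooks // take the next duck down from the hook
        }
        n := len(board)
        if n > plateSize {
            n = plateSize
        }
        plate := board[:n]
        board = board[n:]
        fmt.Printf("plated %q (last piece: %v)\n", plate, len(board) == 0)

        if len(board) == 0 && len(hooks) == 0 {
            break // nothing on the board and no more ducks: stop
        }
        select {
        case bell <- struct{}{}:
        default:
        }
    }
}
```

One duck comes off the hook, gets cut into four plates, and the bell keeps the chef coming back until the board is empty.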
With send understood, we can return to the main line and continue with the code of c.sendSomeMsgPackets():
p2p/connection.go#L347-L360

```go
func (c *MConnection) sendSomeMsgPackets() bool {
    // Block until .sendMonitor says we can write.
    // Once we're ready we send more than we asked for,
    // but amortized it should even out.
    c.sendMonitor.Limit(maxMsgPacketTotalSize, atomic.LoadInt64(&c.config.SendRate), true)

    // Now send some msgPackets.
    for i := 0; i < numBatchMsgPackets; i++ {
        if c.sendMsgPacket() {
            return true
        }
    }
    return false
}
```
c.sendMonitor.Limit is there to limit the send rate. The first argument, maxMsgPacketTotalSize, is the maximum length of each packet, a constant equal to 10240; the second argument is the configured send rate, which defaults to 500 KB/s; and the third argument says that when the actual rate is too high, sending should pause until it drops back to normal.
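The sendMonitor comes from a separate flow-rate measuring package, so we will not dig into it here. To make the idea of "block until the average rate allows another chunk" concrete, here is a naive standalone sketch; it is only an illustration and not how the real monitor works:

```go
package main

import (
    "fmt"
    "time"
)

// limiter keeps the average send rate at or below rate bytes per second.
type limiter struct {
    start time.Time
    sent  int64
    rate  int64 // allowed bytes per second, e.g. 500 * 1024 for 500 KB/s
}

// wait blocks until sending another chunk bytes keeps the average rate within the limit.
func (l *limiter) wait(chunk int64) {
    for {
        elapsed := time.Since(l.start).Seconds()
        if float64(l.sent+chunk) <= float64(l.rate)*elapsed {
            break
        }
        time.Sleep(10 * time.Millisecond) // too fast: pause until the average drops
    }
    l.sent += chunk
}

func main() {
    l := &limiter{start: time.Now(), rate: 500 * 1024}
    for i := 0; i < 5; i++ {
        l.wait(10240) // roughly one maxMsgPacketTotalSize-sized batch at a time
        fmt.Println("allowed to send another 10240-byte packet")
    }
}
```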
Once the rate limiting has been dealt with, the loop that follows can send data normally, so c.sendMsgPacket is where we continue:
p2p/connection.go#L363-L398

```go
func (c *MConnection) sendMsgPacket() bool {
    // ...
    n, err := leastChannel.writeMsgPacketTo(c.bufWriter)
    // ...
    c.sendMonitor.Update(int(n))
    // ...
    return false
}
```
At the very beginning of this method I omitted a large piece of code. It walks through the multiple channels and, taking into account their priority and how much data each has recently sent, finds the channel that most needs to send right now, recording it as leastChannel.
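Roughly speaking, the omitted selection logic picks the channel with the lowest ratio of recently-sent bytes to priority. The sketch below captures that idea with a made-up toyChannel type; it is an illustration, not the omitted Bytom code itself:

```go
package main

import (
    "fmt"
    "math"
)

// toyChannel carries just the fields needed to illustrate the selection.
type toyChannel struct {
    id           byte
    priority     int
    recentlySent int64
    pending      bool // does this channel have data waiting to go out?
}

// pickLeastChannel returns the pending channel with the lowest
// recentlySent/priority ratio, i.e. the one that most deserves to send next.
func pickLeastChannel(channels []*toyChannel) *toyChannel {
    var least *toyChannel
    leastRatio := math.MaxFloat64
    for _, ch := range channels {
        if !ch.pending {
            continue
        }
        ratio := float64(ch.recentlySent) / float64(ch.priority)
        if ratio < leastRatio {
            leastRatio = ratio
            least = ch
        }
    }
    return least // nil means no channel has anything to send
}

func main() {
    chans := []*toyChannel{
        {id: 0x40, priority: 5, recentlySent: 1000, pending: true},
        {id: 0x41, priority: 1, recentlySent: 100, pending: true},
    }
    fmt.Printf("next to send: channel 0x%x\n", pickLeastChannel(chans).id)
}
```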
Then leastChannel.writeMsgPacketTo(c.bufWriter) is called to write the chunk of data currently being sent into bufWriter. This bufWriter is a buffer bound to the underlying connection object; data written into it is sent out for us by Go's standard library. It is set up where the MConnection is created:
p2p/connection.go#L114-L118

```go
func NewMConnectionWithConfig(conn net.Conn, chDescs []*ChannelDescriptor, onReceive receiveCbFunc, onError errorCbFunc, config *MConnConfig) *MConnection {
    mconn := &MConnection{
        conn:      conn,
        bufReader: bufio.NewReaderSize(conn, minReadBufferSize),
        bufWriter: bufio.NewWriterSize(conn, minWriteBufferSize),
        // ...
```
Here minReadBufferSize is 1024 and minWriteBufferSize is 65536.
Once the data has been written into bufWriter, we no longer need to worry about it; Go takes care of the rest.
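A brief aside on bufWriter: it is just a bufio.Writer from the standard library wrapped around the connection, so bytes land in an in-memory buffer first and reach the socket when the buffer fills up or is flushed. The snippet below only demonstrates that standard-library behaviour (the address and payload are placeholders), not the flushing logic inside MConnection:

```go
package main

import (
    "bufio"
    "net"
)

func main() {
    // Placeholder address; any reachable TCP endpoint would do.
    conn, err := net.Dial("tcp", "127.0.0.1:46656")
    if err != nil {
        panic(err)
    }
    defer conn.Close()

    // Same shape as MConnection's bufWriter (65536 matches minWriteBufferSize).
    w := bufio.NewWriterSize(conn, 65536)
    w.Write([]byte("hello")) // lands in the in-memory buffer first
    w.Flush()                // hands the buffered bytes to the kernel via conn.Write
}
```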
After the call to leastChannel.writeMsgPacketTo(c.bufWriter) returns, c.sendMonitor is updated with the number of bytes written so that the rate limiting stays accurate.
We now know roughly how the data goes out, but we still have not found who watches sending, so let's keep digging into leastChannel.writeMsgPacketTo:
p2p/connection.go#L655-L663

```go
func (ch *Channel) writeMsgPacketTo(w io.Writer) (n int, err error) {
    packet := ch.nextMsgPacket()
    wire.WriteByte(packetTypeMsg, w, &n, &err)
    wire.WriteBinary(packet, w, &n, &err)
    if err == nil {
        ch.recentlySent += int64(n)
    }
    return
}
```
ch.nextMsgPacket() fetches the next chunk of data to be sent. But where does it take it from? Is it from sending? The code after it simply serializes the packet object into binary and writes it into the bufWriter we saw earlier.
Continuing into ch.nextMsgPacket():
p2p/connection.go#L638-L651

```go
func (ch *Channel) nextMsgPacket() msgPacket {
    packet := msgPacket{}
    packet.ChannelID = byte(ch.id)
    packet.Bytes = ch.sending[:cmn.MinInt(maxMsgPacketPayloadSize, len(ch.sending))]
    if len(ch.sending) <= maxMsgPacketPayloadSize {
        packet.EOF = byte(0x01)
        ch.sending = nil
        atomic.AddInt32(&ch.sendQueueSize, -1) // decrement sendQueueSize
    } else {
        packet.EOF = byte(0x00)
        ch.sending = ch.sending[cmn.MinInt(maxMsgPacketPayloadSize, len(ch.sending)):]
    }
    return packet
}
```
Here we finally see sending. From this code, sending really is the chopping board holding many pieces of duck, and packet is the small plate: a slice of sending no longer than the specified length is put into packet, and then we check whether anything is left in sending. If there is, packet's EOF is set to 0x00, otherwise to 0x01, so the caller knows whether the data has been fully sent and whether it still needs to ring that bell called send.
So by this point we know that it is the Channel itself that consumes sending, and that, in order to keep the transmission rate under control, the data has to be cut into small pieces.
Finally, there is the third small question, which we have in fact already answered while working on the second one:
After the data is taken out of sending, how is it sent to other nodes?
The answer: once a chunk of data is taken out of sending, it is written into bufWriter, and from there it is sent by Go's net.Conn object directly. At that level, we do not need to dig any deeper.
Summary
Since this article involves quite a lot of method calls and may be messy to read, here at the end are the complete call chains:
Node.Start -> SyncManager.Start -> SyncManager.netStart -> Switch.DialSeeds -> Switch.AddPeer -> Switch.startInitPeer -> Peer.OnStart -> MConnection.OnStart -> ...

Node.Start -> SyncManager.Start -> SyncManager.netStart -> Switch.OnStart -> Switch.listenerRoutine -> Switch.addPeerWithConnectionAndConfig -> Switch.AddPeer -> Switch.startInitPeer -> Peer.OnStart -> MConnection.OnStart -> ...
Then comes:

MConnection.sendRoutine -> MConnection.send -> MConnection.sendSomeMsgPackets -> MConnection.sendMsgPacket -> Channel.writeMsgPacketTo -> Channel.nextMsgPacket -> Channel.sending
In the end, my feeling is that a complex problem can look scary at first, but once it is broken down into small questions and tackled one at a time, it turns out to be not that complicated after all.