TCP self-connection


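Go's net package contains the code below, which initiates a TCP connection. Read it carefully and you will find that it handles a rare special case: TCP self-connection. The comments in the code explain the situation in detail.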

func dialTCP(net string, laddr, raddr *TCPAddr, deadline time.Time) (*TCPConn, error) {
	fd, err := internetSocket(net, laddr, raddr, deadline, syscall.SOCK_STREAM, 0, "dial", sockaddrToTCP)

	// TCP has a rarely used mechanism called a 'simultaneous connection' in
	// which Dial("tcp", addr1, addr2) run on the machine at addr1 can
	// connect to a simultaneous Dial("tcp", addr2, addr1) run on the machine
	// at addr2, without either machine executing Listen.  If laddr == nil,
	// it means we want the kernel to pick an appropriate originating local
	// address.  Some Linux kernels cycle blindly through a fixed range of
	// local ports, regardless of destination port.  If a kernel happens to
	// pick local port 50001 as the source for a Dial("tcp", "", "localhost:50001"),
	// then the Dial will succeed, having simultaneously connected to itself.
	// This can only happen when we are letting the kernel pick a port (laddr == nil)
	// and when there is no listener for the destination address.
	// It's hard to argue this is anything other than a kernel bug.  If we
	// see this happen, rather than expose the buggy effect to users, we
	// close the fd and try again.  If it happens twice more, we relent and
	// use the result.  See also:
	//	http://golang.org/issue/2690
	//	http://stackoverflow.com/questions/4949858/
	//
	// The opposite can also happen: if we ask the kernel to pick an appropriate
	// originating local address, sometimes it picks one that is already in use.
	// So if the error is EADDRNOTAVAIL, we have to try again too, just for
	// a different reason.
	//
	// The kernel socket code is no doubt enjoying watching us squirm.
	for i := 0; i < 2 && (laddr == nil || laddr.Port == 0) && (selfConnect(fd, err) || spuriousENOTAVAIL(err)); i++ {
		if err == nil {
			fd.Close()
		}
		fd, err = internetSocket(net, laddr, raddr, deadline, syscall.SOCK_STREAM, 0, "dial", sockaddrToTCP)
	}

	if err != nil {
		return nil, &OpError{Op: "dial", Net: net, Addr: raddr, Err: err}
	}
	return newTCPConn(fd), nil
}

func selfConnect(fd *netFD, err error) bool {
	// If the connect failed, we clearly didn't connect to ourselves.
	if err != nil {
		return false
	}

	// The socket constructor can return an fd with raddr nil under certain
	// unknown conditions. The errors in the calls there to Getpeername
	// are discarded, but we can't catch the problem there because those
	// calls are sometimes legally erroneous with a "socket not connected".
	// Since this code (selfConnect) is already trying to work around
	// a problem, we make sure if this happens we recognize trouble and
	// ask the DialTCP routine to try again.
	// TODO: try to understand what's really going on.
	if fd.laddr == nil || fd.raddr == nil {
		return true
	}
	l := fd.laddr.(*TCPAddr)
	r := fd.raddr.(*TCPAddr)
	return l.Port == r.Port && l.IP.Equal(r.IP)
}

This rarely seen TCP self-connection can be reproduced with the following shell script:

while true
do
    telnet 127.0.0.1 40000
done

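No server on the local machine is listening on port 40000, yet after the script has run for a while, telnet succeeds in connecting. netstat shows that the connection is to itself, not to some other service.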
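The script above depends on the kernel happening to pick the destination port as the source port. For a deterministic illustration, the source port can be pinned to the destination port by hand. The Go sketch below does this; it assumes Linux loopback behaviour, and `forceSelfConnect` and `freePort` are hypothetical helper names introduced only for this example. Because the local port is non-zero, the retry loop in the quoted Go code does not apply (its condition requires `laddr == nil || laddr.Port == 0`), so the self-connected socket is returned to the caller.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// freePort asks the kernel for an unused loopback port and releases it
// again, so that nothing is listening on it afterwards.
func freePort() (int, error) {
	ln, err := net.Listen("tcp4", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	return ln.Addr().(*net.TCPAddr).Port, nil
}

// forceSelfConnect dials 127.0.0.1:port from 127.0.0.1:port.
// Assumption: on Linux this completes as a TCP simultaneous open
// even though no listener exists for the destination.
func forceSelfConnect(port int) (net.Conn, error) {
	d := net.Dialer{
		LocalAddr: &net.TCPAddr{IP: net.IPv4(127, 0, 0, 1), Port: port},
		Timeout:   2 * time.Second,
	}
	return d.Dial("tcp4", fmt.Sprintf("127.0.0.1:%d", port))
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("could not pick a port:", err)
		return
	}
	conn, err := forceSelfConnect(port)
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	defer conn.Close()
	// On a self-connection both endpoints print the same IP:port.
	fmt.Println("local: ", conn.LocalAddr())
	fmt.Println("remote:", conn.RemoteAddr())
}
```

If the dial succeeds, the local and remote addresses are identical, which is the same picture netstat shows for the telnet case.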

Note: this only happens when the port chosen here falls within the following range:

> cat /proc/sys/net/ipv4/ip_local_port_range
32768   61000
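The range check described above can also be done programmatically. The following Go sketch parses the contents of that file and tests whether a given port lies inside the range. `parsePortRange` and `inEphemeralRange` are hypothetical names, and the fallback value is simply the output shown above, used when the file cannot be read (for example on a non-Linux system).

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parsePortRange parses the two whitespace-separated integers found in
// /proc/sys/net/ipv4/ip_local_port_range (first and last local port).
func parsePortRange(s string) (lo, hi int, err error) {
	fields := strings.Fields(s)
	if len(fields) != 2 {
		return 0, 0, fmt.Errorf("expected 2 fields, got %d", len(fields))
	}
	if lo, err = strconv.Atoi(fields[0]); err != nil {
		return 0, 0, err
	}
	if hi, err = strconv.Atoi(fields[1]); err != nil {
		return 0, 0, err
	}
	return lo, hi, nil
}

// inEphemeralRange reports whether port lies in [lo, hi], inclusive.
func inEphemeralRange(port, lo, hi int) bool {
	return port >= lo && port <= hi
}

func main() {
	content := "32768\t61000\n" // fallback: the sample output from the article
	if data, err := os.ReadFile("/proc/sys/net/ipv4/ip_local_port_range"); err == nil {
		content = string(data)
	}
	lo, hi, err := parsePortRange(content)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, port := range []int{8080, 40000} {
		fmt.Printf("port %d in local port range %d-%d: %v\n",
			port, lo, hi, inEphemeralRange(port, lo, hi))
	}
}
```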

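Why self-connection happens is something you can search for yourself; many articles already explain it in detail in terms of the TCP state machine. In my view, you can treat it as a bug in the Linux protocol stack or as a feature of it, and it makes no difference which. What matters is being clear about one thing: when a network program keeps retrying after a connection drops, there is a small probability that a retry ends up connected to itself.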

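So when writing network programs we should handle self-connection explicitly, for example in the way Go does above. In addition, when choosing a server port, it is worth taking a moment to avoid the range listed in /proc/sys/net/ipv4/ip_local_port_range.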

Does your code handle this case?
