Last year (2014) the company decided to move its service framework to Finagle (details in a follow-up article). Most of the company's business systems are written in C#, however, and Finagle only provides Scala/Java clients, so we had to build our own. The project used a ZooKeeper client plus a Thrift client, and implemented the client-side load balancing and failover itself.
Usage scenario: a web server (Windows IIS hosting an MVC4 website, which uses the encapsulated client) accesses the Finagle servers (a Linux cluster).
Symptom: a large number of ports on the web server were stuck in TIME_WAIT, which made the site inaccessible to users (not enough ports were left).
Analysis: TIME_WAIT occurs on the side that actively closes a short-lived TCP connection. The encapsulated Thrift client used TcpClient and opened and closed a connection on every call, which caused the whole problem.
Resolution: 1: Shorten the TIME_WAIT timeout on Windows Server. This is not reliable (what happens when someone forgets to change it on a newly deployed server?).
2: Change the client to use a connection pool. I had recently been reading the Redis client at https://github.com/ServiceStack/ServiceStack.Redis, which seemed a good model.
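The pooling idea in resolution 2 can be sketched as follows. This is a minimal illustration in Python rather than the C# client described above, and it is not the ServiceStack.Redis implementation. The names ConnectionPool, factory and max_size are hypothetical; factory is assumed to be any callable that opens a connection object with a close() method.

```python
import queue
import threading
from contextlib import contextmanager


class ConnectionPool:
    """Reuses open connections instead of opening and closing one per call."""

    def __init__(self, factory, max_size=8):
        self._factory = factory                  # hypothetical: opens a new connection
        self._idle = queue.LifoQueue()           # idle connections waiting for reuse
        self._slots = threading.BoundedSemaphore(max_size)  # caps concurrent checkouts

    @contextmanager
    def connection(self, timeout=None):
        # Wait for a free slot so callers cannot check out more than max_size at once.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no pooled connection available")
        conn = None
        try:
            try:
                conn = self._idle.get_nowait()   # reuse an idle connection if there is one
            except queue.Empty:
                conn = self._factory()           # otherwise open a new one
            yield conn
        except Exception:
            # The connection may be broken: close it instead of returning it to the pool.
            if conn is not None:
                conn.close()
                conn = None
            raise
        finally:
            if conn is not None:
                self._idle.put(conn)             # keep the connection open for the next call
            self._slots.release()
```

With a factory such as lambda: socket.create_connection((host, port)), sequential calls share one TCP connection, so the client no longer performs an active close (and leaves a socket in TIME_WAIT) on every request.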
Supplement: Under TCP's connection termination rules (the four-way close), the socket that initiates the active close enters the TIME_WAIT state, which lasts 2 MSL (Maximum Segment Lifetime). On Windows the default is 4 minutes (240 seconds), and a socket in TIME_WAIT cannot be reclaimed until that time has passed. In practice, on a server that handles a large number of short connections, if the server actively closes client connections it accumulates many sockets in TIME_WAIT, possibly even more than in the ESTABLISHED state. This severely affects the server's processing capacity and can even exhaust the available sockets and stop the service. TIME_WAIT is the mechanism TCP uses to ensure that a reused socket is not affected by delayed segments lingering from the previous connection, so it is a necessary logical guarantee.
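To see why the 240-second figure exhausts ports so quickly, here is a rough back-of-the-envelope sketch. It assumes every short connection ties up one local port for the full TIME_WAIT period. The 16384-port figure is an assumption not taken from the incident above (it is the default dynamic port range 49152-65535 on Windows Vista/Server 2008 and later), and the function names are hypothetical.

```python
def ports_held_in_time_wait(connects_per_second, time_wait_seconds):
    """Steady-state number of ports parked in TIME_WAIT at a given connection rate."""
    return connects_per_second * time_wait_seconds


def max_sustained_connects_per_second(available_ports, time_wait_seconds):
    """Highest connection rate that does not run out of ports (rough upper bound)."""
    return available_ports / time_wait_seconds


TIME_WAIT_SECONDS = 240      # 2 MSL default on Windows, as stated above
AVAILABLE_PORTS = 16384      # assumption: default dynamic range 49152-65535

# 100 short connections per second would need 24000 ports, more than are available.
print(ports_held_in_time_wait(100, TIME_WAIT_SECONDS))
# Under these assumptions, only about 68 new connections per second are sustainable.
print(round(max_sustained_connects_per_second(AVAILABLE_PORTS, TIME_WAIT_SECONDS), 1))
```

The numbers are only an estimate, but they show that a moderately busy site opening one connection per backend call can run out of ports within minutes.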
Supplement 2: Linux servers have this problem too. Many people resolve it by modifying kernel parameters, but that can cause other problems. Link: http://blog.csdn.net/dog250/article/details/13760985
Supplement 3: Enabling keep-alive to solve the nginx TIME_WAIT problem: http://www.cnblogs.com/QLeelulu/p/3601499.html
In closing: this problem dates from a year ago, and I only recently found the cause. By then I had already left the company, which I regret.
A record of a problem caused by TIME_WAIT