Objective
After sustained tuning, a single machine in LinkedIn's instant messaging system can now handle hundreds of thousands of persistent connections.
LinkedIn recently described the optimization process on its blog, covering the technology choices behind the instant messaging system and the main areas of tuning.
Basic technology composition
The basic requirement of instant messaging is that the server can push data to the client, which calls for persistent connections rather than the traditional "request-response" pattern.
For this requirement, LinkedIn opted for server-sent events (SSE).
SSE is simple and widely compatible: the client only needs to open an ordinary HTTP connection to the server, and whenever an event occurs on the server, the server pushes it to the client over that connection as a data stream.
The EventSource interface used by SSE is supported by all modern browsers, and ready-made libraries exist for iOS and Android, so compatibility is not a concern. This is also why LinkedIn did not choose WebSockets.
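To make the push model concrete, here is a minimal sketch of how one event could be serialized in the text/event-stream format that EventSource clients read. The class name SseFrame, the event name "message", and the JSON payload are assumptions made for illustration; this is not LinkedIn's code.

```java
class SseFrame {
    // Formats one event in the text/event-stream wire format:
    // an optional "event:" line, one "data:" line per line of payload,
    // and a blank line that marks the end of the event.
    static String format(String eventName, String data) {
        StringBuilder sb = new StringBuilder();
        if (eventName != null && !eventName.isEmpty()) {
            sb.append("event: ").append(eventName).append('\n');
        }
        // A payload containing newlines must be split across several data lines.
        for (String line : data.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        sb.append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical chat message, written as it would appear on the open HTTP response.
        System.out.print(format("message", "{\"from\":\"alice\",\"text\":\"hi\"}"));
    }
}
```

The server keeps the HTTP response open and writes one such frame each time it has something to deliver, which is why every connected client holds a connection for as long as it is online.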
The system is written in Java and uses the Actor programming model, with Akka as the actor library.
The web framework is Play, which integrates well with both EventSource and Akka.
Optimization process
Limit on the maximum number of socket connections
When LinkedIn began performance testing, it found, oddly, that the number of concurrent connections never exceeded 128, even though an application server should easily handle thousands of concurrent connections. The cause turned out to be a kernel parameter limit:
net.core.somaxconn
This parameter controls the size of the queue of TCP connections waiting to be accepted by the application. When a connection request arrives and the queue is already full, the request is rejected. 128 is the default value on many systems.
The value can be adjusted in /etc/sysctl.conf.
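The interaction between this kernel limit and the backlog an application asks for can be sketched as follows. On Linux, a requested backlog larger than net.core.somaxconn is silently truncated to that value. The class name BacklogCheck and the requested value 1024 are assumptions for illustration only.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class BacklogCheck {
    // The kernel caps the backlog an application requests at net.core.somaxconn.
    static int effectiveBacklog(int requested, int somaxconn) {
        return Math.min(requested, somaxconn);
    }

    // Reads the current value on Linux; returns -1 if it cannot be read.
    static int readSomaxconn() {
        try {
            byte[] raw = Files.readAllBytes(Paths.get("/proc/sys/net/core/somaxconn"));
            return Integer.parseInt(new String(raw, StandardCharsets.US_ASCII).trim());
        } catch (IOException | NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        int requested = 1024; // hypothetical backlog requested by the application
        int somaxconn = readSomaxconn();
        // ServerSocket(port, backlog): port 0 lets the OS pick a free port.
        try (ServerSocket server = new ServerSocket(0, requested)) {
            if (somaxconn > 0) {
                System.out.println("effective backlog: " + effectiveBacklog(requested, somaxconn));
            } else {
                System.out.println("somaxconn not readable on this system");
            }
        } catch (IOException e) {
            System.out.println("could not open a listening socket: " + e.getMessage());
        }
    }
}
```

With the default of 128, asking for a backlog of 1024 still leaves room for only 128 pending connections, which matches the ceiling seen in the test.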
Restrictions on ephemeral ports
Each time the load balancer connects to a server node, it uses an ephemeral port, which becomes available again when the connection is terminated.
Persistent connections are not closed the way ordinary HTTP connections are, so the load balancer's ephemeral ports can be exhausted.
This requires special attention when choosing a load balancer.
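A rough way to see how quickly this limit is reached is to count the ports in the ephemeral range and compare that with the number of persistent connections you are aiming for. The sketch below assumes a load balancer that uses a single source IP toward a single address and port on the server node, and uses "32768 60999", a common Linux default for net.ipv4.ip_local_port_range, purely as an example value. The class and method names are hypothetical.

```java
class EphemeralPorts {
    // Parses a "low high" pair, as found in /proc/sys/net/ipv4/ip_local_port_range,
    // and returns how many ports the range contains (both ends inclusive).
    static int portCount(String range) {
        String[] parts = range.trim().split("\\s+");
        int low = Integer.parseInt(parts[0]);
        int high = Integer.parseInt(parts[1]);
        return high - low + 1;
    }

    // Under the single-source-IP assumption, each persistent connection to the
    // same server address holds one ephemeral port for its whole lifetime.
    static boolean wouldExhaust(int targetConnections, String range) {
        return targetConnections > portCount(range);
    }

    public static void main(String[] args) {
        String range = "32768 60999"; // example value, not taken from the article
        System.out.println("ports available: " + portCount(range));
        System.out.println("100000 connections exhaust the range: " + wouldExhaust(100000, range));
    }
}
```

Even if the range were widened to cover every non-reserved port, it could not exceed roughly 64K ports per source and destination pair, so a target of hundreds of thousands of connections per server depends on how the load balancer itself handles this limit.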
Limitations of file descriptors
After the test load was increased, the following exception occurred:
java.net.SocketException: Too many files open
This indicates that the process has run out of file descriptors. On Linux, every open file, whether a regular file or a network socket, requires a file descriptor.
The file descriptor limit of a running process can be viewed like this:
$ cat /proc/<pid>/limits
...
Max open files 30000
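The same check can be done from inside the JVM, for example to log the limit at startup. The following is a minimal sketch that extracts the soft limit from the text of /proc/<pid>/limits; the class name OpenFileLimit is hypothetical, and the code assumes the Linux layout in which the first number after "Max open files" is the soft limit.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OpenFileLimit {
    // Matches the "Max open files" row and captures the first number (the soft limit).
    private static final Pattern MAX_OPEN_FILES =
            Pattern.compile("(?im)^max open files\\s+(\\d+)");

    // Returns the soft limit, or empty if the row is missing or is not a number
    // (for example "unlimited").
    static OptionalLong softLimit(String limitsText) {
        Matcher m = MAX_OPEN_FILES.matcher(limitsText);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        try {
            // /proc/self/limits describes the current process (Linux only).
            byte[] raw = Files.readAllBytes(Paths.get("/proc/self/limits"));
            OptionalLong limit = softLimit(new String(raw, StandardCharsets.US_ASCII));
            System.out.println(limit.isPresent()
                    ? "soft limit on open files: " + limit.getAsLong()
                    : "limit not found");
        } catch (IOException e) {
            System.out.println("could not read /proc/self/limits: " + e.getMessage());
        }
    }
}
```

Since every persistent connection consumes one descriptor, this number is a hard ceiling on the connections a single process can hold.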
To raise the limit to 200000, for example, edit /etc/security/limits.conf:
<process username> soft nofile 200000
<process username> hard nofile 200000
The system-wide file descriptor limit is adjusted in /etc/sysctl.conf:
fs.file-max
Summary
These are a few of the general optimization points. The original post describes them in more detail and also covers two points on JVM tuning; interested readers can find it at:
https://www.1.qixoo.com/blog/2016/10/instant-messaging-at-linkedin--scaling-to-hundreds-of-thousands-
Optimization of the LinkedIn Instant messaging system