LinkedIn Instant Messaging: Supporting hundreds of thousands of persistent connections on a single machine


We recently introduced LinkedIn's instant messaging, which includes typing indicators and read receipts. To make these features possible, we need a way to push data from the server to mobile and web clients over persistent connections, rather than the request-response pattern most modern applications are built on. In this article we describe how, as soon as the server has a message, a typing indicator, or a read receipt, we send it to the client immediately.

We will cover how we use the Play Framework and the Akka Actor model to manage persistent connections and push events from the server. We will also share some lessons on load testing the servers with real production traffic to handle hundreds of thousands of concurrent persistent connections. Finally, we will share the optimizations we applied along the way.

Server-Sent Events

Server-Sent Events (SSE) is a communication technique in which the client establishes an ordinary HTTP connection to the server, and the server then pushes a continuous stream of data over that connection to the client whenever events occur, without the client having to make subsequent requests. The client uses the EventSource interface to receive the events or data chunks the server sends, as a text or event stream, without closing the connection. All modern web browsers support the EventSource interface, and ready-made libraries are available for both iOS and Android.
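To give a concrete idea of what this looks like on the wire, an SSE response is simply a long-lived HTTP response with the text/event-stream content type, whose body is a stream of text events separated by blank lines (the payloads below are made-up examples, not LinkedIn's actual message format):

GET /listen HTTP/1.1
Accept: text/event-stream

HTTP/1.1 200 OK
Content-Type: text/event-stream

data: {"from":"userB","body":"Hi!"}

data: {"type":"typing-indicator","from":"userB"}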

In our earliest implementation we chose SSE over WebSockets, because SSE works over traditional HTTP and we wanted the protocol with the broadest compatibility for LinkedIn members, who reach our site from a wide variety of networks. That said, WebSockets is a technology that enables bidirectional, full-duplex communication and remains a candidate protocol, and we may upgrade to it when appropriate.

Server-sent messages with the Play Framework

LinkedIn's server-side program uses the Play Framework. Play is an open-source, lightweight, fully asynchronous framework for developing Java and Scala applications, and it has built-in support for EventSource and WebSockets. To maintain hundreds of thousands of SSE persistent connections in a scalable way, we use Play together with Akka. Akka raises the level of abstraction: using the Actor model, we assign one actor to each connection the server establishes.

// Client A connects to the server and is assigned connectionIdA
public Result listen() {
    return ok(EventSource.whenConnected(eventSource -> {
        String connectionId = UUID.randomUUID().toString();
        // construct an Akka Actor for the new EventSource connection, identified by a random connection identifier
        Akka.system().actorOf(
            ClientConnectionActor.props(connectionId, eventSource),
            connectionId);
    }));
}

The code above demonstrates how to use Play's EventSource API to accept and establish a connection in a controller, and then hand it over to an Akka actor to manage. The actor is then responsible for the entire lifecycle of the connection, so sending data to the client when an event occurs reduces to sending a message to that Akka actor.

// User B sends a message to User A

// We identify the actor that manages the connection User A is connected on (connectionIdA)
ActorSelection actorSelection = Akka.system().actorSelection("akka://application/user/" + connectionIdA);

// Send B's message to A's actor
actorSelection.tell(new ClientMessage(data), ActorRef.noSender());

Note that the only way anything interacts with the connection is by sending a message to the Akka actor that manages it. This is important because it keeps everything asynchronous and non-blocking, which is exactly what Akka actors are designed for: high performance in distributed systems. The actor, in turn, handles each message it receives by forwarding it to the EventSource connection it manages.

public class ClientConnectionActor extends UntypedActor {

    public static Props props(String connectionId, EventSource eventSource) {
        return Props.create(ClientConnectionActor.class, () -> new ClientConnectionActor(connectionId, eventSource));
    }

    public void onReceive(Object msg) throws Exception {
        if (msg instanceof ClientMessage) {
            // forward the message to the client over the EventSource connection this actor manages
            eventSource.send(event(Json.toJson((ClientMessage) msg)));
        }
    }
}
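The ClientMessage class used in these snippets is not shown in the article; purely as an illustration, it could be as simple as a small wrapper around the payload to push (the field and accessor here are assumptions):

// Hypothetical message wrapper passed to the connection actor.
// The original article does not show this class, so its shape is an assumption.
public class ClientMessage {
    private final Object data;   // payload to be serialized and pushed to the client

    public ClientMessage(Object data) {
        this.data = data;
    }

    public Object getData() {
        return data;
    }
}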

That's it. Managing concurrent EventSource connections with the Play Framework and the Akka Actor model is as simple as that.

But does this still work well once the system grows? Read on to find out.

Stress testing with real production traffic

Every system is ultimately proven against real production traffic, but real production traffic is not easy to replicate, because few tools can simulate it for stress testing. So how did we test with real production traffic before deploying to the actual production environment? For that we used a technique called a "dark launch", which we will discuss in detail in a future article.

To keep this article focused on its own topic, let's assume we can already direct real production load at our server cluster. An effective way to find the limits of a system is to keep increasing the load on a single node, which exposes the problems the whole production cluster would face under much greater load.

With this approach, along with a few other aids, we found several limits of the system. The next few sections describe how a handful of simple optimizations eventually allowed a single server to support hundreds of thousands of connections.

Limit one: maximum number of pending connections on a socket

In some of our earliest stress tests we kept hitting a strange problem: we could not establish many connections simultaneously, with roughly 128 being the limit. Note that the server could easily handle thousands of concurrent connections, but we could not add more than about 128 connections to that pool at the same time. In the real production environment, this would be roughly equivalent to 128 members initiating connections to the same server at the same moment.

After doing some research, we found the following kernel parameters:

net.core.somaxconn

This kernel parameter is the maximum number of pending TCP connections that can be queued waiting for the application to accept them. If a connection request arrives when the queue is full, the request is rejected outright. On many major operating systems this value defaults to 128.

Raising this value in the "/etc/sysctl.conf" file resolved the "connection refused" problem on our Linux servers.
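For example, a change along the following lines raises the backlog limit (the value 1024 is only illustrative; pick what fits your connection bursts) and takes effect after running sysctl -p:

# /etc/sysctl.conf
net.core.somaxconn = 1024

$ sudo sysctl -p
$ sysctl net.core.somaxconn
net.core.somaxconn = 1024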

Please note that Netty 4.x and above automatically reads this value from the operating system and uses it directly when initializing the Java ServerSocket. However, if you also want to configure it at the application level, you can set the following in your Play application's configuration parameters:

play.server.netty.option.backlog=1024

Limit two: JVM thread count

A few hours after we first let larger production traffic hit our servers, we were alerted that the load balancer could no longer connect to some of the servers. On further investigation we found the following in the server logs:

java.lang.OutOfMemoryError: unable to create new native thread

A graph of the JVM thread count on our servers also confirmed that we had a thread leak and were running out of memory.

We took a thread dump of the JVM process and found a large number of sleeping threads in the following state:

"Hashed Wheel timer #11327" #27780 prio=5 os_prio=0 tid=0x00007f73a8bba000 nid=0x27f4 sleeping[0x00007f7329d23000] Java . lang. Thread.State:TIMED_WAITING (sleeping)

At Java.lang.Thread.sleep (Native Method)

At Org.jboss.netty.util.hashedwheeltimer$worker.waitfornexttick (hashedwheeltimer.java:445)

At Org.jboss.netty.util.hashedwheeltimer$worker.run (hashedwheeltimer.java:364)

At Org.jboss.netty.util.ThreadRenamingRunnable.run (threadrenamingrunnable.java:108)

At Java.lang.Thread.run (thread.java:745)

After further investigation, we found the cause in LinkedIn's implementation of Netty's idle-timeout support in the Play Framework: the original code created a new HashedWheelTimer instance for every incoming connection. The patch that fixed this explains the cause of the bug very clearly.

If you hit the JVM thread limit, there is a good chance your code has a thread leak that needs to be fixed. But if you find that all your threads are genuinely doing useful work, is there a way to tune the system so you can create more threads and accept more connections?

As always, the answer is interesting. It is worth discussing how limited memory relates to the number of threads the JVM can create. A thread's stack size determines how much memory is statically allocated per thread, so the theoretical maximum number of threads is the size of a process's user address space divided by the thread stack size. In practice, however, the JVM also allocates memory dynamically on the heap. After a few simple experiments with a small Java program, we confirmed that if more memory is allocated to the heap, less remains for thread stacks. As a result, the limit on the number of threads decreases as the heap size increases.
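The article does not show the small Java program it mentions, but a minimal sketch of that kind of experiment might look like the following: keep creating parked threads until the JVM refuses, and compare the count across runs with different -Xss and -Xmx settings (the class name and values here are our own, not from the article):

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

// Run with e.g. "java -Xss512k -Xmx1g ThreadCeiling", then again with a larger -Xmx,
// and compare how many threads could be created before the error.
public class ThreadCeiling {
    public static void main(String[] args) {
        AtomicInteger count = new AtomicInteger();
        try {
            while (true) {
                new Thread(() -> LockSupport.park()).start();  // each thread just parks forever
                count.incrementAndGet();
            }
        } catch (Throwable t) {
            // typically: java.lang.OutOfMemoryError: unable to create new native thread
            System.out.println("Created " + count.get() + " threads before: " + t);
        }
    }
}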

The conclusion is that if you want to allow more threads, you can reduce the stack size used by each thread (-Xss), or reduce the memory allocated to the heap (-Xms, -Xmx).
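Purely as an illustration (the values and the main class name are placeholders, not settings from the article), these flags are passed on the JVM command line:

java -Xss256k -Xms4g -Xmx4g com.example.ChatServer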

Limit three: ephemeral port exhaustion

We never actually hit this limit, but it is worth mentioning here because it is a limit you typically run into when supporting hundreds of thousands of connections on a single server. Every time the load balancer connects to a server node, it consumes an ephemeral port. The port stays associated with that connection for its lifetime, hence "ephemeral". When the connection terminates, the ephemeral port is freed and can be reused. But persistent connections do not terminate the way ordinary HTTP connections do, so the pool of available ephemeral ports on the load balancer eventually runs out. At that point no new connections can be created, because every port number the operating system could use to establish one is already in use. There are many ways to address ephemeral port exhaustion on newer load balancers, but they are outside the scope of this article.
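For reference, on a Linux host the local ephemeral port range is controlled by the net.ipv4.ip_local_port_range kernel parameter; on a typical distribution the default looks roughly like the following, i.e. on the order of 28,000 usable ports towards a single destination address and port:

$ sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768    60999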

Luckily, each of our load balancers supports up to 250,000 connections. If you do reach this limit, work with the team that manages your load balancers to raise the number of open connections allowed between the load balancer and your server nodes.

Limit four: file descriptors

Once we had 16 servers running in our data center and handling very respectable production traffic, we decided to test the limit on the number of persistent connections each server could bear. The test method was to shut down a few servers at a time so that the load balancer would direct more and more traffic to the remaining ones. This testing produced a wonderful graph of the number of file descriptors used by the server process on each machine, which we internally nicknamed the "caterpillar graph".

A file descriptor is an abstract handle in Unix-like operating systems and, among other things, is used to refer to a network socket. Unsurprisingly, the more persistent connections a server supports, the more file descriptors it needs. You can see that when only 2 of the 16 servers were left, each was using 20,000 file descriptors. When we shut down one of those as well, we saw the following in the logs of the remaining one:

java.net.SocketException: Too many open files

With all connections directed at a single server, we had hit the per-process file descriptor limit. To see the file descriptor limit for a process, look at the "Max open files" value in its limits file.

$ cat /proc/<pid>/limits
Max open files            30000
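To see how many descriptors the process is actually using at any moment (not from the article, just a handy check), you can count the entries under its fd directory:

$ ls /proc/<pid>/fd | wc -l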

As in the following example, this can be raised to 200,000 simply by adding these lines to the file /etc/security/limits.conf:

<process username>  soft  nofile  200000
<process username>  hard  nofile  200000

Note that there is also a system-wide file descriptor limit, controlled by a kernel parameter that can be adjusted in /etc/sysctl.conf:

fs.file-max
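For example (the value is only illustrative):

fs.file-max = 400000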

With the per-process file descriptor limit raised on all servers in this way, we could now easily handle more than 30,000 connections per server.

Limit five: JVM heap

We then repeated the process, but things started to go wrong when about 60,000 connections were directed at the surviving one of the last two servers. The number of allocated file descriptors, and with it the number of active persistent connections, dropped sharply, and latency rose to unacceptable levels.

On further investigation, we found that we had exhausted our 4GB of JVM heap space. This produced a rare graph showing each garbage collection cycle reclaiming less and less heap space, until it was all used up.

We use TLS for all internal communication in the Instant Messaging service inside the data center. In practice, each TLS connection consumes roughly 20KB of JVM memory, and that grows with the number of active persistent connections, leading to the out-of-memory situation shown above.
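As a rough back-of-the-envelope check using the article's own numbers: about 60,000 connections at roughly 20KB each comes to around 1.2GB of heap for TLS connection state alone, on top of everything else the application keeps in memory, which is consistent with a 4GB heap being exhausted at around that connection count.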

We tuned the JVM heap size to 8GB (-Xms8g, -Xmx8g) and ran the tests again, repeatedly directing more and more connections at a single server. This time memory ran out when one server was handling about 90,000 connections, and the connection count began to fall.

Indeed, we had run out of heap space again, this time the 8GB.

Processing power was never the limiting factor: CPU utilization stayed below 80%.

What did we test next? Since each of our servers is generously provisioned with 64GB of RAM, we bumped the JVM heap size straight to 16GB. Since then we have not hit this memory limit in performance tests, and we have successfully handled more than 100,000 concurrent persistent connections in production. However, as you have seen above, some limit will appear again as the load keeps increasing. What do you think it will be? Memory? CPU? Let me know via my Twitter account, @agupta03.

Conclusion

In this article, we gave an overview of how LinkedIn maintains persistent connections in order to push messages to Instant Messaging clients as soon as the server has them. It turns out that Akka's Actor model works very well for managing these connections in the Play Framework.

Constantly probing the limits of our production systems and trying to improve them is the kind of work we most enjoy at LinkedIn. We have shared some of the interesting limits and workarounds we encountered while pushing a single Instant Messaging server to handle hundreds of thousands of persistent connections. We shared these details, and the reasons behind each limit, so that you can understand them and squeeze the best possible performance out of your own systems. We hope you learned something from this article that you can apply to your own systems.

from:http://mp.weixin.qq.com/s?__biz=mza5nzkxmzg1nw==&mid=2653161464&idx=1&sn= 8a31ae325c547a2eeb9dc20c1ab25bdc&chksm= 8b493a96bc3eb3800ec78dc94f925f67e14128a1db0a8a2fb4c6186cbf381e311000721d640b#rd
