WhatsApp has used Erlang in production to run 3M long connections on a single 96GB machine (see "WhatsApp's Erlang World"). Very few services ever reach WhatsApp's scale; ours runs on only tens of machines. Packing too many connections onto one node makes a single failure too costly, and without multi-ISP access each machine room needs several machines anyway, so 1M connections per node meets our requirements.
Erlang has a natural advantage as a long-connected gateway:
- Well suited to IO-intensive workloads. The gateway should also be kept as simple as possible: after authentication, parse only the header and forward directly to the backend services.
- The network layer is implemented very efficiently in C inside BEAM; the Erlang code is just simple process control, so performance comes close to that of a well-optimized C program.
- The code is simple and easy to maintain. With a one-to-one mapping between Erlang processes and connections, the gateway is only around 500 lines of code.
- Per-process GC: no matter how large the node's memory grows, there are no stop-the-world pauses to worry about.
- Hot code upgrades. The gateway is designed to be as simple as possible to minimize upgrades, but they are inevitable, and hot upgrades keep connections alive across them.
- Stability: BEAM is solid enough to run for a year or two without a restart, even with services updated daily.
- Easy multi-language mixing: backends can be implemented in other languages.
Disadvantage: poor performance on CPU-intensive work. If you do not want to fall back to C, use a binary protocol and put the protocol identifier at a fixed position in the packet, which makes dispatch easy.
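As a sketch of the fixed-position dispatch idea, assuming a hypothetical wire format (not from the article) of a 4-byte payload length followed by a 2-byte service id, the gateway can pick the route with a single binary pattern match and never inspect the payload:

```erlang
%% Hypothetical wire format: 4-byte payload length, 2-byte service id,
%% then the opaque payload. Module and backend names are illustrative.
-module(gw_dispatch).
-export([dispatch/1]).

dispatch(<<Len:32, ServiceId:16, Payload:Len/binary, Rest/binary>>) ->
    route(ServiceId, Payload),
    dispatch(Rest);                 %% handle pipelined packets
dispatch(Partial) ->
    {more, Partial}.                %% wait for the rest of the packet

%% Routing table: fixed-position service id -> backend.
route(1, Payload) -> backend_auth:handle(Payload);
route(2, Payload) -> backend_push:handle(Payload);
route(_, _)       -> ignore.
```

Because the id sits at a fixed offset, the gateway never needs to understand the payload, which keeps the forwarding path cheap.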
1. Test environment
- Server: R620 (E5-2620, 6 cores / 12 threads), 128GB RAM
- Clients: 5 × R620 (E5-2620, 6 cores / 12 threads), 48GB RAM each, 25 IPs in total (5 per machine)
2. Tuning
1. BEAM startup parameters
+sbt db — bind schedulers to CPUs (CPU affinity)
+P 2000000 — process count limit (size as appropriate)
+K true — enable epoll
+sbwt none — disable scheduler busy-wait spinning to reduce CPU usage
+swt low — raise scheduler wakeup sensitivity to avoid schedulers sleeping too long
See my other article: Erlang CPU-related parameter tuning.
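Collected into a vm.args fragment (the same flags as above; they can equally be passed on the erl command line):

```
# vm.args sketch matching the flags above; +P sized for this test
# bind schedulers to logical CPUs
+sbt db
# raise the process limit
+P 2000000
# enable kernel poll (epoll on Linux)
+K true
# disable scheduler busy waiting
+sbwt none
# wake sleeping schedulers sooner
+swt low
```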
2. Erlang network library: Ranch needs modification
- Configure 1024 acceptors and a backlog of 32768.
- Also raise the kernel backlogs: net.core.somaxconn, net.core.netdev_max_backlog, and net.ipv4.tcp_max_syn_backlog to 32768.
- Remove ranch_sup and its monitoring of each new connection: it consumes memory for little benefit, and the single-process hotspot also keeps the backlog from being drained in time.
- Set process_flag(priority, high) in the acceptors; otherwise, because of fair scheduling, the backlog is not processed in time even when CPU usage is low.
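The kernel-side settings above as a sysctl fragment (values from this test):

```
# /etc/sysctl.conf fragment used alongside the Ranch backlog of 32768
net.core.somaxconn = 32768
net.core.netdev_max_backlog = 32768
net.ipv4.tcp_max_syn_backlog = 32768
# apply with: sysctl -p
```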
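A minimal sketch of the acceptor change, assuming the usual hand-off of each accepted socket to a per-connection process (conn_sup:start_connection/1 is a placeholder name):

```erlang
%% Sketch: raise acceptor priority so fair scheduling cannot starve
%% the accept loop while connection processes are busy.
acceptor_init(ListenSocket) ->
    process_flag(priority, high),
    accept_loop(ListenSocket).

accept_loop(ListenSocket) ->
    {ok, Socket} = gen_tcp:accept(ListenSocket),
    %% hand the socket off to a per-connection process (name illustrative)
    {ok, Pid} = conn_sup:start_connection(Socket),
    ok = gen_tcp:controlling_process(Socket, Pid),
    accept_loop(ListenSocket).
```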
3. Erlang memory consumption
Because Erlang GC is per-process and each connection maps to one process with very little traffic, you get a phenomenon where millions of processes each hold a little garbage that never triggers a GC.
The fix is hibernation: erlang:hibernate/3 discards the call stack, forces a GC, and resumes execution at the given MFA when the process next receives a message.
In a plain gen_server, returning hibernate after handling each packet (when no more data is pending) cut memory usage to under 1/3 with no significant CPU increase, which is well worth doing.
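A minimal sketch of that pattern, assuming a per-connection gen_server owning an {active, once} socket (handle_packet/2 stands in for the real business logic):

```erlang
%% Sketch only: hibernate the connection process after each packet.
handle_info({tcp, Socket, Data}, State0) ->
    State = handle_packet(Data, State0),
    ok = inet:setopts(Socket, [{active, once}]),
    %% hibernate: drop the call stack, run a full GC, and sleep
    %% until the next message arrives
    {noreply, State, hibernate};
handle_info({tcp_closed, _Socket}, State) ->
    {stop, normal, State}.
```

The trade-off is an extra GC and stack rebuild per message, which is why it pays off only for mostly-idle connections like these.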
3. Load test scenario
- Clients on 5 machines with 25 IPs; with tcp_tw_reuse and tcp_tw_recycle enabled, each IP stably sustains 60k connections without problems.
- 6k/s login, authentication, heartbeat, and logout operations
- 10k/s message push with ACK
- 12 hours of continuous operation
4. Results
`ss -s` on the gateway:

```
Total: 1500254 (kernel 1500317)
TCP:   1500090 (estab 1500077, closed 0, orphaned 0, synrecv 0, timewait 0/0), ports

Transport Total     IP        IPv6
*         1500317   -         -
RAW       0         0         0
UDP       0         0         0
TCP       1500090   1500090   0
INET      1500090   1500090   0
FRAG      0         0         0
```
Gateway machine:
- CPU: 500%
- NIC: 60k/s TCP packets in/out
- Node memory: 12GB (erlang:memory(total) shows 9GB actually in use)
- Kernel memory: 11GB
- That is, 1.5M connections use 23GB of memory, about 15KB per connection; roughly half of that is kernel memory, and Erlang itself does not use much.
- 500% CPU under this traffic pressure is also respectable, and compared with C, the multi-core scalability is better.
- The server has 24 hardware threads, but with other services on the machine and the limited number of client IPs, we did not push further; in theory a single node should handle 2M~3M connections and 50k/s message push without problems.
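The node-memory figures above can be reproduced with erlang:memory/1; a sketch of the per-connection arithmetic (the connection count is passed in):

```erlang
%% Sketch: report BEAM memory use and the per-connection average.
report(Connections) ->
    Total = erlang:memory(total),
    io:format("beam total: ~p MB, ~p KB per connection~n",
              [Total div (1024 * 1024), Total div Connections div 1024]).
```

Note this covers only the BEAM side; kernel socket-buffer memory (the other ~11GB here) must be read from /proc separately.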
Erlang C1500K long-connection push service: performance