Objective
Every language behaves differently at the extremes of performance. Java is the language I use most, so I was curious, and somewhat hopeful, about what happens when it reaches 1 million concurrent connections.
This time I used the handy Netty NIO framework (netty-3.6.5.Final). It is well packaged and its API is comprehensive; as its current domain name netty.io suggests, it focuses on network I/O.
The whole process has little technical depth, and a plain analysis of it makes for somewhat dull reading, so brace yourself.
Test server configuration
The server runs in VMware Workstation 9 on 64-bit CentOS 6.2, with about 14.9G of memory and 4 cores allocated.
Java 7 is installed:
    java version "1.7.0_21"
    Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
Test client
The test client is the same program as before; the client5.c source code is in the previous blog post.
In /etc/sysctl.conf, add the following configuration:
    fs.file-max = 1048576
    net.ipv4.ip_local_port_range = 1024 65535
    net.ipv4.tcp_mem = 786432 2097152 3145728
    net.ipv4.tcp_rmem = 4096 4096 16777216
    net.ipv4.tcp_wmem = 4096 4096 16777216
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_tw_recycle = 1
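Before a run it can help to sanity-check these settings against the connection target. The following is a minimal sketch, assuming one `key = value` setting per line; `SysctlCheck` and its methods are hypothetical names, not part of the original test code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original test code.
final class SysctlCheck {

    // Parses "key = value" lines; blank lines and '#' comments are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                continue; // not a key = value line
            }
            settings.put(trimmed.substring(0, eq).trim(),
                         trimmed.substring(eq + 1).trim());
        }
        return settings;
    }

    // True when fs.file-max is present and at least as large as the target.
    static boolean fileMaxCovers(Map<String, String> settings, long targetConnections) {
        String value = settings.get("fs.file-max");
        return value != null && Long.parseLong(value) >= targetConnections;
    }

    public static void main(String[] args) {
        String conf = String.join("\n",
                "fs.file-max = 1048576",
                "net.ipv4.ip_local_port_range = 1024 65535",
                "net.ipv4.tcp_tw_reuse = 1");
        Map<String, String> settings = parse(conf);
        System.out.println("fs.file-max covers 1,000,000 connections: "
                + fileMaxCovers(settings, 1_000_000L));
    }
}
```

The check only looks at the system-wide file handle limit; per-process limits are a separate matter and are not covered by this sketch.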
Server program
This is also very simple, with no business logic: the client sends an HTTP request and the server responds with chunked-encoded content.
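To make "chunked-encoded content" concrete, here is a minimal sketch of what a chunked HTTP/1.1 body looks like on the wire. It uses only the JDK; it is not the author's server code and not Netty's encoder, and `ChunkedBodySketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Illustrative only: not the author's server and not Netty's encoder.
final class ChunkedBodySketch {

    // Encodes each piece as "<size in hex>\r\n<data>\r\n",
    // then ends the body with a zero-size chunk.
    static String encode(List<String> pieces) {
        StringBuilder out = new StringBuilder();
        for (String piece : pieces) {
            // The chunk size counts bytes, not characters.
            int size = piece.getBytes(StandardCharsets.UTF_8).length;
            if (size == 0) {
                continue; // a zero-size chunk would end the body early
            }
            out.append(Integer.toHexString(size)).append("\r\n")
               .append(piece).append("\r\n");
        }
        out.append("0\r\n\r\n"); // last chunk, no trailers
        return out.toString();
    }

    public static void main(String[] args) {
        String body = encode(Arrays.asList("hello", "1 million connections"));
        // Show CRLF pairs visibly so the framing is easy to read.
        System.out.print(body.replace("\r\n", "\\r\\n\n"));
    }
}
```

Because the body is framed chunk by chunk, the server can keep a connection open and push further chunks later, which is what makes this response style suitable for holding long-lived connections.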
Entry point, HttpChunkedServer.java:
The only custom handler, HttpChunkedServerHandler.java:
Startup script, start.sh:
Some information when 1 million concurrent connections are reached
Each time the server reaches 1 million concurrent persistent connections, I shut down the test clients to drop all connections, wait until the server log reports 0 online users, and then repeat the steps above. Across these iterations I observe memory usage and other information. Taking one moment right after all test clients have disconnected as an example, the system memory usage (recorded as list_free_1) is:
                 total       used       free     shared    buffers     cached
    Mem:         15189       7736       7453          0         18        120
    -/+ buffers/cache:       7597       7592
    Swap:         4095        948       3147
Process information as shown by top:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    4925 root 0 8206m 4.3g 2776 S 0.3 28.8 50:18.66 java
In the startup script start.sh, we set the heap memory to 6G.
Output of ps aux|grep java:
    root 4925 38.0 28.8 8403444 4484764 ? Sl 15:26 50:18 java -server ... HttpChunkedServer 8000
RSS memory usage is 4484764K / 1024 = 4379M.
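The conversion can be written out as a small sketch. `RssMath` is a hypothetical helper; the total of 15554336K used for the percentage is the figure top reports for system memory later in this post.

```java
// Hypothetical helper that reproduces the RSS arithmetic from ps and top.
final class RssMath {

    // Converts kilobytes to whole megabytes (integer division, as in the text).
    static long kbToMb(long kb) {
        return kb / 1024;
    }

    // Share of total memory held by the process, comparable to the %MEM column.
    static double percentOfTotal(long rssKb, long totalKb) {
        return 100.0 * rssKb / totalKb;
    }

    public static void main(String[] args) {
        long rssKb = 4484764L;     // RSS column from ps aux
        long totalKb = 15554336L;  // "Mem: ... total" from top
        System.out.println("RSS in MB: " + kbToMb(rssKb));
        System.out.printf("Share of total memory: %.1f%%%n",
                percentOfTotal(rssKb, totalKb));
    }
}
```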
Then I started the test clients again. When the server reported 1023749 online users, the output of ps aux|grep java was:
    root 4925 43.6 28.4 8403444 4422824 ? Sl 15:26 62:53 java -server ...
Current network statistics:
    ss -s
    Total: 1024050 (kernel 1024084)
    TCP:   1023769 (estab 1023754, closed 2, orphaned 0, synrecv 0, timewait 0/0), ports 12

    Transport Total     IP        IPv6
    *         1024084   -         -
    RAW       0         0         0
    UDP       7         6         1
    TCP       1023767   12        1023755
    INET      1023774   18        1023756
    FRAG      0         0         0
Viewed through top:
    top -p 4925
    top - 17:51:30 up  3:02,  4 users,  load average: 1.03, 1.80, 1.19
    Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
    Cpu0  :  0.9%us,  2.6%sy,  0.0%ni, 52.9%id,  1.0%wa, 13.6%hi, 29.0%si,  0.0%st
    Cpu1  :  1.4%us,  4.5%sy,  0.0%ni, 80.1%id,  1.9%wa,  0.0%hi, 12.0%si,  0.0%st
    Cpu2  :  1.5%us,  4.4%sy,  0.0%ni, 80.5%id,  4.3%wa,  0.0%hi,  9.3%si,  0.0%st
    Cpu3  :  1.9%us,  4.4%sy,  0.0%ni, 84.4%id,  3.2%wa,  0.0%hi,  6.2%si,  0.0%st
    Mem:  15554336k total, 15268728k used,   285608k free,     3904k buffers
    Swap:  4194296k total,  1082592k used,  3111704k free,    37968k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     4925 root      20   0 8206m 4.2g 2220 S  3.3 28.4  62:53.66 java
All four cores are in use, though the load is not spread evenly across them. This is the result in a virtual machine; a real server might do better. Since this is not a CPU-intensive application, CPU is not the bottleneck and needs no further attention.
System memory status
    free -m
    total used free shared buffers cached
    Mem: 15189 14926 2 0 5
    -/+ buffers/cache: 14864 324
    Swap: 4095 1057 3038
Physical memory is no longer sufficient, and 1057M of swap is in use.
Heap memory status:
    jmap -heap 4925
    Attaching to process ID 4925, please wait...
    Debugger attached successfully.
    Server compiler detected.
    JVM version is 23.21-b01

    using parallel threads in the new generation.
    using thread-local object allocation.
    Concurrent Mark-Sweep GC

    Heap Configuration:
       MinHeapFreeRatio = 40
       MaxHeapFreeRatio = 70
       MaxHeapSize      = 6442450944 (6144.0MB)
       NewSize          = 629145600 (600.0MB)
       MaxNewSize       = 629145600 (600.0MB)
       OldSize          = 5439488 (5.1875MB)
       NewRatio         = 2
       SurvivorRatio    = 1
       PermSize         = 52428800 (50.0MB)
       MaxPermSize      = 52428800 (50.0MB)
       G1HeapRegionSize = 0 (0.0MB)

    Heap Usage:
    New Generation (Eden + 1 Survivor Space):
       capacity = 419430400 (400.0MB)
       used     = 308798864 (294.49354553222656MB)
       free     = 110631536 (105.50645446777344MB)
       73.62338638305664% used
    Eden Space:
       capacity = 209715200 (200.0MB)
       used     = 103375232 (98.5863037109375MB)
       free     = 106339968 (101.4136962890625MB)
       49.29315185546875% used
    From Space:
       capacity = 209715200 (200.0MB)
       used     = 205423632 (195.90724182128906MB)
       free     = 4291568 (4.0927581787109375MB)
       97.95362091064453% used
    To Space:
       capacity = 209715200 (200.0MB)
       used     = 0 (0.0MB)
       free     = 209715200 (200.0MB)
       0.0% used
    concurrent mark-sweep generation:
       capacity = 5813305344 (5544.0MB)
       used     = 4213515472 (4018.321487426758MB)
       free     = 1599789872 (1525.6785125732422MB)
       72.48054631000646% used
    Perm Generation:
       capacity = 52428800 (50.0MB)
       used     = 5505696 (5.250640869140625MB)
       free     = 46923104 (44.749359130859375MB)
       10.50128173828125% used

    1439 interned Strings occupying 110936 bytes.
Old generation usage is 72%, which is fairly reasonable given that the system is handling 1 million connections.
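Similar per-pool figures can also be read from inside the JVM through the standard java.lang.management API, which is convenient when attaching jmap to a busy process is undesirable. The sketch below is an assumption about how one might log them, not something the original test program does; `HeapPools` is a hypothetical name, and pool names depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports memory pool usage from within the running JVM.
final class HeapPools {

    // Returns one line per memory pool (eden, survivor, old generation, ...).
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 when no maximum is defined
            String percent = max > 0
                    ? String.format("%.1f%% of max", 100.0 * usage.getUsed() / max)
                    : "max undefined";
            lines.add(String.format("%s: used=%d MB, committed=%d MB, %s",
                    pool.getName(),
                    usage.getUsed() >> 20,
                    usage.getCommitted() >> 20,
                    percent));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```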
Disconnect all test clients again and check system memory (free -m):
                 total       used       free     shared    buffers     cached
    Mem:         15189       7723       7466          0
    -/+ buffers/cache:       7589       7599
    Swap:         4095        950       3145
This is recorded as list_free_2.
Comparing list_free_1 and list_free_2, the two measurements taken after memory was released, the physical memory in use by the system has dropped to 7589M, compared with 7597M before.
In short, our Java test program needs at least 7589M + 950M = 8539M (about 8.3G) of memory.
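The arithmetic can be written out as a small sketch using the free -m figures quoted in this post. The per-connection estimate at the end is my own rough derivation from those figures (growth in physical plus swap usage between the idle and peak measurements, divided by the connection count); it is system-wide, so it mixes JVM and kernel memory. `MemoryBudget` is a hypothetical name.

```java
// Hypothetical helper that reproduces the memory arithmetic from the free -m output.
final class MemoryBudget {

    // Physical memory in use (excluding buffers/cache) plus swap in use, in MB.
    static long totalMb(long physicalMb, long swapMb) {
        return physicalMb + swapMb;
    }

    static double mbToGb(long mb) {
        return mb / 1024.0;
    }

    // Rough estimate: memory growth from idle to peak, spread over all connections.
    static long bytesPerConnection(long idleMb, long peakMb, long connections) {
        return (peakMb - idleMb) * 1024L * 1024L / connections;
    }

    public static void main(String[] args) {
        long idle = totalMb(7597, 948);          // list_free_1
        long peak = totalMb(14864, 1057);        // at 1023749 online users
        long afterRelease = totalMb(7589, 950);  // list_free_2

        System.out.printf("After release: %d MB (%.2f GB)%n",
                afterRelease, mbToGb(afterRelease));
        System.out.println("Estimated bytes per connection (rough): "
                + bytesPerConnection(idle, peak, 1023749L));
    }
}
```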
GC Log
We set a long list of parameters in the startup script. Whether they achieve their goal has to be judged from the GC log; GCViewer is recommended for analyzing it.
GC Event Overview:
Other:
In short:
Only one full GC occurred, but it was too expensive: a 12-second pause.
ParNew was the biggest source of stalls, pausing the whole system for 41 seconds in total, which is unacceptable.
The current JVM tuning gives mixed results, and more experimentation is needed.
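Alongside the GC log, cumulative collector totals can be read at runtime through the standard GarbageCollectorMXBean API, which gives a quick view of how often each collector has run while tuning. This is a minimal sketch under that assumption, not part of the original test program; `GcTotals` is a hypothetical name, and the reported time is approximate accumulated collection time rather than an exact stop-the-world pause figure.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: summarizes cumulative GC activity of the running JVM.
final class GcTotals {

    // Returns one line per collector with its collection count and accumulated time.
    static String summarize() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            long millis = gc.getCollectionTime(); // approximate, -1 if undefined
            sb.append(String.format("%s: collections=%d, total time=%d ms%n",
                    gc.getName(), count, millis));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summarize());
    }
}
```

Logging this summary periodically during a test run makes it easier to see whether a parameter change moved collection counts or times in the expected direction before digging into the full log.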
Summary
Compared with Erlang or C, Java is more troublesome: you have to decide up front how much memory the program needs. In other words, the heap size and an appropriate garbage collector are set through JVM startup parameters; if the program later needs more memory, you have to stop it, edit the startup parameters, and start it again. In a word, it is a hassle. JVM tuning alone requires continuous fine-tuning based on monitoring, logs, and so on.
The JVM needs the heap size specified in advance, which can be a problem compared with Erlang/C.
GC (garbage collection) is relatively troublesome and requires constant tuning of JVM parameters based on logs, JVM heap and stack information, and runtime conditions.
Set a maximum connection target, run the test to its peak several times, release all connections each time, and observe memory usage repeatedly to arrive at a suitable figure for the memory the system needs.
Eclipse Memory Analyzer, combined with a heap dump file exported by jmap, is handy for analyzing memory leaks.
Modifying code at runtime, also known as hot loading, is not supported by default.
Results should be better on a real machine.
A bit of venting:
Java's OSGi, compared with Erlang, requires a change of mindset. It is not native to the language, so it always feels awkward, and the patchwork added by the community or commercial vendors is just a way to provide hot loading and other enterprise features that the object-oriented language itself lacks.
The test source code is available for download: Just_test.
1 Million Concurrent Connection Server Notes: What Happens When Java Netty Handles 1M Connections?