Case Study: Server "**" has shut down the connection prematurely

Source: Internet
Author: User

During recent performance tests, the LoadRunner Controller frequently reported the error: Server "**" has shut down the connection prematurely. The error occurred often, and its behavior was puzzling. Explanations found online varied widely and none of them seemed to fit, so I had to track the problem down myself.
Based on experience, the error could be related to the following factors:
(1) Insufficient NIC resources on the LoadRunner client machine;
(2) A TCP/IP or HTTP keep-alive connection timeout that is too long, leaving no connections available;
(3) A problem on the application server.

1. Detailed comparison of the facts:

Analysis: The comparison below shows that the shutdown ratio is indeed affected by the LoadRunner client. However, no matter how the client setup is changed, the error still occurs, and the ratio always exceeds one in ten thousand.

 

| Number of LoadRunner servers | TcpTimedWaitDelay key value | Concurrent users | Average TPS | Shutdown proportion (per ten thousand) |
|---|---|---|---|---|
| 1 | 30 s | 13 | 76.195 | 18.4 |
| 1 | 10 s | 7 | 66.49 | 10.8 |
| 2 | 10 s | 7 | 85.994 | 1.39 |
| 2 | 10 s | 2 | 33.544 | 1.23 |

 

At this point, the LoadRunner client can be ruled out as the root cause.
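To make the comparison easier to read, the shutdown proportions can be converted to per-thousand values and to a rough count of errors per hour. This is only a sketch: the figures are copied from the comparison above, and the per-hour estimate assumes that every transaction counted in the average TPS is one request that can hit the error, which the original measurements do not confirm.

```python
# Test runs from the comparison above:
# (load generators, TcpTimedWaitDelay in s, concurrent users, average TPS,
#  shutdowns per 10,000 requests)
RUNS = [
    (1, 30, 13, 76.195, 18.4),
    (1, 10, 7, 66.49, 10.8),
    (2, 10, 7, 85.994, 1.39),
    (2, 10, 2, 33.544, 1.23),
]


def per_mille(per_ten_thousand: float) -> float:
    """Convert a rate per 10,000 into a rate per 1,000."""
    return per_ten_thousand / 10


def shutdowns_per_hour(tps: float, per_ten_thousand: float) -> float:
    """Rough estimate; assumes one request per transaction (an assumption)."""
    return tps * 3600 * per_ten_thousand / 10_000


for generators, delay_s, users, tps, rate in RUNS:
    print(
        f"{generators} generator(s), delay {delay_s} s, {users} users: "
        f"{per_mille(rate):.3f} per thousand, "
        f"about {shutdowns_per_hour(tps, rate):.0f} shutdowns per hour"
    )
```

Even the best run stays above one error in ten thousand, which is why the client alone cannot explain the problem.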

2. Turning to the server: on the DPM server, Apache consumed a large amount of resources and reported errors:
(1) Under load, the physical memory used by Apache (the httpd processes) grew by 4 MB per second on average;
(2) Three types of errors appeared in the Apache logs:
A. [Tue Jun 30 18:54:37 2009] [error] [client 192.168.**.**] unable to init Zlib: deflateInit2 returned -4: URL /Distributor/product/my_product_list.htm
B. [Tue Jun 30 18:54:38 2009] [notice] child pid 28699 exit signal Segmentation fault (11)
C. Memory Allocation failed.

Analysis: The httpd processes kept consuming more and more physical memory until none was left for the server to allocate, which produced the memory allocation failures. The zlib return code -4 in error A (Z_MEM_ERROR, not enough memory) is consistent with this.
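Counting how often each of the three error types appears can help confirm that they rise together under load. The following is a minimal sketch, not the method used in the original investigation: the category names are made up for illustration, the sample lines are the ones quoted above, and the patterns match case-insensitively because the exact capitalization in the real log may differ.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Hypothetical category names mapped to patterns for the three error types.
PATTERNS = {
    "zlib_init_failed": re.compile(r"unable to init zlib", re.IGNORECASE),
    "child_segfault": re.compile(r"exit signal segmentation fault", re.IGNORECASE),
    "alloc_failed": re.compile(r"memory allocation failed", re.IGNORECASE),
}


def classify(line: str) -> Optional[str]:
    """Return the category of a log line, or None if it matches no pattern."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines: Iterable[str]) -> Counter:
    """Count log lines per error category, ignoring unmatched lines."""
    return Counter(c for c in map(classify, lines) if c is not None)


SAMPLE = [
    "[Tue Jun 30 18:54:37 2009] [error] [client 192.168.**.**] unable to init "
    "Zlib: deflateInit2 returned -4: URL /Distributor/product/my_product_list.htm",
    "[Tue Jun 30 18:54:38 2009] [notice] child pid 28699 exit signal "
    "Segmentation fault (11)",
    "Memory Allocation failed.",
]

print(summarize(SAMPLE))
```

In practice the lines would come from the Apache error log file rather than a hard-coded list.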

3. Some Apache configuration directives were modified or disabled, for example reducing SendBufferSize and commenting out CustomLog. This did not help.

So what was the problem? See the next post on this topic for the answer.

As mentioned in the previous post, a server shutdown error was found during performance testing. Tracing it showed that the Apache child processes were eating up memory.

Based on experience, a module loaded by Apache was the likely cause, so the "split the problem, isolate and analyze" method was used. All modules loaded by Apache were treated as suspects, and the scope was then narrowed step by step by commenting modules out, restarting, and verifying, until the bottleneck was identified.
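The narrowing procedure can be sketched as a bisection over the module list. This is an illustration under stated assumptions, not the exact steps taken in the original investigation: it assumes exactly one module causes the leak, that the leak still reproduces when only a subset of modules is enabled, and that a check `leaks_with` exists. Here `leaks_with` is a hypothetical stand-in for "enable only these modules, restart Apache, rerun the load test, and watch memory". The module names in the usage example are also hypothetical.

```python
from typing import Callable, List, Sequence


def isolate_module(
    modules: Sequence[str],
    leaks_with: Callable[[List[str]], bool],
) -> str:
    """Find the single leaking module by repeatedly halving the suspect list.

    leaks_with(enabled) must return True when the leak reproduces with only
    the given modules enabled (hypothetical check, supplied by the caller).
    """
    if not modules:
        raise ValueError("no modules to test")
    candidates = list(modules)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if leaks_with(half):
            candidates = half                    # culprit is in the first half
        else:
            candidates = candidates[len(half):]  # culprit is in the second half
    return candidates[0]


# Usage with a simulated check; "mod_suspect" is a made-up module name.
loaded = ["mod_deflate", "mod_rewrite", "mod_proxy", "mod_suspect"]
culprit = isolate_module(loaded, lambda enabled: "mod_suspect" in enabled)
print(culprit)
```

Halving keeps the number of restart-and-verify rounds roughly logarithmic in the number of modules, which matters when each round requires a full load test.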

When Apache loads a Taobao ** _ module, it consumes 4 MB of memory per second, so the physical memory used by Apache keeps growing. When it reaches the maximum memory the operating system allows Apache, the Apache child process dies. Any request that arrives in the interval between the death of the old child process and the creation of the new one naturally gets no response. From the LoadRunner side, this appears as the server shutting down the connection.
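The mechanism above implies that child processes die periodically under sustained load. A rough back-of-the-envelope sketch of that interval follows. Only the 4 MB per second growth rate comes from the measurements; the memory limits and the starting footprint are hypothetical values chosen for illustration, since the original does not state them.

```python
def seconds_until_limit(
    limit_mb: float,
    start_mb: float,
    growth_mb_per_s: float = 4.0,  # growth rate reported in the measurements
) -> float:
    """Time for memory use to grow from start_mb to limit_mb at a steady rate."""
    if growth_mb_per_s <= 0:
        raise ValueError("growth rate must be positive")
    return max(limit_mb - start_mb, 0.0) / growth_mb_per_s


# Hypothetical limits and starting footprint, for illustration only.
for limit_mb in (1024, 2048, 4096):
    t = seconds_until_limit(limit_mb, start_mb=48)
    print(f"limit {limit_mb} MB: about {t:.0f} s ({t / 60:.1f} min) until the limit is hit")
```

Whatever the real limit is, a steady leak of this size hits it within minutes, so short tests can already expose the error.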

The truth is now clear. The next step is to optimize this module.

Behind an error rate of just 1.84 per thousand lay a serious performance problem. Had it not been investigated, it would easily have been overlooked, and after launch, unlucky users would have been unable to open the page. The whole bottleneck hunt is a reminder of the following points:

1. A performance testing engineer must be a keen observer: however low the probability, any error that occurs must be investigated thoroughly;

2. A performance testing engineer should have a clear plan of what to check first and what to check next;

3. Besides JBoss and the Java application, Apache also deserves attention, especially when errors occur;

4. The "split the problem, isolate and analyze" method is indeed very practical;

5. Do not trust books blindly; analyze each specific problem on its own terms.
