Diagnosing Performance Problems in a Safe City Video Network


Objective:


Safe City has become a public-welfare project that touches everyone's daily life, but the complexity of the system makes operations and maintenance very challenging. How do you guarantee the camera online rate? How do you find the cause of a fault in the video system? In one of our projects, Uyun APM played a major role in discovering problems and locating faults, and helped us pinpoint the system fault.

Safe City is a large and highly integrated management system. It must serve public security management, urban management, traffic management, and emergency command; it must cover the image monitoring needs of disaster warning and production safety monitoring; and it must integrate supporting systems such as alarms and access control, and link with the broadcasting system.

The video surveillance system at the core of Safe City has a complex structure. It consists of thousands of high-definition cameras, thousands of video systems, hundreds of checkpoint systems, and complex storage and management systems, and it spans multiple networks, including 4G, Ethernet, and fiber optic networks. The camera online rate and the ability to pull up video quickly, anytime and anywhere, are the key indicators of the video system's effectiveness.

Recently a customer reported that the video network looked normal: monitoring showed a good camera online rate, and the tests of each city-level subsystem passed, yet opening a video was very slow. On receiving the report, Uyun immediately sent technicians to the site.

>>>> Business request tracking: what is slow?

After an initial investigation, we mapped the overall structure of the video application platform. The platform has two levels, provincial and municipal, with more than 10 subsystems. Here we show only the main architecture; we selected the critical paths and mirrored their traffic for monitoring.

[Image: 1.jpg, http://s5.51cto.com/wyfs02/M01/89/B9/wKioL1ga8-LynMnjAAGxJMDRpQ8947.jpg]

After deploying Uyun APM, we tracked provincial SIP signaling and compared the requests across multiple dimensions. We found that success rate and response time were clearly related to request volume: when request volume increased, the success rate dropped significantly and the response time rose sharply. The relationship between provincial SIP request volume, success rate, and response time is as follows:

[Image: 2.jpg, http://s3.51cto.com/wyfs02/M01/89/BC/wKiom1ga8-nAQSbCAAH9GWLlvrA956.jpg]
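The comparison described above can be approximated offline from captured signaling. The following is a minimal sketch, not the product's implementation: it assumes each SIP transaction has already been reduced to a start time, a final status code, and a response time (the `SipTransaction` record and `summarize_by_window` function are hypothetical names), and it treats 2xx final responses as success.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class SipTransaction:
    # Hypothetical record shape; a real capture tool will differ.
    start_s: float                 # request timestamp in seconds
    status: Optional[int]          # final SIP status code, None if no response arrived
    response_ms: Optional[float]   # time to final response, None if no response arrived


def summarize_by_window(transactions, window_s=60):
    """Group transactions into fixed time windows and report, per window,
    the request volume, the success rate, and the mean response time."""
    buckets = defaultdict(list)
    for t in transactions:
        buckets[int(t.start_s // window_s)].append(t)

    rows = []
    for window in sorted(buckets):
        items = buckets[window]
        succeeded = [t for t in items if t.status is not None and 200 <= t.status < 300]
        answered = [t.response_ms for t in items if t.response_ms is not None]
        rows.append({
            "window_start_s": window * window_s,
            "requests": len(items),
            "success_rate": len(succeeded) / len(items),
            # Mean over answered requests only; None when nothing was answered.
            "mean_response_ms": sum(answered) / len(answered) if answered else None,
        })
    return rows


if __name__ == "__main__":
    # Synthetic data for illustration only, not measurements from the project.
    sample = [
        SipTransaction(0, 200, 120.0),
        SipTransaction(10, 200, 150.0),
        SipTransaction(65, 200, 900.0),
        SipTransaction(70, 503, 2000.0),
        SipTransaction(80, None, None),
        SipTransaction(90, None, None),
    ]
    for row in summarize_by_window(sample):
        print(row)
```

Plotting `requests` against `success_rate` and `mean_response_ms` per window is one way to see whether the two degrade as volume rises.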

>>>> Single-transaction chain tracking: where is it slow?

Having found that responses were slow, we used the single-transaction tracing feature of Uyun APM to follow individual SIP requests and found a large number of errors and delays. We traced the root cause to the SIP server of one city: when the provincial level initiated a SIP call to that city, the city level returned an error and the call failed.

[Image: 3.jpg, http://s2.51cto.com/wyfs02/M01/89/B9/wKioL1ga8_PAU9iAAACzo28VcyM483.jpg]

Tracing a single camera video request, from the moment the provincial level initiates it until the response returns, shows that the city SIP server's response time is too long.
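The idea behind single-transaction tracing can be sketched as follows. This is an illustration under assumptions, not the tool's actual logic: the hop names and timings are hypothetical, the hops are assumed to be strictly nested (each hop waits on the next one), and a hop's own time is its total time minus the time spent waiting on the hop below it.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    # Hypothetical span record for one node on the call path.
    name: str
    sent_ms: float      # when this hop received the request
    replied_ms: float   # when this hop sent its reply back upstream


def slowest_hop(hops):
    """Given hops ordered from outermost to innermost, return the
    (name, own_ms) pair of the hop that spent the most time itself,
    excluding time spent waiting on the next hop."""
    totals = [h.replied_ms - h.sent_ms for h in hops]
    own = []
    for i, hop in enumerate(hops):
        downstream = totals[i + 1] if i + 1 < len(hops) else 0.0
        own.append((hop.name, totals[i] - downstream))
    return max(own, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Synthetic timings for illustration only.
    trace = [
        Hop("provincial platform", 0.0, 5200.0),
        Hop("provincial SIP server", 50.0, 5150.0),
        Hop("city SIP server", 100.0, 5100.0),
    ]
    print(slowest_hop(trace))
```

With these made-up timings, the two provincial hops each account for 100 ms of their own, and the city SIP server accounts for the remaining 5000 ms.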

>>>> Simulation analysis: why is it slow?

At this point the problem had essentially been narrowed down to the city-level SIP server. We polled more than 20,000 cameras at the city level. Only 4.4% of the requests succeeded; 9.6% returned a response, but it was an error; and 86% timed out with no response at all.

Monitoring of the commands handled by the municipal SIP server showed that its success rate and response time were also clearly related to request volume: when request volume rose, the success rate fell sharply and the response time rose sharply. Some responses took more than 1 minute.
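The three-way breakdown above (success, error response, timeout) can be computed from raw polling results along these lines. This is a minimal sketch: `classify` and `outcome_shares` are hypothetical helpers, a missing status (`None`) is assumed to mean the request timed out, and the sample data is synthetic, sized only so that its proportions match the percentages reported above.

```python
from collections import Counter


def classify(status):
    """Map one polling result to an outcome label.
    None means no response arrived before the timeout."""
    if status is None:
        return "timeout"
    if 200 <= status < 300:
        return "success"
    return "error"  # any other final response counts as an error here


def outcome_shares(statuses):
    """Return the percentage of each outcome, rounded to one decimal place."""
    counts = Counter(classify(s) for s in statuses)
    total = len(statuses)
    return {
        label: round(100 * counts[label] / total, 1)
        for label in ("success", "error", "timeout")
    }


if __name__ == "__main__":
    # Synthetic sample of 1000 results, not the project's actual data.
    results = [200] * 44 + [500] * 96 + [None] * 860
    print(outcome_shares(results))
```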

[Image: 4.jpg, http://s2.51cto.com/wyfs02/M02/89/BC/wKiom1ga8_zSlIsoAAHhk06xkiY264.jpg]

Why did the city server, when handling successive requests, respond to only some of them and then return errors continuously? We analyzed the timing and state of each response from the municipal SIP server and finally found that the server did not end requests correctly and release their resources, so it could not continue processing subsequent requests.

The incident was finally resolved, but the work of improving operations has only just begun. Customers' video systems are now generally built on large-scale virtualization and cloud platforms, so traditional operations and fixed-point monitoring can no longer fully cope with the current system structure. How can operations keep pace with the agile development of business systems? The Uyun operations solution tracks the user's software architecture quickly, agilely, and dynamically, which helps locate and solve problems effectively.
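The failure pattern described here, where the first requests succeed and everything after them fails, is typical of a fixed pool of resources that is never given back. The toy simulation below illustrates that pattern only; the internals of the real SIP server are not known from this article, and `SessionTable`, both handlers, and the use of 503 as the rejection code are assumptions made for the example.

```python
class SessionTable:
    """A fixed-capacity table of active sessions (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.active = set()

    def acquire(self, call_id):
        if len(self.active) >= self.capacity:
            return False  # no free slot left
        self.active.add(call_id)
        return True

    def release(self, call_id):
        self.active.discard(call_id)


def handle_leaky(table, call_id):
    """Buggy handler: takes a slot but never releases it."""
    if not table.acquire(call_id):
        return 503  # assumed rejection code for this sketch
    return 200      # request finishes, slot stays occupied


def handle_with_release(table, call_id):
    """Fixed handler: always releases the slot when the request ends."""
    if not table.acquire(call_id):
        return 503
    try:
        return 200
    finally:
        table.release(call_id)


def run(handler, requests=10, capacity=3):
    """Send sequential requests through one handler and collect status codes."""
    table = SessionTable(capacity)
    return [handler(table, f"call-{i}") for i in range(requests)]


if __name__ == "__main__":
    print("leaky:", run(handle_leaky))
    print("fixed:", run(handle_with_release))
```

In the leaky variant only the first `capacity` requests succeed and every later one is rejected, which mirrors the symptom observed on the municipal server; releasing in a `finally` block keeps the table from filling up.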

Liu Chengmu

• Senior architect at Uyun Software

• More than 10 years of experience developing IT operations and maintenance management software

• Mainly engaged in research and development of application performance management

This article is from the "Uyun Dual-State Operations" blog. Please keep this source attribution: http://uyunopss.blog.51cto.com/12240346/1869006
