Over the three-day holiday I had time to think back on an optimization I worked on. The result was good, but the process was tortuous: in a slightly larger company, everything has to be backed by evidence. Straight to the point.
First, background:
As ticket volume has grown, so have customer calls, and outbound calls swell rapidly, especially when the weather is bad. The current back-office system has been through three years of ups and downs, and the service center staff regularly report that it frequently lags or becomes unavailable during business peaks.
Second, why the back-office system lags. After talking to people and analyzing the situation, I felt there were three main suspects:
1. Code: work that could run asynchronously is called synchronously, which makes the overall page load time long.
2. Too many features: the site has somewhere between 800 and 1,000 pages, and an undetected infinite loop in the code could be driving server CPU up and exhausting memory.
3. Network: because of its large headcount, the service center is not in the company's headquarters building but in a location with cheaper rent. Investment in the service center has so far been modest, so its network bandwidth has some problems.
Third, thinking through the suspects:
For the first suspect: from a programmer's point of view it is a fairly strong candidate. However, most back-office functions are report pages, which have no real need for async and whose load time would not actually improve with it. Some suspicion remains.
For the second suspect: I followed up several times and found that while the service center was experiencing lag, server CPU and memory showed nothing unusual over the same period. This can be ruled out conclusively.
For the third suspect: I went to the service center to experience it first-hand. When pages there stuttered or became unavailable, headquarters saw nothing unusual. The operations staff at headquarters also use the back-office system, and they were not affected. This is the prime suspect.
Fourth, how to tackle the problem:
Starting with the prime suspect, I contacted the ops team and asked them to go on site and check whether the service center network had a problem. The ops guy agreed verbally but took no practical action. I was a little unhappy at the time, but there was nothing I could do: I have no authority over them, and asking for help this way just lets things drag on.
In a company with a heavily layered organization, getting something done requires evidence and a clear reason. You then take it to the leaders on the other side, whose cooperation you need, and they decide whether it gets done.
The first step: what I needed was evidence.
So I set up a monitoring system that had to do a few things for me:
1. Track the elapsed time and the origin of every request
2. Allow the network at headquarters to be compared with the network at the service center
3. Measure elapsed time from initiating the request until the response stream has been received
4. Match the client-side response time of each request with its server-side response time
Fifth, getting to work:
1. How the monitoring system collects data:
A: I wrote a small Windows service and installed it on each machine that needed to collect data. The service starts automatically at boot and simulates a page request every 100 seconds (five pages chosen from those that customer service reported as commonly used). It logs the elapsed time from initiating the request until the whole response stream has been received.
Each client measurement has to be matched with its server-side counterpart, and requests from the service center have to be distinguished from those from headquarters machines. So when a request is initiated, it carries to the server its send time, a unique ID, and a flag identifying the service center or headquarters.
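The client-side probe described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the Windows service I actually ran; the header names (`X-Probe-Id`, `X-Probe-Site`, `X-Probe-Sent-At`) and the `probe` function are hypothetical, since all that matters is that the ID, site flag, and send time travel with the request.

```python
import time
import uuid
import urllib.request


def probe(url, site, opener=urllib.request.urlopen, clock=time.perf_counter):
    """Request one page and time it from send until the last byte is read."""
    request_id = uuid.uuid4().hex
    req = urllib.request.Request(url, headers={
        # Hypothetical header names, chosen only for this sketch.
        "X-Probe-Id": request_id,
        "X-Probe-Site": site,              # e.g. "service-center" or "headquarters"
        "X-Probe-Sent-At": str(time.time()),
    })
    started = clock()
    with opener(req) as resp:
        body = resp.read()  # drain the whole stream so the timing covers the full download
    elapsed = clock() - started
    return {
        "id": request_id,
        "site": site,
        "url": url,
        "client_seconds": elapsed,
        "bytes": len(body),
    }
```

A scheduler would call `probe` for each of the five chosen pages and then sleep for 100 seconds before the next round. The `opener` and `clock` parameters exist only so the function can be exercised without a real network.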
At the end of the request, the server saves the unique ID, its load time, and the site flag to a Redis server, and the data eventually lands in the database. Once the request has completed, the Windows service sends the data collected on the client through an interface to Redis as well, and it too eventually lands in the database.
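Once both sides have written their records, pairing them up is a join on the unique ID. The sketch below assumes each record is a plain dict; the field names (`client_seconds`, `server_seconds`, `outside_server_seconds`) are illustrative and not the actual schema.

```python
def match_timings(client_records, server_records):
    """Join client and server measurements on the shared request id."""
    server_by_id = {rec["id"]: rec for rec in server_records}
    matched = []
    for client in client_records:
        server = server_by_id.get(client["id"])
        if server is None:
            continue  # the server-side record has not arrived (or was lost)
        matched.append({
            "id": client["id"],
            "site": client["site"],
            "client_seconds": client["client_seconds"],
            "server_seconds": server["server_seconds"],
            # Time not spent inside the server: network transfer plus client overhead.
            "outside_server_seconds": client["client_seconds"] - server["server_seconds"],
        })
    return matched
```

The difference between the two timings is the part the server cannot be blamed for, which is exactly the figure needed to argue about the network.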
2. How to monitor the data in real time once it has been collected:
A: The collected data needs to be displayed. For this I used the Highcharts JS chart library, which presents comparisons clearly.
In my experience Highcharts is very convenient to use, with flexible data binding and a wide variety of presentation formats. I have nothing but praise for it.
Most importantly, the comparison between the different data sources is very intuitive. It deserves a standing ovation.
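As a rough sketch of how matched timings could be fed to Highcharts, the helper below builds an options object with one line series per site and metric. The row fields (`site`, `timestamp`, `client_seconds`, `server_seconds`) and the function names are assumptions made for this example; the `chart`, `title`, `xAxis`, `yAxis`, and `series` keys follow the usual Highcharts options layout.

```python
import json
from collections import defaultdict


def build_chart_options(rows):
    """Build a Highcharts options dict: one series per site and metric."""
    series = defaultdict(list)
    for row in rows:
        x = int(row["timestamp"] * 1000)  # datetime axes expect milliseconds
        series[f'{row["site"]} client'].append([x, row["client_seconds"]])
        series[f'{row["site"]} server'].append([x, row["server_seconds"]])
    return {
        "chart": {"type": "line"},
        "title": {"text": "Page load time: client vs server"},
        "xAxis": {"type": "datetime"},
        "yAxis": {"title": {"text": "Seconds"}},
        "series": [
            {"name": name, "data": sorted(points)}
            for name, points in series.items()
        ],
    }


def options_as_json(rows):
    """Serialize the options so a page can hand them to Highcharts."""
    return json.dumps(build_chart_options(rows))
```

On the page, the resulting JSON would be passed to `Highcharts.chart('container', options)`, where `container` is whatever element ID the page uses.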
Sixth, postscript:
Once the monitoring system was up, everything became clear, and so did the optimization plan. During business peaks at the service center, client requests took about 4 s to come back, while server-side time held steady at about 1 s. Over the same period, clients at headquarters loaded in about 2 s, with server-side time also at about 1 s.
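The comparison comes down to simple arithmetic per site: average client time minus average server time. The sketch below uses the rounded figures quoted above as sample input, not the raw measurements, and the field names are illustrative.

```python
from statistics import mean


def summarize_by_site(rows):
    """Average client and server time per site, plus the gap between them."""
    by_site = {}
    for row in rows:
        by_site.setdefault(row["site"], []).append(row)
    summary = {}
    for site, site_rows in by_site.items():
        client = mean(r["client_seconds"] for r in site_rows)
        server = mean(r["server_seconds"] for r in site_rows)
        summary[site] = {
            "client": client,
            "server": server,
            "outside_server": client - server,  # time spent outside the server
        }
    return summary


# Rounded peak-hour figures from the text above, used only as sample input.
sample = [
    {"site": "service-center", "client_seconds": 4.0, "server_seconds": 1.0},
    {"site": "headquarters", "client_seconds": 2.0, "server_seconds": 1.0},
]
for site, stats in summarize_by_site(sample).items():
    print(site, stats["outside_server"])
# service-center 3.0
# headquarters 1.0
```

With the server taking the same time for both sites, roughly 3 s versus 1 s outside the server points at the service center's network rather than the code.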
I emailed the leadership the current status and the optimization plan and got approval. The ops guys then acted quickly: load capacity was soon added, and the network was adjusted as well.
Seventh, reflections:
1. A problem that seems hard to push forward can be settled decisively once you find the pain point.
2. Controlling variables is always the most robust and convincing way to approach a problem.
3. Imagining is no substitute for doing. Once you actually start, things are not as hard as they seemed.
Optimizing a back-office service site used by 500 people