A story from right in front of you: prepare for the worst, and things often will not come to the worst (2)

Source: Internet
Author: User

Continued from "A story from right in front of you: prepare for the worst, and things often will not come to the worst (1)"

The following happened on the morning of September 22

On Monday morning, the first morning after system E's launch, before the branch offices had started work, I called Project Manager A and asked whether the service personnel were in place. The answer was:

  1. The project team arrived at the customer's office at seven o'clock and used the hour before the customer started work at eight to run a final test of the production system;
  2. The system has been running normally since then;
  3. The project team has been divided into three groups: colleague CDW is responsible for system health monitoring, colleague 'q1' is responsible for answering questions on QQ or by phone, and another colleague is responsible for developing the subsequent function points.

After learning this, I asked the project team to send me the health monitoring data for system E, mainly its current throughput, that is, the average number of requests completed per second. Project Manager A said he would send me a text message once they had the monitoring data.

11:01 I received the following text message from colleague CDW: up to now, the number of sessions in the system is 126, the maximum number of sessions is 162, and the total number of sessions is 341.

11:05 I received a call from colleague CDW. I was busy and said I would call back later.

I called Project Manager A and told him that what they had sent me was the number of sessions in the system. What we actually care about most is the system's throughput and concurrent access volume, that is, the number of requests completed per second. From that figure you can see how heavily the system is being accessed and, with experience, judge how busy it is and whether it is operating normally. Project Manager A said that colleague CDW was monitoring through the console and was not very familiar with how to view these indicators there. My answer was: OK, then take a good look through the WebLogic console.
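To make the two indicators concrete, here is a minimal sketch of how throughput and concurrent access volume could be computed from a list of request start and end times. It does not use the WebLogic console or any WebLogic API; the `Request` record, the function names, and the sample data are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    start: float  # seconds since the start of the observation window
    end: float    # completion time on the same clock


def throughput(requests, window_start, window_end):
    """Average number of requests completed per second inside the window."""
    completed = sum(1 for r in requests if window_start <= r.end < window_end)
    return completed / (window_end - window_start)


def peak_concurrency(requests):
    """Largest number of requests in flight at the same moment."""
    events = []
    for r in requests:
        events.append((r.start, 1))   # a request begins
        events.append((r.end, -1))    # a request finishes
    # At equal timestamps, process ends before starts so that
    # back-to-back requests are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    in_flight = peak = 0
    for _, delta in events:
        in_flight += delta
        peak = max(peak, in_flight)
    return peak


# Made-up sample: four requests observed over a 2-second window.
sample = [
    Request(0.0, 0.4),
    Request(0.1, 0.9),
    Request(0.3, 1.2),
    Request(1.2, 1.5),
]
print(throughput(sample, 0.0, 2.0))  # 2.0 requests per second
print(peak_concurrency(sample))      # 3
```

Session counts such as 126 or 341 say how many users are connected, not how much work the system is finishing, which is why the two figures above are the more useful health signal.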

I also told the project manager that, besides the technical perspective, the system's throughput can be viewed from the business perspective. For example, compare the number of tickets created and the number of tickets processed this morning with the same period on the old system before the launch. What is the period-over-period trend? That data can also indicate the running status of the system.
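The business-side comparison can be sketched in a few lines. This is only an illustration: the function name and every ticket count below are hypothetical and do not come from the project.

```python
def period_over_period(current, previous):
    """Relative change of `current` against `previous`, as a fraction."""
    if previous == 0:
        raise ValueError("previous period has no data to compare against")
    return (current - previous) / previous


# Hypothetical counts for one morning; none of these figures come from the story.
old_system = {"tickets_created": 400, "tickets_processed": 380}
new_system = {"tickets_created": 420, "tickets_processed": 304}

for metric, before in old_system.items():
    change = period_over_period(new_system[metric], before)
    print(f"{metric}: {change:+.1%}")
```

With numbers like these, tickets are still being created at the usual rate while far fewer are being processed, which would suggest the new system is slowing users down even if it has not failed outright.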

Project Manager A replied by phone that he would let me know once they had collected the data, and that the system was still running normally.

 

The following happened on the afternoon of September 22, up to 4:30 pm

Mondays are always busy. Considering that system E had only just launched and the project team still had service work in progress, I waited until four o'clock in the afternoon to ask Project Manager A how the system was running. The answer was: in the morning, after my phone call, at around 12 o'clock, the system went down again! Our monitoring staff discovered it. It happened to be the noon break, so we restarted the service right away!

When I heard this from Project Manager A, I scolded him:

  1. Why am I hearing about such a major fault only four hours after it happened, and only because I phoned to ask? However busy we are, do you really mean to hide the problem? This is boiling a frog in warm water! #$%@#!$
  2. If users had complained about us in the meantime, we would not even have known. How passive a position would that have put us in!
  3. The owner has the right to know. Admittedly, there were few users during the lunch break, our monitoring caught it in time, and our response to the fault was fast! However, we cannot guarantee there will be no next time, and the next time will not necessarily fall in a quiet period for the system. Therefore, we must inform the owner that this situation exists and that we are monitoring it closely!
  4. The four-hour delay proves that the company has not yet defined and implemented a fault escalation process. Project Manager A's leader Z knew, but Z's leaders W and F did not. How can we promptly mobilize resources for crisis management that way!
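A fault escalation process of the kind point 4 calls for can be written down as a simple time-based rule. The sketch below is only an assumption about what such a rule might look like: the roles follow the story, but the ladder structure, the function name, and every time limit are invented for illustration.

```python
from datetime import timedelta

# Hypothetical escalation ladder. The roles follow the story (Project
# Manager A, his leader Z, then leaders W and F), but the time limits
# are invented purely for illustration.
ESCALATION_LADDER = [
    (timedelta(minutes=0), ["Project Manager A"]),
    (timedelta(minutes=15), ["Leader Z"]),
    (timedelta(minutes=30), ["Leader W", "Leader F"]),
]


def who_must_know(elapsed):
    """Everyone who should have been notified once `elapsed` has passed
    since an unresolved fault was detected."""
    notified = []
    for threshold, people in ESCALATION_LADDER:
        if elapsed >= threshold:
            notified.extend(people)
    return notified


print(who_must_know(timedelta(minutes=10)))
print(who_must_know(timedelta(hours=4)))
```

Under a rule like this, a fault still open after four hours would already have reached every level, instead of stopping at leader Z.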

After my scolding, Project Manager A said that his leader Z had notified colleague dy of the architecture group at around half past two, but the problem had not been resolved yet! I said that this had to be treated as a top priority and that I would immediately mobilize resources to solve it. In addition, monitoring of the service health was to continue, and Project Manager A needed to write a report informing the owner, however hard that was to do. (After the event, it turned out that Project Manager A did not tell the owner that day, but he was lucky!)

After talking to A, I called dy's leader F, who said he had only just heard about this from dy. I said that because the fault had not been escalated to us in time, we had in fact missed the best window for tracking and handling it. Given the situation, I was now going to the company to handle the matter as soon as possible. Today we can only count ourselves "lucky": the fault happened outside busy business hours and we were prepared to recover in time. But we cannot take it lightly, because next time we may not have such luck!

 

(Supplement: the week before National Day was spent in one busy cycle after another. I am now piecing this together from memory and from the text messages and call records on my phone, hoping to reconstruct the whole story. By the way, we can tell you that there actually is news in the world!)

 

 
