Some thoughts on operations and maintenance management

Murphy's Law for operations: read the following every day and examine yourself against it.
1. Nothing is as simple as it looks.
2. Everything takes longer than you expect.
3. Anything that can go wrong will go wrong.
4. If you worry that something might happen, it is more likely to happen.
5. If you succeed on the first try, you have obviously done something wrong.
6. When everything is going in one direction, take a hard look in the opposite direction.
7. Problems that disappear on their own will come back on their own.
8. If everyone thinks alike, obviously no one is thinking seriously.
9. A good start may not lead to a good result; a bad start tends to lead to a worse one.
10. Always assume that your assumptions are invalid.
11. Intelligence cannot be acquired through education.

    This article does not discuss specific technologies or procedures. It is about how to reduce human error, avoid unknown risks, and develop practical processes. At work, managers often say that "there is no small matter in operations and maintenance": a small operating error can cause a huge loss. What operations staff need above all is to be careful, careful, and careful again.

    For operations staff, reputation within the company is the foundation of everything. It is genuinely hard for operations to stand out: between sudden failures, technical support requests from every department, and the large cost of servers, finding bright spots in the work is not easy. Treat your name as a brand to be managed. In daily work we handle many troublesome matters and have to communicate with many departments, so how you manage yourself within the company becomes very important. Only with a good reputation, one that makes your importance visible and puts you in an unassailable position, do you have the capital to advance. So broad technical study is not the only thing that matters; communication matters just as much. Sometimes we solve a problem but communicate poorly, and the work never turns into a recognized result. Other times we cannot solve a problem but communicate well, and still earn approval from others. Every task should have an outcome and every communication a follow-up; in short, finish what you start.

Operational objectives: security, stability, efficiency, savings
    Security: operations must put security first. Security vulnerabilities and information leaks affect the company's future development and even its survival. Information leaks at Internet companies have done those companies serious damage, and the cost of repairing that damage is very high. So security is the priority.
    Stability: once security is in place, keeping the business running stably is what we ops people must take seriously. System stability directly affects the user experience. Its importance is self-evident, so I will not repeat it here.
    Efficiency: use every resource efficiently to get the most value out of it.
    Savings: hardware is the largest item in the company's spending, so how to reduce hardware cost is worth our attention. We may not be able to earn money for the company, but we can save it money.

Process management
    Processes are a necessary part of our work. There are many of them, but few are actually followed strictly. I suspect everyone will smile knowingly at this: many processes are used mainly to settle scores afterwards. When your work goes wrong, management pulls out the process document and questions you against it. Management is hardly to blame, because many of these processes were drafted by us. So when we write a process we should think it through more carefully, consider whether it is feasible in practice, and also make sure management can accept it. So what makes a good process? Here is a short story. A famous architect had designed Disneyland, and after three years of careful construction it was about to open to the public. However, the final plan for the paths connecting the attractions had still not been decided. The master had the construction team sow grass seed across the grounds and open the park early. The grass grew, the park opened, and visitors were free to walk on it. Over the first six months, visitors wore many paths into the lawn, some wide and some narrow, all elegant and natural. The master then had sidewalks laid along those trodden paths. In the end, the master won a world award for this path design. The lesson for us is that a good process follows the way people actually work.

Daily operations
    In operations, routine server maintenance is very frequent, so keeping good records is essential. Repetitive tasks should be turned into templates and procedural tasks should be automated, which greatly reduces the probability of error.
    For special operations, write down the steps before you start, the more detailed the better. Do not log on to a server and act on a vague idea in your head. Having a clear goal and anticipating each step in advance greatly reduces the probability of error. Be sure to record the results once the operation is complete.
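    The habit described above (write the steps first, then record each result) can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `Step` and `OperationPlan` and the example steps are hypothetical, and the sketch only keeps the written record; it does not run anything on a server.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Step:
    description: str       # what will be done, written before the operation
    expected: str          # what the operator expects to observe
    result: str = ""       # filled in after the step has been carried out
    finished_at: str = ""  # UTC timestamp of when the result was recorded


@dataclass
class OperationPlan:
    goal: str
    steps: list = field(default_factory=list)

    def add_step(self, description: str, expected: str) -> None:
        self.steps.append(Step(description, expected))

    def record(self, index: int, result: str) -> None:
        """Record the outcome of step `index` (0-based); steps must be recorded in order."""
        if not 0 <= index < len(self.steps):
            raise IndexError("no such step")
        if any(not s.result for s in self.steps[:index]):
            raise ValueError("earlier steps have no recorded result yet")
        step = self.steps[index]
        step.result = result
        step.finished_at = datetime.now(timezone.utc).isoformat(timespec="seconds")

    def is_complete(self) -> bool:
        return bool(self.steps) and all(s.result for s in self.steps)

    def render(self) -> str:
        """Return the plan as plain text, suitable for a ticket or a wiki page."""
        lines = [f"Goal: {self.goal}"]
        for i, s in enumerate(self.steps, start=1):
            lines.append(f"{i}. {s.description}")
            lines.append(f"   expected: {s.expected}")
            lines.append(f"   result:   {s.result or '(not recorded)'}")
        return "\n".join(lines)


if __name__ == "__main__":
    plan = OperationPlan("Replace the configuration file on one server")
    plan.add_step("Back up the current configuration", "backup file exists")
    plan.add_step("Install the new configuration and reload", "service reports healthy")
    plan.record(0, "backup created")
    print(plan.render())
```

    Refusing to record a step before the earlier ones are recorded is a design choice: it nudges the operator to follow the written order rather than improvise.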

Fault handling
    Handling faults is routine for operations, and the speed and method of handling them are an important measure of every ops engineer's ability. The more experience you have, the faster and more accurate your handling will be, and experience here includes skill with search engines. In my opinion, intuition is also very important. Some failures come with obvious clues, but sometimes the log messages are vague, and intuition lets you cut through the fog to the quickest fix. How do you improve your intuition? Intuition comes from experience, and experience comes from constant self-study and experimentation. Do not run away from problems: if you avoid them, you will find it hard to accumulate experience.
    I also want to mention the reply email sent after a problem is handled. Since we are running ourselves as a brand, what we hand over should be a product. What makes a good product? It should be complete, beyond reproach, and leave people feeling reassured. So the reply should include the following: the resolution, the cause of the problem, how the problem was solved, potential problems in the future, suggestions, and so on.
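    One way to make sure no part of that reply is forgotten is to turn the list into a template. The sketch below is a hypothetical illustration, not a prescribed format: the class name `IncidentReply`, the field names, and the section titles are assumptions chosen to mirror the items listed above.

```python
from dataclasses import dataclass


@dataclass
class IncidentReply:
    resolution: str    # problem resolution result
    cause: str         # cause of the problem
    process: str       # how the problem was solved
    future_risks: str  # potential problems in the future
    suggestions: str   # suggestions

    def render(self) -> str:
        """Return the reply body, refusing to render if any section is empty."""
        sections = [
            ("Resolution", self.resolution),
            ("Cause", self.cause),
            ("How it was solved", self.process),
            ("Potential future problems", self.future_risks),
            ("Suggestions", self.suggestions),
        ]
        missing = [title for title, body in sections if not body.strip()]
        if missing:
            raise ValueError("missing sections: " + ", ".join(missing))
        return "\n\n".join(f"{title}:\n{body.strip()}" for title, body in sections)
```

    Raising an error on an empty section is deliberate: an incomplete reply is exactly what the template is meant to prevent.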

Using technology to reduce human error
    People will always make mistakes. For operations, the best way to reduce the probability of mistakes is to solve the problem with technology, for example by replacing free-form command lines with a fixed set of selectable operations and by adding an approval step. This requires us to improve the automated operations platform so that ops staff no longer need to log on to servers to work, and so that every step is audited, fault-tolerant, and recorded. This can greatly reduce human error.
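    The idea of selectable operations, approval, and auditing can be sketched as a small gate that every request passes through. Everything here is an assumption made for illustration: the catalogue `ALLOWED_OPERATIONS`, the operation names, and the class `OperationGate` are hypothetical, and the sketch only decides and records; it does not execute anything.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical catalogue: operation name -> whether a second person must approve it.
ALLOWED_OPERATIONS = {
    "restart_service": True,
    "show_disk_usage": False,
}


@dataclass
class OperationGate:
    audit_log: list = field(default_factory=list)

    def request(self, operator: str, operation: str,
                approved_by: Optional[str] = None) -> bool:
        """Decide whether `operation` may run, and record the decision either way."""
        needs_approval = ALLOWED_OPERATIONS.get(operation)
        if needs_approval is None:
            decision = "rejected: not in catalogue"
        elif needs_approval and approved_by is None:
            decision = "rejected: approval required"
        elif needs_approval and approved_by == operator:
            decision = "rejected: operator cannot approve own request"
        else:
            decision = "allowed"
        # Every request is logged, including rejected ones.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "operator": operator,
            "operation": operation,
            "approved_by": approved_by,
            "decision": decision,
        })
        return decision == "allowed"


if __name__ == "__main__":
    gate = OperationGate()
    gate.request("alice", "show_disk_usage")
    gate.request("alice", "restart_service")
    gate.request("alice", "restart_service", approved_by="bob")
    for entry in gate.audit_log:
        print(entry["operation"], "->", entry["decision"])
```

    Logging rejected requests as well as allowed ones keeps the audit trail complete, which is the point of moving operations off the server and onto a platform.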

