[O&M personnel, why? 04] Ten lessons from O&M work for escaping failures

Source: Internet
Author: User
Previous in this series: [O&M personnel, why? 03] 20 built-in monitoring tools for Linux: w and ps http://www.2cto.com/os/201303/199396.html

Failures are a constant pain point for DBAs and O&M personnel. The principles for avoiding them are much the same everywhere, and I would like to share the following ones with you.

(1) Every change needs a rollback plan, and should be tested in an identical environment first.
As the Buddha said, every trauma is a kind of maturity. That is a true portrait of O&M personnel. In a sense, O&M is a discipline of experience, a discipline of trial and error. When you do something that has never been done before, always preserve the current state and give the change a way back.

(2) Be cautious with destructive operations.
What counts as destructive? In Oracle, for example, the statements truncate table_name, delete table_name, and drop table_name are all too easy to execute. But remember: even if the data can be recovered, the cost is very high! In Linux, rm -r deletes everything in the target directory and its subdirectories. Most people who have lived through such a failure give rm an alias:

    alias rm='rm -i'

cp and mv have the same option:

    alias cp='cp -i'
    alias mv='mv -i'

(3) Set up the command prompt so that it tells you where you are before you operate.
Are you on the master or the slave database? In which directory? Which schema? Which session? At what time? For Oracle, for example:

    idle> set sqlprompt 'RAC-node1-primary@10g> '
    RAC-node1-primary@10g>

You can also put this setting in the SQL*Plus glogin file. In Linux with bash, you can set PS1 so that the prompt shows the current directory, the login user name, and the host.
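The bash prompt described in point (3) might be set up along the following lines. This is a minimal sketch, not the author's actual configuration: the DB_ROLE variable and its value are hypothetical labels that you would set per host, while \u, \h, \w and \t are the standard bash prompt escapes for user, host, working directory and time.

```shell
# Hypothetical role label for this host; set it per machine
# (for example PRIMARY on the master, STANDBY on the slave).
DB_ROLE="PRIMARY"

# \u = login user, \h = host name, \w = current directory, \t = time (HH:MM:SS).
# The escapes are single-quoted so that bash expands them each time the prompt
# is drawn, not when this line is read.
PS1="[${DB_ROLE}] "'\u@\h:\w \t\$ '
export PS1
```

Placed in ~/.bashrc, this shows the role, user, host, directory and time in front of every command, so the question "where am I?" is answered before anything is typed.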
(4) Back up, and verify that the backup is valid.
What if a machine goes down one day, planned or unplanned? Back up!!! Backup is a broad subject. Depending on the dimension, backups can be cold or hot, real-time or non-real-time, physical or logical. OLTP online services and databases need a real-time hot standby. But what if a developer runs a delete without any condition and wipes all the data by mistake? A real-time standby replicates that mistake as well, so in addition to the real-time copy you also need a non-real-time backup to recover the database from a logical error. Is that enough? No! You still have to verify that the backup is valid. All too often, having a backup does not guarantee 100% recovery. The simplest verification is to find an empty database and restore the backup into it.

(5) Always keep a sense of reverence for the production environment.
The same is taught to accounting staff in their professional ethics training. It should also be the first quality O&M personnel possess when they enter the industry. For example: for Oracle, run an RDA to inspect the health of the database; for Linux, check whether password aging is in place and whether the machine is isolated from the Internet.

(6) Handovers and vacations are when things go wrong most easily; take over other people's changes with caution.
Confirm the change plan, and then confirm it again. If you do not understand what a change is meant to do, you cannot do it well.
Before you go on vacation, prepare a handover document. It should state who takes over the work and whom to contact under which circumstances. When you cover for others during their vacation, postpone whatever can be postponed. For whatever must be executed, confirm the operation details with the original O&M person.

(7) Set up alarms to get error information in time; set up performance monitoring to understand history and forecast trends.
Alarms and monitoring are the tools O&M personnel survive by. Alarms let you know in time what exceptions occur in the system, so that you can follow up promptly and eliminate faults in the cradle. Monitoring lets you understand the historical performance of the system, learn from the past, see where the trend is heading, and optimize early.

(8) Be cautious with switchover.
Take Oracle's HA solution, Data Guard, as an example. The master database committed an order, and then a switchover took place before that order had been synchronized to the slave database. The seller lost a sales order and the customer, and the company suffered a loss.

(9) Be careful, be a little paranoid: check, check, and check again.
Look at how someone else does it:
① Before he makes a change, he sends an email one or two weeks in advance and notifies the relevant people by phone.
② He writes the script on a test machine and calls a review of the operation steps and the script.
③ He copies the script to the production environment.
④ He logs on to the target machine and opens and closes the script, then opens and closes it again.
⑤ He confirms with the relevant people that the operation, sequence, time point, possible impact, and rollback are all ready.
⑥ Before execution he logs out of the machine, logs on again, and opens and closes the script once more.
⑦ Finally, he runs the script in the background, logs on in another window, and uses ps to check the results at any time. Throughout, his posture is correct, his breathing is even, and his gaze is solemn.
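The last step, running the script in the background and watching it from a second window, might look like the following sketch. The demo script, its contents, and the file names are stand-ins invented for illustration; nohup, ps, tail and wait are standard commands.

```shell
#!/usr/bin/env bash
# Scratch directory so the demo does not touch real files.
WORKDIR=$(mktemp -d)

# Stand-in for the reviewed change script (hypothetical name and contents).
cat > "$WORKDIR/change_demo.sh" <<'EOF'
echo "step 1 done"
sleep 1
echo "step 2 done"
EOF

# Run in the background, immune to hangups, with all output kept in a log.
nohup sh "$WORKDIR/change_demo.sh" > "$WORKDIR/change_demo.log" 2>&1 &
CHANGE_PID=$!

# In the second window you would run these by hand while the change runs:
ps -p "$CHANGE_PID" -o pid,etime,args   # is it still running, and for how long?
tail -n 5 "$WORKDIR/change_demo.log"    # what has it printed so far?

# Block until the script finishes and record its exit status.
wait "$CHANGE_PID"
CHANGE_STATUS=$?
echo "change script exited with status $CHANGE_STATUS"
```

Keeping the output in a log file means the result survives a dropped session, and the recorded exit status gives a clear answer to whether the rollback plan is needed.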
The operator is not tired, but the learner is.

(10) Keep it simple.
This has something of a Zen flavor, and it coincides with the philosophy of GNU/Linux. We are always faced with temptations: new system architectures, newer and smarter commands and tools, the latest hardware platforms, more fully featured HA software... You can install them and try them out. But if you want to use them in a production environment, think twice!! If the system's built-in commands can do the job, there is no need to consider anything else. If a script you write yourself can do the job, there is no need to look for feature-rich software. The built-in character interface of Linux is more concise and convenient than those complex graphical interfaces...

Finally, I wish you smooth operation and maintenance work. >_<
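As a postscript to points (7) and (10): a basic alarm can be built from built-in commands alone. The sketch below flags filesystems above a usage threshold by parsing df -P output with awk. The function name check_disk_usage and the threshold of 90 are assumptions for illustration, and how the alert is delivered is left out.

```shell
#!/usr/bin/env bash
# Print one ALERT line for every filesystem whose usage exceeds the limit.
# Reads "df -P" style output on stdin, so it can also be fed canned data.
# Assumes device names and mount points contain no spaces.
check_disk_usage() {
    awk -v limit="$1" 'NR > 1 {
        use = $5               # fifth column is the capacity, e.g. "95%"
        sub(/%/, "", use)      # strip the percent sign
        if (use + 0 > limit)
            printf "ALERT %s at %s%% (limit %s%%)\n", $6, use, limit
    }'
}

# Typical use, for example from cron (90 is an arbitrary example threshold):
df -P | check_disk_usage 90
```

The -P option asks df for the POSIX output format, which keeps each filesystem on one line and makes the column positions predictable.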