Responsibilities of the database administrator

Source: Internet
Author: User
Responsibilities and objectives of Database Administrators (1) Responsibilities (2) ensure database security and stable operation 1. Routine Database Operation and Maintenance and Management 1.1 daily database health inspection. 1.2 analyze database performance trends and collect statistics on key performance indicators. 1.3 tracking and trend analysis of important database SQL changes. 1.4 regular (by week,

Responsibilities and objectives of Database Administrators (1) Responsibilities (2) ensure database security and stable operation 1. Routine Database Operation and Maintenance and Management 1.1 daily database health inspection. 1.2 analyze database performance trends and collect statistics on key performance indicators. 1.3 tracking and trend analysis of important database SQL changes. 1.4 regular (by week,

Responsibilities of the database administrator

Objectives

Responsibilities (1)

Responsibilities (II)

Ensure database security and stable operation

1. Routine database operation

Maintenance and Management

1.1 daily database health inspection.

1.2 analyze database performance trends and collect statistics on key performance indicators.

1.3 tracking and trend analysis of important database SQL changes.

1.4 analyze the overall database operation situation on a regular basis (by week or month) and prepare weekly and monthly running reports.

1.5 analyze data growth and audit storage usage.

1.6 handle database faults.

1.7 process data extraction requirements such as audit unload (monthly, quarterly, and yearly) and data query requirements.

Adjustments or changes made to improve database performance

2. database performance optimization

2.1 propose optimization strategies for OS Resource Consumption

2.2 submit the optimization solution for database parameters based on the performance data collected daily and the vendor's suggestions

2.3 By monitoring the data volume change trend, tracking the execution of key SQL statements, and proposing suggestions for restructuring and rebuilding related tables and Indexes

2.4 provide suggestions for the development of hotspot SQL adjustments based on daily trend analysis

2.5 conduct special analysis on exceptions and wait and submit Optimization Solutions

2.6 submit performance improvement suggestions based on changes in various database Loads

2.7 propose adjustment suggestions by analyzing the usage trend of UNDO and TEMP tablespaces

2.8 analyze various types of monitoring alarms and propose optimization suggestions

Other adjustments not aimed at improving database performance

3. Database adjustment and change

3.1 analyze change requirements and communicate with development engineers (hosts and applications) for confirmation.

3.2 sort out change operation steps and implement changes

3.3 emergency handling of errors and exceptions in changes

3.5 confirm the change rollback plan and emergency strategy, and communicate with the manufacturer and industry experts for review

3.6 After the system runs stably, compare and analyze the database running status before and after the change, and supplement and improve the change plan

3.7 adjust daily inspection, performance data collection, emergency database synchronization, system backup and other policies based on system changes after the change, and handle the change if necessary

New System Database Installation

4. New System Database

Installation and debugging

4.1 Database Planning

4.2 database installation and debugging

4.3 Add the system to the daily monitoring Scope

Database Administrator job responsibilities

1. database health status query

Check item 01: Operating system resource status (cpu, memory, IO, network, file system)

Role and significance:During database operation, the resource usage of the operating system directly affects the stability and performance of the database. By monitoring the resource usage of the operating system, you can know the pressure of the database server in a timely manner, so as to further analyze the problems encountered during the database operation.

OS Health Check

Fill in the CPU, memory, disk I/O status, network status, and other information in the morning and afternoon peak check conditions. For CPU, memory, and disk I/O record values, for network conditions, the record is normal or abnormal.

Detect CPU usage

We can see that the load for 1 minute, 5 minutes, and 15 minutes is 0, and the system is healthy.

Corresponding metrics:Cpu, memory, disk I/O status, network conditions

Cpu usage less than 50%

The memory usage is lower than 90%. If it is higher than 90%, you can view the virtual memory usage, as shown in figure

If there is no obvious Memory Page change, it is also considered normal.

Disk IO status,

Network condition: the network traffic is relatively stable.

File System usage: less than 80%

Detect memory status

As you can see, although the memory usage is more than 90%, the SWAP partition is not occupied, so it is a normal business situation.

Detect the system by using the vmstat command

Linux memory monitoring vmstat command output is divided into six parts: (reference: http://blog.csdn.net/lengyuhong/article/details/5855056)

1. Process procs:

R: Number of processes waiting in the running queue.

B: Number of processes waiting for io.

2. Linux memory monitoring memory memoy:

Swpd: available swap memory (unit: KB ).

Free: idle memory (unit: KB ).

Buff: the number of buffer memories (unit: KB ).

Cache: The amount of memory used as the cache (unit: KB ).

3. Linux memory monitoring swap switching page

Si: the number of pages exchanged from disk swap to memory. Unit: KB/second.

So: the number of pages exchanged from the memory to the disk. Unit: KB/second.

4. Linux memory monitoring I/O BLOCK devices:

Bi: number of blocks sent to the block device. Unit: block/second.

Bo: number of blocks received from the block device. Unit: block/second.

5. Linux memory monitoring system:

In: The number of interrupts per second, including clock interruptions.

Cs: the number of environment (context) Conversions per second.

6. Linux memory monitoring cpu central processor:

Cs: the time used by the user process. Expressed as a percentage.

Sy: the time used by the system process. Expressed as a percentage.

Id: the idle time of the central processor. Expressed as a percentage.

If r is often greater than 4 and the id is often less than 40, it indicates that the central processor is under heavy load. If bi and bo are not equal to 0 for a long time, the physical memory capacity is too small.

Detect the system using the iostat command

Important parameters (reference: http://blog.csdn.net/wyzxg/article/details/3985221)

% Util: The amount of time in one second for I/O operations, or the amount of time in one second for I/O queues is not empty.

Svctm: Average service time for each device I/O operation

Await: average wait time for each device I/O operation

Avgqu-sz: Average I/O Queue Length

If % util is close to 100%, it indicates that there are too many I/o requests, the I/o system is fully loaded, and the disk may have a bottleneck. Generally, if % util is greater than 70%, the I/o pressure is relatively high, wait has a high read speed. you can also view the parameters B (number of processes waiting for resources) and wa in combination with vmstat (percentage of CPU time occupied by IO wait, which is higher than 30% when I/O pressure is high ).

The size of await generally depends on the service time (svctm), the length of the I/O queue, and the mode in which I/O requests are sent. If svctm is close to await, it means that I/O has almost no waiting time. If await is much larger than svctm, it means that the I/O queue is too long and the response time of the application is slow.

Check item 02:Check the background process of the database (Taking oracle as an example)

Role and significance:Oracle background processes are an important part of oracle instances.

It determines whether oracle runs normally or not. If the oracle background process is stopped, the oracle instance will crash.

Run the following command in the operating system:

$ Ps‐ef | grep ora _

Output result execution result analysis:

The returned results should include at least the following processes:

The process of writing data files from Oracle. The output is "ora_dbw0_SID"

The process for writing a log file in Oracle. The output is "ora_lgwr_SID"

The process that listens to the instance status in Oracle. The output is "ora_smon_SID"

Oracle listens to the process in which the client connects to the process. The output is "ora_pmon_SID"

Oracle archiving process, the output is: "ora_arc0_SID"

Oracle Checkpoint Process, the output is: "ora_ckpt_SID"

Oracle recovery process, the output is: "ora_reco_SID"

If the command output contains the preceding background process, the process is recorded as normal in the daily check Status column and the result is recorded as normal. Otherwise, an exception is recorded if a process is missing.

Check item 03:Check the database listening process (using oracle as an example)

Role and significance:The Oracle listening process listens to connection requests sent to the database from the client (such as the application server), and then establishes a connection for the request. If the listening process is disabled, the connection between the application server and the database server cannot be established.

Operation Method:

$ Lsnrctl status

Output result execution result analysis:

"Services Summary..." in the returned result... "item indicates which database instances the Oracle listening process is listening for. The output display should contain at least the instance (SID) required for running the current business system, such as ebank, B2B, AND md. In the daily check, enter the number of database instances being monitored.

Check item 04:Check the oracle instance status

Role and significance:Under normal circumstances, the oracle instance is in the open state. By checking this item, you can determine whether the oracle instance is normal.

Log on to the database as sysdaba and run the following command:

SQL> select instance_name, version, status, database_status from v $ instance;

Output result returned result:

"STATUS" indicates the current Oracle instance STATUS, which must be "OPEN ";

"DATABASE_STATUS" indicates the current Oracle Database status, which must be "ACTIVE ".

Check item 05:Check the oracle database status

Role and significance:You can check the database name, archive mode, and open mode to check whether the database is in normal state.

Log on to the database as sysdba and run the following command:

SQL> select name, log_mode, open_mode from v $ database;

Output result execution result analysis:

In the returned results, "NAME" indicates the database NAME; "LOG_MODE" indicates the current Oracle archiving method, and "ARCHIVELOG" indicates that the database runs in archive mode, "NOARCHIVELOG" indicates that the database runs in non-archive mode; "OPEN_MODE" indicates that the Oracle open mode, "read write" indicates that the database can be READ and written, and "read only" indicates that the database is READ-ONLY. Under normal circumstances, the open mode should be "read write ".

Check item 06:Database session Status

Role and significance:The current number of sessions is the number of connections established between the application server and the database server. This number is closely related to the database load. In a stable running database, the number of sessions should be stable. If the number of sessions increases or decreases significantly, the problem may occur. You need to find the cause further.

Run the following command in sqlplus:

SQL> select count (*) from v $ session;

Output result analysis:

The returned result is the number of sessions in the current database. Compare it with the normal value based on the check result.

Check item 07:Number of currently locked sessions

Role and significance:By viewing the lock Wait Status of the session, you can see whether there are lock resources in the database for use now

For example, if a long lock wait occurs, the normal operation of the business may be seriously affected. Therefore, if the database lock wait problem is found during the check, you need to further check the cause of the problem.

Run the following command in sqlplus:

SQL> select count (*) from v $ session where lockwait is not null;

Output result execution result analysis:

If a record is returned in the returned result, it indicates that there is a lock wait in the current session, and the quantity value is the returned result. If there is a lock wait phenomenon, it indicates that there is a problem, and then further find the specific reason for the lock wait.

Check item 08:Check the basic database status-check the oracle warning File

Role and significance:The Oracle warning file is composed of messages and a series of error messages. When a problem occurs during database operation, the corresponding error information will be recorded in the warning file, by checking the warning file, you can promptly discover database running problems.

Output result execution result analysis:

Open the warning file with vi, and check the last Error message from the end of the file to check whether there are records of errors such as "ORA-", "Error", and "Failed.

Check item 09:Check the basic database status-check the oracle kernel dump directory

Role and significance:Core Dump is a concept of UNIX operating systems. When the process is invalid

Errors that cannot be corrected (such as floating-point overflow or memory segment errors). A log file is generated after the process ends abnormally, this file contains all memory information, stack information, CPU register information, and so on when the process is terminated. If the files in this directory grow rapidly in a short period of time, it indicates that the Oracle process often has a core dump during this period, and frequent core dump, especially the core dump of the database background process.

This will cause the database to terminate abnormally.

View core_dump_dest

Check whether new files are generated in this directory.

Output results:Check whether a new kernel dump file is generated. If any record exception occurs.

Check item 10:Check basic database status -- check the database object status

Role and significance:The control file is an important part of the database. If all the control files are damaged, the database cannot be opened. By checking the Control File status, you can find the damaged control file in time, promptly take remedial measures.

Log on to the database as sysdba and run the following command:

SQL> select * from v $ controlfile;

The output result is returned as follows:

"STATUS" should be blank. If the STATUS is empty, the control file is in normal state. In the daily check box, enter the correct information.

Check item 11:Check basic database status -- check the oracle redo log file status

Role and significance:Redo log files are an important part of the oracle database. By checking the redo log files, you can promptly detect problems such as damage to the redo log files.

Log on to the database as sysdba and run the following command:

SQL> select * from v $ logfile;

The output result is returned as follows:

The status value is recorded in the daily check results of the daily check table. It is normal if it is not "INVALID" or "DELETED". Otherwise, an exception occurs.

Check item 12:Check the basic database status-check the oracle tablespace status

Role and significance:Under normal circumstances, the tablespace should be in the online State. If the tablespace is in the offline state, the user will not be able to access the tablespace. By checking the table space state, you can promptly find that those tablespaces are in an abnormal state.

Log on to the database as sysdba and run the following command:

SQL> select tablespace_name, status from dba_tablespaces;

The output result is returned as follows:"STATUS" should be "ONLINE"; otherwise, an exception occurs.

Check item 13:Check basic database status-check the status of oracle rollback segments

Role and significance:Check the status of the rollback segment to check the usage of the rollback segment.

Log on to the database as sysdba and run the following command:

SQL> select segment_name, status from dba_rollback_segs;

Output results:

In the automatic management mode, rollback segments are automatically converted and controlled by oracle. Therefore, some of the output results are online and offline, while others are not.

Check item 14:Check the basic database status-check the oracle Data File status

Role and significance:By checking the oracle Data File status, you can promptly discover data files with data corruption.

Log on to the database as sysdba and run the following command:

SQL> select name, status from v $ datafile;

The output result is returned as follows:

"STATUS" should be "ONLINE" (The SYSTEM tablespace is "SYSTEM") except for the SYSTEM tablespace. other statuses are abnormal.

Check item 15:Check basic database status -- check the status of all oracle Objects

Role and significance:Under normal circumstances, the status of all objects in the database should be VALID.

Objects in INVALID status, you need to further check to determine whether re-compilation is required.

Log on to the database as sysdba and run the following command:

SQL> select owner, object_name, object_type from dba_objects where

Status! = 'Valid' and owner! = 'Sys 'and owner! = 'System ';

Output results:

If an object in the invalid state exists, further analysis is required.

Check item 16:Check the basic database status-check the usage of oracle-related resources

Role and significance:By viewing the v $ resource_limit view, you can check whether the database initialization parameter settings are reasonable. If there are unreasonable or beyond the range of parameter settings, you need to consider appropriate adjustments.

Log on to the database as sysdba and run the following command:

SQL> select resource_name, max_utilization, initial_allocation, limit_value from v $ resource_limit;

If the output result is returned:

"LIMIT_VALU"-"MAX_UTILIZATION"> 5 is normal. If

"LIMIT_VALU"-"MAX_UTILIZATION" <= 5 indicates that the Oracle initialization parameters related to "RESOURCE_NAME" may need to be adjusted.

Check item 17:Check the growth of oracle tablespace

Role and significance:The data in the Oracle database is stored in the tablespace. If the tablespace space is insufficient, the database cannot continue to write data, and the database reports an error. Therefore, it is important to check the space usage in a timely manner and ensure that the remaining space in the tablespace is above 20%.

Log on to the database as sysdba and run the following script:

Select a. tablespace_name, sum (a. tots) Tot_Size,

Sum (a. sumb) Tot_Free,

Sum (a. sumb) * 100/sum (a. tots) Pct_Free,

Sum (a. largest) Max_Free, sum (a. chunks) Chunks_Free

From

(

Select tablespace_name, 0 tots, sum (bytes) sumb,

Max (bytes) largest, count (*) chunks

From dba_free_space

Group by tablespace_name

Union

Select tablespace_name, sum (bytes) tots, 0, 0 from

Dba_data_files

Group by tablespace_name)

Group by a. tablespace_name;

View output results:

PCT_FREE is a column. Generally, PCT_FREE> 20% must be kept for each tablespace,

If there is a tablespace PCT_FREE <20%, it needs to be processed immediately.

Check item 18:Check abnormal ORACLE extension objects

Role and significance:Each segment in the database is composed of extent, and each segment can accommodate a limited number of extent, the max_extents column in dba_segments is

Maximum number of extent instances. If the extent number in the segment reaches the limit, the segment cannot be expanded and the database reports an error. Therefore, by checking the extent quantity in each segment, You can promptly discover objects with scaling exceptions in the database, so as to take further parameters to avoid problems that cannot be expanded by segment.

Log on to the database as sysdba and run the following command:

SQL> select segment_name, segment_type, tablespace_name,

(Extents/max_extents) * 100 percent

From dba_segments

Where max_extents! = 0 and (extents/max_extents) * 100> = 95

Order by percent;

Output result returned result:

If a record is returned, it indicates that the extension of these objects is close to the defined maximum extension value. You need to modify its storage structure parameters.

Ii. Database Performance Analysis

Check item 19:Daily database performance analysis (AWR report analysis during business peaks)

Role and significance:AWR reports are an important performance analysis tool provided by oracle. By viewing AWR reports during business peaks, you can analyze the database load and database performance bottlenecks during business peaks. This allows you to promptly discover and solve performance problems during database operation.

Use the SYS user to log on to the database and generate an AWR report.

Analyze the generated AWR report

View database load

View the hit rate of various database performance indicators

View top 5 wait events of the database

If the top 5 wait event in the AWR report is abnormal, such as latch free, enqueue,

Buffer busy wait, etc.), the buffer hit rate decreases significantly, and the top SQL suddenly consumes a large number of SQL wait events, such as logic or physical read, need to be recorded and further analyzed.

Test and analyze the detected performance problems

For example, the top 5 waiting events in the statspack report are found one day. One of the obvious waiting events is buffer busy waits. by querying the v $ waitstat view, we can see that most of these waits are for data block. This means that multiple sessions try to read the same data block from the disk

In the buffer zone, competition occurs. One session executes the read operation, and the other session waits for buffer busy waits. This wait event usually finds the SQL that causes the wait and tries to optimize it.

Log on to the database as sysdaba and run the following SQL statement to capture the database:

Select SQL _text from v $ sqlarea where hash_value in (select

SQL _hash_value from v $ session where sid in (select sid from

V $ session_wait where event like '% buffer busy % '));

3. irregular operations-add data files for tablespaces

1. Table Reorganization

2. Index Reconstruction

3. Database Patching

4. Database Upgrade

5. performance comparison before and after database changes

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.