Original: 0. SQL Server Monitoring Checklist
The monitoring of database servers can be broadly divided into two categories:
(1) Status monitoring: Is the database server running in a healthy way?
(2) Performance monitoring: Healthy operation at the same time, there is no performance problems? Could it be quicker?
a . Server
1. Status Monitoring
(1) is the server accessible?
(2) is the database service enabled?
(3) Errors or alarms in the operating system event log
(4) Disk free space
2. Performance Monitoring
(1) IO pressure
(2) Memory usage
(3) CPU usage
(4) Network bandwidth consumption
This 1,2,3,4 is in the order of the bottleneck, due to the disk read and write speed limit, usually Io is the most prone to bottlenecks, we do many optimizations are also for the IO, such as: Index optimization, read and write separation, and so on.
two . Database
1. Status Monitoring
(1) database can be opened (database status)
(2) Errors or alarms in the SQL Server/sql Server Agent error log
(3) database/Filegroup free space
(4) SQL Agent job Run status
(5) Database backup has no success
(6) Results of database restore tests
(7) Results of database consistency check (DBCC CHECKDB)
The following status monitoring, usually need to compare with the system average/baseline value is meaningful, otherwise there is no alarm standard.
(8) Number of connections, number of requests, number of transactions, number of threads
(9) Size of database/File/table
(10) Table usage, number of rows
2. Performance Monitoring
(1) There is no long-running query (generally referred to as not being blocked by any request, inefficient query)
(2) There are no blocked queries (may run very quickly, but together with other requests, it takes a long time to have a lock waiting)
(3) There is no deadlock (developer/account "deadlock" is usually blocking/waiting, database deadlock usually rarely makes the user feel waiting, generally the request is interrupted, because the kill dropped)
(4) There is no waiting (generally refers to a variety of resources waiting, waiting and blocking the intersection is the lock wait)
(5) There are no missing/unused/inefficient indexes, and index fragmentation
(6) There are no outdated statistical information
(7) There is no contention for database files (for example: Log files, tempdb contention)
(8) There is no CPU consuming large, IO read and write more queries (usually IO consumes large, that is, memory consumption of large queries)
three . other
(1). If there is a deployment of highly available policies, there will be mirroring, replication, log shipping, cluster status monitoring;
(2). Some business data have strict consistency requirements, the calibration of business data, it is best to do in the monitoring alarm inside;
(3). The availability of objects such as database/instance options, parameter settings, linked servers, and so on, is usually checked in the health check of each year/quarter, and if not, it can also be placed in a monitored alarm.
Four . How do I deploy monitoring?
1. do not select Scripts / commands for dependencies
To monitor whether the service is started as an example, the script is as follows:
(1) SQL Extended Stored procedure
--parameter 1:querystate Check service status/Start start service/Stop stop service--parameter 2: Service nameexecMaster.dbo.xp_servicecontrol'querystate','MSSQLServer'execMaster.dbo.xp_servicecontrol'querystate','SQLServerAgent'execMaster.dbo.xp_servicecontrol'querystate','SQLBrowser'execMaster.dbo.xp_servicecontrol'querystate','NetLogon'EXECXp_servicecontrol N'Stop'N'SQLServerAgent'EXECXp_servicecontrol N'Start'N'SQLServerAgent'
(2) SQL invoke operating system commands
if object_id('tempdb: #tmp_started_services') is not NULL Drop Table#tmp_started_servicesCreate Table#tmp_started_services (started_servicesvarchar(255))Insert into#tmp_started_services (started_services)execMaster.. xp_cmdshell'net start' Select * from#tmp_started_serviceswhere LTRIM(RTRIM(started_services)) like 'sql%'
If SQL Server does not start, these scripts will not run at all, and how to monitor it?
Perhaps, there will be such a train of thought, when the server is normal, send an email notification, if not received the message that the server is not normal, but if there are many servers, how to know who did not send e-mail it?
2. deployed on a dedicated / Multi-monitor machine
Server status monitoring, whether using a third-party tool or a custom script, is recommended to be deployed on a dedicated monitoring machine to monitor the target machine remotely.
Because: if the server down or failure, may be the local program/script can not run, and how to monitor it?
At last
Based on the above monitoring list, you also need to automate monitoring and alert you when problems are found.
0. SQL Server Monitoring Checklist