Mysql monitoring and mysql monitoring tools
Mysql monitoring is one of the database monitoring modules, including collection, display, and monitoring alarms. This article describes the main metrics and collection methods of Mysql monitoring.
The logic of Mysql monitoring is similar to that of Redis monitoring. For details, refer to Redis monitoring.
When the DBA adds Mysql monitoring at the front-end, the system will call the automatic scheduling platform interface to send information such as the encrypted account password and ip port of Mysql monitoring to the target, and send the collection Agent at the same time.
1. Collect indicators and commands
1. Mysql service running status
It is agreed that all Mysql services must be bound with ip1 (intranet ip address). Each machine has only one ip1 and can have multiple ports, that is, multiple Mysql servers. The collection program reads the ip port information file to determine whether the server exists.
SockParam = 'ps aux | grep-P "mysqld. * -- port =$ {port} "| grep-oP" -- socket. *\. sock "'# If it is null, the mysql socket configuration on the server port cannot be obtained, check whether the mysql configuration is correct. MYSQL = "/usr/local/mysql/bin/mysql-hlocalhost -- port =$ {port }$ {sockParam}-u $ {user}-p $ {password} "MYSQL_ADMIN ="/usr/local/mysql/bin/mysqladmin-hlocalhost -- port =$ {port }$ {sockParam}-u $ {user}-p $ {password} "curStatus = '$ {MYSQL}-e" show global status "' # If it is null, the mysql status of the server cannot be obtained, check whether mysql runs properly. if [-z "$ {curStatus}"] then portExists = 0 else echo "$ {curStatus}" >$ {curFile} portExists = 1
2. Number of connections
${MYSQL_ADMIN} processlist -v | wc -l
3. Number of threads
grep 'Threads_connected' ${curFile} | awk '{print $2}'
4. Slow Query Count
grep 'Slow_queries' ${curFile} | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries, which is equal to the number of slow queries in the last minute. The last data is stored in last. cache.
5. Number of opened tables
grep 'Open_tables' ${curFile} | awk -F ' ' '{print $2}'
6. Number of select executions per second
grep 'Com_select' ${curFile} | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries divided by the time difference, which is equal to the number of executions in the last minute. The last data is stored in last. cache.
7. Number of delete operations per second
grep 'Com_delete' ${curFile} | grep -v 'multi' | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries divided by the time difference, which is equal to the number of executions in the last minute. The last data is stored in last. cache.
8. insert operations per second
grep 'Com_insert' ${curFile} | grep -v 'select' | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries divided by the time difference, which is equal to the number of executions in the last minute. The last data is stored in last. cache.
9. Number of update executions per second
grep 'Com_update' ${curFile} | grep -v 'multi' | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries divided by the time difference, which is equal to the number of executions in the last minute. The last data is stored in last. cache.
10. Number of replace executions per second
grep 'Com_replace' ${curFile} | grep -v 'select' | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries divided by the time difference, which is equal to the number of executions in the last minute. The last data is stored in last. cache.
11. Innodb_rows_deleted executed per second
grep 'Innodb_rows_deleted' ${curFile} | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries divided by the time difference, which is equal to the number of executions in the last minute. The last data is stored in last. cache.
12. Innodb_rows_inserted executed per second
grep 'Innodb_rows_inserted' ${curFile} | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries divided by the time difference, which is equal to the number of executions in the last minute. The last data is stored in last. cache.
13. Innodb_rows_read executed per second
grep 'Innodb_rows_read' ${curFile} | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries divided by the time difference, which is equal to the number of executions in the last minute. The last data is stored in last. cache.
14. Innodb_rows_updated executed per second
grep 'Innodb_rows_updated' ${curFile} | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries divided by the time difference, which is equal to the number of executions in the last minute. The last data is stored in last. cache.
15. innodb rows total executed per second
expr ${innodbRowsDeletedPS} + ${innodbRowsInsertedPS} + ${innodbRowsReadPS} + ${innodbRowsUpdatedPS}
Equal to the total number of Innodb_rows _ * executions
16. Number of processing commands per second qps
expr ${mysqlSelectNumPS} + ${mysqlInsertNumPS} + ${mysqlUpdateNumPS} + ${mysqlDeleteNumPS} + ${mysqlReplaceNumPS}
Equal to the total number of the first five mysql commands Com _ *
17. number of bytes received per second (KByte/s)
grep 'Bytes_received' ${curFile} | awk -F ' ' '{print $2}'
The difference between the two slow queries is calculated divided by the time difference, which is equal to the number of executions in the last minute,Unit KByte/s is obtained by dividing by 1024.. The last data is stored in last. cache.
18. number of bytes sent per second
grep 'Bytes_sent' ${curFile} | awk -F ' ' '{print $2}'
The difference between the two slow queries is calculated divided by the time difference, which is equal to the number of executions in the last minute,Unit KByte/s is obtained by dividing by 1024.. The last data is stored in last. cache.
19. Number of locks that can be obtained immediately
grep 'Table_locks_immediate' ${curFile} | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries, which is equal to the number of locks that can be acquired immediately in the last minute. The last data is stored in last. cache.
20. Number of times the lock cannot be obtained immediately
grep 'Table_locks_waited' ${curFile} | awk -F ' ' '{print $2}'
The difference between the two slow queries is calculated, which is equal to the number of locks that cannot be obtained immediately in the last minute. The last data is stored in last. cache.
21. Waiting for a row to be locked
grep 'Innodb_row_lock_waits' ${curFile} | awk -F ' ' '{print $2}'
The difference between the two slow queries is calculated, which is equal to the waiting time for a row to be locked in the last minute. The last data is stored in last. cache.
22. Current dirty pages
grep 'Innodb_buffer_pool_pages_dirty' ${curFile} | awk -F ' ' '{print $2}'
23. Number of buffer pool pages to be cleared
grep 'Innodb_buffer_pool_pages_flushed' ${curFile} | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries, which is equal to the number of buffer pool pages to be cleared in the last minute. The last data is stored in last. cache.
24. Innodb write log bytes (KByte)
grep 'Innodb_os_log_written' ${curFile} | awk -F ' ' '{print $2}'
Calculate the difference between the two slow queries, which is equal to the number of logs written in the last minute. Divide the number by 1024 to get KByte. The last data is stored in last. cache.
25. memory usage MByte
pid=`ps aux | grep 'mysqld' | grep -Ev 'safe|grep' | awk '{print $2}' `mem=`cat /proc/${pid}/status | grep 'VmRSS' | awk '{print $2}'`mysqlMem=`echo "scale=2;${mem} / 1024" | bc`
Divide by 1024 to get the MByte
26. handler socket processing per second
curHsTableLock=`grep 'Hs_table_lock' ${curFile} | awk '{print $2}'`preHsTableLock=`grep 'Hs_table_lock' ${preFile} | awk '{print $2}'`if [ -n "${curHsTableLock}" ]then hsQPS=`echo "scale=0;(${curHsTableLock} - ${preHsTableLock}) / ${intervalTime}" | bc`else hsQPS=0fi
27. Master-slave synchronization and status
# Master/Slave information # Whether the Slave server slave_running = 'grep' Slave _ running' $ {curFile} | awk '{print $2} ''if [" $ {slave_running}" = "ONA"] then slaveRunning = 1 slaveStatus = '$ {MYSQL}-e' show slave status \ G' 'echo "$ {slaveStatus}" >$ {slaveFile} slaveIoRunning = 'grep' Slave _ IO_Running '$ {slaveFile} | awk-F ': ''{print $2}'' slaveSqlRunning = 'grep' Slave _ SQL _Running '$ {slaveFile} | awk-F ': ''{print $2}'' if ["$ {slaveIoRunning} A" = "NoA"-o "$ {slaveSqlRunning} A" = "NoA"] then slaveRunning = 3 fi secondsBehindMaster = 'grep' Seconds _ Behind_Master '$ {slaveFile} | awk-F ': ''{print $2}'' if ["$ {secondsBehindMaster} A" = "NULLA"] then secondsBehindMaster = 8888 # indicates that the master node never synchronizes the fi # obtains the master database from the database ip master = 'grep' Master _ host' $ {slaveFile} | awk-F ': ''{print $2}'' masterPort = 'grep' Master _ port' $ {slaveFile} | awk-F ': ''{print $2}'' else master = "" masterPort = "" slaveRunning = 0 secondsBehindMaster = 10000 # No need to detect fi
Note:Seconds_Behind_Master, which is used as an indicator to judge the master-slave latency, how does it obtain this value? Why is it questioned by many people? (This section references from the http://blog.chinaunix.net/uid-27038861-id-3686311.html)
Seconds_Behind_Master compares the timestamp of the event executed by SQL _thread with the timestamp (ts) of the event copied by io_thread, and obtains such a difference. We all know that the relay-log and the bin-log of the master database have exactly the same content. When an SQL statement is recorded, the current ts will be recorded. Therefore, the reference value is from binlog, in fact, the master and slave do not need to be synchronized with NTP, that is, the master and slave clock do not need to be consistent. You will also find that, in fact, io_thread actually occurs between io_thread and SQL _thread, and io_thread is actually associated with the master database, so the problem arises, when the I/O load of the master database is large or the network is congested, io_thread cannot copy the binlog in time (without interruption or replication), while SQL _thread can keep up with the io_thread script, in this case, the value of Seconds_Behind_Master is 0, that is, we think there is no latency, but actually it is not, you know. This is why everyone wants to criticize the use of this parameter to monitor the cause of database delays, but this value is not always inaccurate, if the io_thread and master networks are good, this value is also very valuable.
Previously, the Seconds_Behind_Master parameter has a negative value. We already know that this value is the difference between the latest and new ts executed by io_thread and the ts executed by SQL _thread. The former is always greater than the latter, the only possibility is that the ts of an event has an error, which is smaller than the previous one. In this case, a negative value may occur.
28. Detect the heartbeat status of the collection Agent
Original article, reprinted please note the original address http://www.cnblogs.com/lxmhhy/p/6034609.html
Add qq: 1130010617 for knowledge exchange and discussion. Thank you for your cooperation.