mysql| Data | database Is your webapp system using a MySQL database system? Does your customer always complain about the slow feedback of the results of the page? Does the load on your MySQL system always stay in a very high state? This article will provide you with a way to share the load on the MySQL system and a mysql-ha-proxy development project derived from it. Using the methods provided in this article, you will gain the efficient operation of the MySQL system with minimal source code changes.
the first section of the database cluster technology status quo
At present, the application of database cluster system is quite successful, and the application scope is more extensive: Oracle Company's ORACLE9 and IBM company DB2. Oracle9 adopts Shared-storage technology, DB2 chooses Shared-nothing technology, both of which have their own length.
The theoretical basis of the latest database cluster system is distributed computing, which distributes data to each node, and all compute nodes process the data in parallel, summarizing the results. Such a way is undoubtedly the most perfect. However, the full functionality is still not possible.
For Shared-storage and Shared-nothing Technologies, refer to Oracle and the relevant information on the IBM Web site.
section II current database application status
At present, database application situation is roughly divided into two categories, the first category is the amount of data under 100G, database access frequently, request intensive. Mainly is the application of the Web app type, for example: website, forum and so on. The application of these web app types to access the database is characterized by frequent access, the database to accept more than thousands of times per second query, the need to frequently append data, at the same time the response speed of the data requirements are relatively high. The other is used for scientific computing, storage of historical data applications, data volume is often up to hundreds of G. These applications access to the database is characterized by: many for query operations, data are batches, timing, concentrated into the database, the database records very much, accumulated a large number of data, the response speed of the database is not too high demand.
problems exposed in section III
First class applications, because the access is more frequent, and in order to support more access, the WEB server generally uses a load-balanced cluster, but for the database, because the cluster operation can not be implemented, the request per second is increasing, as the server load increases, the response to a single request slower and faster, If the library file is relatively large, there will be a write operation when the lock table time is too long to affect access efficiency things.
The second type of application, mainly the data file is too large, each processing data requires a lot of time, if the wrong statement will need to spend a few hours to redo the query.
section Fourth how to resolve
First of all, we should optimize from hardware, software, program, index, SQL statement, if still can't solve the problem, we should consider the database system cluster (parallel processing).
For the first class of applications, the database server in the normal operation, the load is not high, the application of the database system is still satisfied with the situation. However, when the database system is overloaded, the time to complete the request is lengthened and the system requirements are not reached. Since the load is due to too many requests, we take the way to share the request, so that part of the request to access another server, so that a single server to reduce the load, so as to solve the problem.
For the second type of application, the distributed computing system is needed to solve, the general system is powerless.
section Fifth the solution to the first class of application problem of "Linux+apache+php+mysql"
A solution to a real case:
I have encountered this problem in my work, our Web server is a cluster of three machines of linux+apache+php, MySQL runs on the platform of SUN450,2G memory. Since web traffic is almost full at peak speeds, the loadavg (the number of processes in a running state within a minute) is 10-20, reflecting the fact that a large number of requests are suspended while accessing the database, resulting in a request not being completed and the next request coming in , the last vicious circle. Loadavg will soar to more than 800 in an instant. The database is worse, LOADAVG reached more than 300, the database thread is very much, the CPU busy switching thread state, this time unless restart MySQL, otherwise how will not be good. After the optimization of the SQL statement is not a good solution to the problem, we added a database server, through the MySQL data synchronization mechanism, so that the data on both databases remain synchronized, modify a part of the only read operation of the PHP program, so that these programs connect to another database, As a part of the separation of the load, the problem has been preliminarily resolved. But then the business bigger, we added more than one server, modified a lot of programs, separate them from the database read operations, access to different servers.
the sixth section Mysql-ha-proxy proposal
By modifying the program to achieve the separation of the system load, is a very painful thing, the project is vast, and can not be mistaken, because in addition to the primary server can write, modify data, and other servers can only be synchronized through the data to update their own data, so if you write to those databases, the result will be disastrous.
If we can have a program to sort the SQL statements, according to his type (read/write), transmit them to different servers, and then return the results. Using a proxy similar to HTTP, so that we do not need to modify the source program to share the load, if it can be based on the server's load status, or the State of the table (available/locked), to determine which server should be assigned to the request, That's better than what we can do to change the source program.
section seventh how to communicate between MySQL client and server
Looking around, also did not find an article on the MySQL communication protocol, it seems only the source of the analysis of MySQL program. So to find the MySQL 3.23.49 code, open the Sniffer tool. MySQL's communication protocol may have been changed several times, in the 3.23.49 version, the communication protocol version is unexpectedly 10.
A simple analysis of the communication protocol, now structured as follows, some places are not perfect, because I really do not have much time to study the MySQL code, I only understand these.
Offset Regional type Length (byte) Description 0 Head Data Length 3 1 2 3 FLAG 1 = 0 General Information
= More than 1 pieces of information
=2 Certification return
>2 paragraph End word 4 DATA CMD Code 1 5 Message DataLength-1
when flag=0, 2, the definition of CMD Code and message
Cmdcode type structure of the message of Status Code Offset type Length (Byte) 0 affect rows 2 0A Server version number Offset type Length (Byte) The server will immediately return information about a data section if it is valid only when the server is connected . 0 versionstring 8 End of ' I ' 8 Session ID 4 32bits of UnKnown One FF return information When an error occurs Offset type Length (Byte) 0 Errcode 2 2 errmsg End FE End of multi-segment information transmission Empty
client's format for submitting data to server:
Offset Regional type Length (Byte) 0 Head Data Length 3 1 2 3 Compressed 1 4 DATA Command ID 1 5 Command Data Data Length-1
Description of command ID and command data:
ID type Data Format 0 Com_sleep 1 Com_quit NULL 2 com_init_db Database name 3 Com_query Stand query string 4 com_field_list table name [128] wildcard[128] 5 com_create_db Database name 6 com_drop_db Database name 7 Com_refresh Options (BITS) 8 Com_shutdown NULL 9 Com_statistics NULL Ten Com_process_info NULL One Com_connect of Com_process_kill Sid[4] of Com_debug NULL - com_ping NULL - Com_time - Com_delayed_insert - Com_change_user [User][passwd][db] A Com_binlog_dump - Com_table_dump - Com_connect_out
Section Eighth how the client authenticates through the server
Protocol analysis completed, I tried to get it to work, but the certification part of the problem, Mysql server when the client connected it, will first return to the client a packet containing the protocol version number, version information, SessionID, a 8-byte key , is the reason for this key. The client uses this key to encrypt the password and then sends the user name, password, and the database that needs to be opened to the server, which completes the authentication. I do not know how the client is using this key to encrypt, so I intend to skip the password, I will be the client's packet reorganization, remove password information, I succeeded, but the MySQL users in the cluster are not password, security more or less some problems, However, these servers are placed in the ha behind, there is no external IP address, should not be a problem, but more or less is a flaw.
But I always need to know the user's password is correct? What do we do? Use a dedicated MySQL to complete the password authentication. Install a minimized resource MySQL server is used to do Mysqlauth (dedicated authentication server), when the client connected, will Mysqlauth first packet returned to the client, which of course contains key, Then the client will use this key, encrypted password, the authentication information will be sent back, this time, the Mysqlha system will forward this information to Mysqlauth, and keep a copy, if the certification passed, the retention of that a reorganization, remove the password information, Then use the reorganized authentication information to connect the servers in the cluster.
[1] [2] Next page