Communication between php and mysql: bitsCN.com
Bugs and bugs, it seems that the last learning process is caused by bugs:
In our production environment of a WebGame, when we accidentally captured the strace packet, we found a large amount of data for communication between php and mysql. In this case, it is normal when the game server is just started. However, if a large number of select SQL queries occur after running for a period of time, there is definitely a problem, the database to be operated is not a configuration database, which means that the program of our programmers experienced illegal operations. The specific results are as follows:
As shown in, php continuously receives the response packet data that reads 3 in the process descriptor, and 3 is the TCP communication link established between php and mysql, this can also be confirmed by the SELECT statement of Line 1. (The raw data is lost. I imitated one. So it is the SQL statement of the configuration Library)
What program is this and what logic is it designed to implement? Why is it necessary to retrieve so much data?
Following the SELECT SQL statement here, I locate the corresponding program segment:
/*** Business logic code */public function SItem ($ roleId, $ baseId ){//... // ############ anyone who writes the following code will be killed. ################## $ this-> dbrRole-> select ('*'); $ this-> dbrRole-> from ('role _ items '); $ this-> dbrRole-> where ('role _ id', $ roleId ); $ this-> dbrRole-> where ('baseid', $ baseId); $ result = $ this-> dbrRole-> get ()-> row (); // It looks like, this seems to be normal. we all think that the framework will give us only one. //...}
From the code point of view, it seems that the programmer wants to obtain the desired data from the corresponding role_id to the role_items table. Therefore, he calls the row method to obtain one. It seems that this is normal. we all think that the framework will give us only one. But in fact, how does the framework handle it?
Let's take a look at the implementation process of the corresponding row method in the framework. By the way, we are an old version of the CodeIgniter framework.
/*** Code of the row-related method in the framework DB drive ***/public function row ($ n = 0, $ type = 'array') {if (! Is_numeric ($ n) {if (! Is_array ($ this-> _ rowData) {$ this-> _ rowData = $ this-> rowArray (0 );} if (isset ($ this-> _ rowData [$ n]) {return $ this-> _ rowData [$ n] ;}$ n = 0 ;} return ($ type = 'object ')? $ This-> rowObject ($ n): $ this-> rowArray ($ n);} // continue to follow up the rowArray method public function rowArray ($ n = 0) {$ result = $ this-> resultArray (); if (count ($ result) = 0) {return $ result;} if ($ n! = $ This-> _ current & isset ($ result [$ n]) {$ this-> _ current = $ n ;} return $ result [$ this-> _ current];} // continue to follow up the resultArray method ### this method is important ### public function resultArray () {if (count ($ this-> resultArray)> 0) {return $ this-> resultArray ;} if (false = $ this-> resulter | 0 = $ this-> recordCount () {return array ();} $ this-> _ dataSeek (0); while ($ row = $ this-> _ fetchAssoc () {$ this-> resultArray [] = $ row; // ########### add _ fetchAssoc () to this array each time () result memory size ##########################} return $ this-> resultArray ;} // continue to follow up the _ fetchAssoc method/*** code of the _ fetchAssoc method corresponding to the driver */protected function _ fetchAssoc () {return mysql_fetch_assoc ($ this-> resulter );}
We can see that the resultArray method of the CodeIgniter framework uses mysql (our php calls the mysql api using the mysql function, which is a bit round, which will be explained later) mysql_fetch_assoc function of to traverse and convert the data in the buffer zone. Copy all the data in the buffer to the $ this-> resultArray attribute, and then determine whether the result of the key required by the row method exists, and then return the data.
That is to say, the framework layer does not just get one entry from the mysql server (the subconscious mysql server) to our caller, but retrieves all the results and returns another one. (Don't spray it first. I will explain it later.) of course, CI is not a mistake. But I think there are better ways to improve it.
This problem was discovered by our group's dietoad (marriage) and a solution was provided. Some people think that this is a programmer's fault. the programmer's SELECT statement does not add limit to limit the number of items. I absolutely agree with this, and I think everyone who writes this code will die.
- Business layer: add limit restrictions to SQL statements for such business requirements
- Framework layer: The framework automatically controls this requirement. in this case, one record is directly returned.
For Solution 1, I wrote a regular expression. after The select () method is called, no code using the limit () method is used in the middle before the row () method is called, as a result, the amount of discovery is not small. Later, we decided to implement the two solutions at the same time to prevent the second case from being missed.
Dietoad provides the following improvements:
/*** // When _ rowData does not exist, it starts from the number of _ rowData records and takes less than $ n records to avoid getting all data from the buffer zone in the resultArray method above, duplicate data, memory usage */public function row ($ n = 0, $ type = 'array ') {if (isset ($ this-> _ rowData [$ n]) {return $ this-> _ rowData [$ n];} if (! Is_numeric ($ n) {return $ this-> rowObject ($ n);} $ ln = count ($ this-> _ rowData ); // continue the last position while ($ ln ++ <= $ n & $ r = $ this-> _ fetchAssoc ()) {$ this-> _ rowData [] = $ r ;} // how many records do I need to read? // prevent warning return isset ($ this-> _ rowData [$ n]) from being null in the record set? $ This-> _ rowData [$ n]: array ();}
At the end of April this year, I wrote another question about the design defect of the CodeIgniter framework, which had a great impact on our game project. later I submitted it to github issues and didn't get a reply, think about it, although the official version 2.1.3 also has this small problem. However, I don't think we will submit this. maybe our practice is also in line with their original design intention. However, we have made improvements in our project.
After this improvement, when we use the memory_get_usage () function of php to observe the results of the two row () methods, it is found that the memory usage has greatly improved (the improvement depends on the data size returned by the SELECT statement ).
It seems that the problem should be solved.
But what do I always think is missing?When I captured the packet again in strace, I found that there was still a lot of data communication, just like the one at the beginning of the article. But why?
I wrote a test code for memory usage as follows:
$ Db = mysql_connect ('2017. 168. xx. xx', 'xxxx', 'xxxx'); $ SQL = 'select * from items '; mysql_select_db ('jv01', $ db); echo 'Select _ DB :', convert (memory_get_usage (), "/n"; // 619.26 kb $ r = mysql_query ($ SQL, $ db); echo 'query _ SQL :', convert (memory_get_usage (), "/n"; // 619.98 kb ### what? After the query, the memory size increases by less than 1 kB? My table contains dozens of MB of data. // sleep (50); // hold the process and do not destroy it, keep the current process memory allocated 1 $ arr = array (); while ($ rs = mysql_fetch_assoc ($ r) {$ arr [] = $ rs ;} echo 'fetch _ RS: ', convert (memory_get_usage (), "/n"; // 27.11 mb ### what? Didn't I just add 1 k? How does the result set of traversal suddenly increase by dozens of MB? What is Nima like? Unset ($ arr); echo 'unset: ', convert (memory_get_usage (), "/n "; // 620.12 kb #### $ arr z accounts for dozens of Mmysql_free_result ($ r); echo 'free _ R: ', convert (memory_get_usage ()), "/n"; // 620 kb ### is the result set only 0.12 kb? Isn't that nonsense? None... Is php statistics not available for mofei buffer data? Does it mean to call the zend memory application function to apply for memory? // Sleep (50); // hold the process. do not destroy it. check whether the memory of the current process is allocated. 2 function convert ($ size) {$ unit = array ('B ', 'KB', 'mb', 'GB', 'TB', 'PB'); return @ round ($ size/pow (1024, ($ I = floor (log ($ size, 1024), 2 ). ''. $ unit [$ I];}/* // The Returned result is as follows: SELECT_DB: 619.26 kbQUERY_ SQL: 619.98 kbFETCH_RS: 27.11 mbUNSET: 620.12 kbFREE_R: 620 kb */
When I see the results, I can't help but understand what? What is your mom like? After the query, the memory size increases by less than 1 kB? My table is dozens of MB of data? After traversing the result set, how does one increase by dozens of MB? What is Nima like? What is the existence of a large amount of data returned by strace? Not applied by the php process?
Later, I executed the above program again, and regularly used system tools such as free,/proc/PID/maps to check the memory usage of the system, and confirmed that the memory usage of the current process exists. It is possible that the memory_get_usage () function does not obtain the memory usage after mysql_query. Due to suspicion, the source code of the memory_get_usage () function is followed up in the end. This function is directly handed over to the zend_memory_usage function for processing.
// This is the code of the php memory_get_usage () function. for details, see Zend_alloc.c line: 2640ZEND_API size_t encode (int real_usage TSRMLS_DC) {if (real_usage) {return AG) -> real_size;} else {size_t usage = AG (mm_heap)-> size; # if ZEND_MM_CACHEusage-= AG (mm_heap)-> cached; # endifreturn usage ;}} // This is the Zend memory allocation function code // Zend_alloc.c line: 2418ZEND_API void * _ emalloc (size_t size ZEND_FILE_LINE_DC ZEND_FILE_LINE_ORIG_DC) {TSRMLS _ FETCH (); if (UNEXPECTED (! AG (mm_heap)-> use_zend_alloc) {return AG (mm_heap)-> _ malloc (size);} return _ zend_mm_alloc_int (AG (mm_heap), size limit );}
Php memory management (Chinese address: php-zend Memory Management Chinese version) is too complicated for the last part, the value of real_size/size of the mm_heap struct is returned. (The two articles are written by laruence. The Chinese address is that the blog of laruence has been unable to be opened recently, which is so awesome)
What is the result set of mysql_query? How do I apply for memory? isn't zend's _ emalloc memory allocation function called? First, we need to clarify the problem of the mysql client class library, that is, which class library should we use? Libmysql or mysqlnd. by checking the compilation parameters, we found that (my virtual machine) is libmysql, and the compilation parameters are as follows:
. /Configure ''-- prefix =/services/php_5.3.19'' -- with-config-file-path =/services/php_5.3.19/etc ''-- with-pdo-mysql =/usr/ bin/mysql_config ''-- with-mysql =/usr/bin/mysql_config'' -- with-mysqli =/usr/bin/mysql_config ''-- enable-bcmath'' -- enable- fpm // The production server is as follows :. /configure ''-- prefix =/services/php'' -- with-config-file-path =/services/php/etc'' -- with-pdo-mysql = mysqlnd'' -- with-mysql = mysqlnd'' -- with-mysqli = mysqlnd'' -- enable-bcmath'' -- enable-fpm
A little messy:
Many terms such as mysql, mysqli, pdo-mysql, libmysql, and mysqlnd are a bit messy. it doesn't matter. a picture makes you clear:
Relationship between mysql, mysqli, pdo-mysql, libmysql, and mysqlnd
Like libmysql, mysqlnd is a driver class library that communicates directly with mysql server. Mysql, mysqli, and pdo-mysql used by php programmers are API interfaces for programmers ..
Continue:
The libmysql class library is officially provided by MYSQL. each PHP compilation is to specify a parameter to determine which connection driver is used by mysql/mysqli/pdo-mysql. In addition, you must first install the mysql client (libmysql class library) to ensure that there is libmysqlclient. so,
Finally, with a try-on mentality, I opened the libmysql source code heavily and finally found the code similar to libmysqlclient applying for memory near line: 120 of Safemalloc. c.
// Libmysql client library Safemalloc. c line: 120/* Allocate some memory. */void * _ mymalloc (size_t size, const char * filename, uint lineno, myf MyFlags ){... /* Test for memory limit overrun. if compiled with DBUG, test for error injection. described in my_sys.h. */if (size + sf_malloc_cur_memory> sf_malloc_mem_limit) IF_DBUG (| limit) {IF_DBUG (if (limit) errno = ENOMEM; limit = 0); irem = 0 ;} else {/* Allocate the physical memory */irem = (struct st_irem *) malloc (ALIGN_SIZE (sizeof (struct st_irem )) + sf_malloc_prehunc + size +/* size requested */4 +/* overrun mark */sf_malloc_endhunc); // the system memory allocation function malloc }...} // The following is the code of the mysqlnd driver. in order to save your efforts to create another code highlighted block, we put it together. // Mysqlnd client library Mysqlnd_alloc.c line: 77/* {_ mysqlnd_emalloc */void * _ mysqlnd_emalloc (size_t size MYSQLND_MEM_D ){... ret = _ emalloc (REAL_SIZE (size) ZEND_FILE_LINE_CC ZEND_FILE_LINE_ORIG_RELAY_CC); // call zend's memory allocation function _ emalloc... if (ret & collect_memory_statistics) {* (size_t *) ret = size; then (STAT_MEM_EMALLOC_COUNT, 1, clerk, size);} TRACE_ALLOC_RETURN (FAKE_PTR (ret ));} /*}}}*/
That is to say, without calling the zend internal distribution function _ emalloc, libmysql cannot record the memory usage to the mm_heap struct, that is, the reason why the memory_get_usage () function of PHP cannot be counted. Well, although I am not very familiar with the source code, it seems that the problem has occurred.
It seems that another question comes to mind at the end. if the memory occupied by the result set saved by libmysql is used, the memory_limit in the php configuration file cannot limit its memory usage? That is to say, if we ideally allocate several php-fpm processes based on the remaining memory of the system to start running, if this happens, there will be insufficient memory, memory occupied by libmysql is not counted... The result is obvious and cannot be limited.
Relationship between libmysql and mysqlnd and memory_limit
Can mysqlnd be used? Does the memory allocation of mysqlnd use the zend _ emalloc function? Yes, that's right. mysqlnd is our savior. In Mysqlnd_alloc.c line: 77, the code is clearly displayed. When compiling php, SAS must use mysqlnd as the class library driver for connecting php to mysql server.
The benefit of Mysqlnd is more than that.
Memory or memory:
I finally found these materials on the website of the "Ten evil capitalism" country. mysqlnd will save nearly 40% of memory usage than libmysql.
Mysqlnd saves 40% of memory usage than libmysql
, And,
The memory_limit parameter is manageable.Oh...
Speed:
A foreign friend gave a test result. the comparison API is mysql/mysqli, and the comparison driver is libmysql/mysqlnd.
- Use the mysqlnd-driven ext/mysqli interface for the fastest speed
- The libmysql-driven ext/mysqli interface is 6% slower.
- The libmysql-driven ext/mysql interface is 3% slower.
The execution time of mysqli under two drivers is also given:
Mysqli_select_varchar_buffered
And...
Mysqlnd also supports various debug debugging and various strace tracking... Also supports .... You can see the advantages of downloading mysqlnd by yourself over libmysql. This ppt was last searched for a long time.
Recommended:
1. we recommend another article about Persistent Connections to mysqlnd: PHP 5.3: Persistent Connections with ext/mysqli.
2. is the cache stored by your application stored by the programmer based on the DB data results, query conditions, and hash values in memcache? Do you want to try automatic implementation? For the plug-in of mysqlnd, try PHP: Client side caching for all MySQL extensions and support memcached, apc, and sqsions.
Return to start:
Some people say that when php calls mysql_query, mysql server returns the result of this query to the buffer zone of the php server. When the program calls functions such as mysql_fetch_assoc/mysql_fetch_row/mysql_fetch_array/mysql_fetch_object, it calls the php_mysql_fetch_hash function to read data in the buffer zone. What if I use the mysql_unbuffered_query () function? If the result set is not directly returned after the query, will it be pulled back when the mysql_fetch_x function is called? This... Do you want the mysql server buffer to store the data? Do you think the client is your own? Other clients also need to connect, especially php. if the mysql_unbuffered_query () function is used, they will put the result set in the buffer zone of mysql server, isn't the memory usage of mysql server increasing... Do you want DBA to kill you?
As mentioned in the manual, mysql_num_rows () and mysql_data_seek () cannot be used on the result set returned by mysql_unbuffered_query (). I almost never used this function. Is this a non-mainstream function?
Some people say that solution 1 saves the memory usage from the result set and assigns the traversal value to the new array. it does not reduce the network data transmission. That's right. you're right. That is to say, solution 2 can only slightly reduce the negative effect of this problem. if it is completely solved, we have to call the program layer correctly to retrieve the required data. (In fact, if the mysqlnd driver is used, our changes have no advantage and cannot save memory. In mysqlnd, the result set reads only the data that references the buffer zone. Libmysql has obvious effect .) The sentence"No one writes this code.".SAS that do not use mysqlnd as the php connection driver are all rogue..
Conclusion:
Mysqli is recommended for APIs, and mysqlnd is recommended for drivers.
Learn new things in a gentle manner?
After returning home, I finally spent several rounds of "defending radishes". apart from several other levels that need to be developed and unlocked, I am playing with the shameful "all clear" + "Golden Radish, one thing that suddenly occurred was that mysqli used mysqlnd in the compilation process of pdo_mysql in php5.3.8, a blog written last year, mysql uses libmysql, and later installed pdo-mysql also uses libmysql .... Specify two connection drivers for the three APIs. Is this the cause of the last error? Although the final compilation parameter solves the problem by coincidence, I didn't understand the real reason at the beginning? Verify it next week... [UPDATE]
Shame and courage?
After writing this study note today, when I went home to play the game, I remembered that mysqlnd was mentioned by laruence. I went back to see how laruence explained mysqlnd and whether I understood it incorrectly, I found that laruence has a link to the Ulf Wendel blog, but I searched the web for N long before I found the article. at the same time, I found that there were a lot of mysqlnd articles on his blog, I sneered and thought I had discovered a large gold mine .... Ah, shame... [Updated at, December 15,]
There are a lot of blind spots for the knowledge points encountered in this learning experience. you will be familiar with them in the future.
Like... Like... There are too many problems at last...
BitsCN.com