Two times since the public survey of the full-service maintenance, have encountered an individual server outage time unusually long situation, specific performance for the shutdown process blocked in MySQL disk, to store data has not been reduced, continued about one hours to two hours after the continued storage, successfully stopped the service, and SQLError error "Lost Connection to MySQL server during query ".
These two days colleagues query some information to figure out the cause of the problem, our server outage moment will be like MySQL sent a lot of data to store, the amount of data is too large to trigger the MySQLnet_read_timeout超时,MySQL报错断开连接,而服务器这边无法直接恢复,只能等待网络连接超时断开之后重连成功了继续存储,最终存储完成停服。
Two modifications, one is to increase the net_read_timeout time-out period according to the official website, the other is to shorten the Odbc.ini ReadTimeout and WriteTimeout time-out. The meaning of the former is that the threshold is increased do not trigger the MySQL error disconnect, the latter is the meaning of the error after disconnection can resume the reconnection as soon as possible to continue storage. The former needs to restart MySQL, there is a risk of maintenance, we decided to first adopt the second solution.
A description of the problem
November 4 Server outage maintenance, 2023 server disk blocking.
November 11 Server outage maintenance, 2028,2096,10018 server disk blocking.
The specific data are as follows:
Server ID |
Error time |
Dbserviceexec slow heartbeat for the longest time when an error occurs |
2028 |
2014-11-11_08:28:01 |
6013009ms (100min) |
2096 |
2014-11-11_07:06:09 |
6000006ms (100min) |
10018 |
2014-11-11_08:42:32 |
6601007ms (110min) |
2023 |
2014-11-04_05:59:47 |
2950005MS (49min) |
The above server abnormal performance is similar, dbserviceexec slow heartbeat time is very long, the following error log appears, the storage is back to normal, the server stopped serving normally.
[UnixODBC] [MySQL] [ODBC 3.51 Driver] [Mysqld-5.5.20-rel24.1-log] Lost connection to MySQL server during query
According to the official website, this problem usually occurs in the storage of large amounts of data, stop to save to meet this situation.
See the Red section for detailed information.
b.5.2.3 Lost connection to MySQL server
There is three likely causes for this error message.
usually it indicates network connectivity trouble and you should check the Condition of your network if this error occurs frequently. if the error message includes "During query," this is probably the case you are experiencing.
sometimes the "During query" form happens when millions of rows are being sent as part of one or more queries. if you know that this is happening, you should try increasing net_read_timeout from its default of 30 seconds to 60 seconds or longer, sufficient for the data transfer to complete.
More rarely, it can happen when the client is attempting the initial connection to the server. In this case, if yourconnect_timeout value is set to only a few seconds, you may be able to resolve the problem by increasing it to Ten seconds, perhaps more if you have a very long distance or slow connection. You can determine whether you are Experiencing this more uncommon cause by using show global status LIKE ' aborted_connects ' . it will increase by one for each Initial connection attempt that the server aborts. you may see "Reading authorizaTion packet " as part of the error message; if so, that Also suggests that this is the solution that you need.
If The cause is none of the those just described, you may be experiencing a problem with BLOB values that's larger than max_ Allowed_packet, which can cause this error with some clients. Sometime you are aner_net_packet_too_large error, and that confirms the need to increase max_allowed_packet.
Two-way solution
1 The official website gives the modification net_read_timeout configuration item. This modification requires a restart of MySQL.
The 2 error log is the client side of MySQL that is reported after the server side has no response timeout and can reduce the response wait time by modifying the client side's timeout configuration item. This modification requires modifying the Odbc.ini configuration file.
The Readtimeout=n,n unit is seconds and the actual wait time is 3*n seconds.
Writetimeout=n,n unit is seconds, the actual wait time is net_retry_count* N seconds, mysql default configuration net_retry_count to 10,tcp/ip default time-out is 10min, so, You can see that the maximum timeout time is 100min. and stopping the abnormal blocking time basically coincide.
Based on the November 10 2000 service log, summarizes the time to write and read data to the database, the recommended configuration item values are as follows:
ReadTimeout = 300, in seconds, the actual wait time is 900 seconds, because each connection processing is millisecond, so the main role is to reduce the wait time-out.
WriteTimeout = 180, in seconds, the actual wait time is 1800 seconds, because each connection processing is millisecond, so the main role is to reduce the wait time-out.
Two configuration items specific information is shown in the following table:
ReadTimeout |
The timeout in seconds-attempts to read from the server. Each attempt uses this timeout value and there is retries if necessary, The effective timeout value is three times the option value. You can set the value of so, a lost connection can be detected Earlier than the TCP/IP close_wait_timeout value of ten minutes. This option works with only for TCP/IP connections, and only for Windows prior To MySQL 5.1.12. Corresponds to the mysql_opt_read_timeoutoption of the MYSQL Client Library. Added in 3.51.27. |
WriteTimeout |
The timeout in seconds for attempts to write to the server. Each attempt uses this timeout value and there is net_retry_countretries if necessary, so the total effective timeout value isnet_retry_count times the option value. This option works with only for TCP/IP Connections, and only for Windows prior to MySQL 5.1.12. Corresponds to the mysql_opt_write_timeout option of the MYSQL Client Library. Added in 3.51.27. |
Modifying the ReadTimeout and writetimeout is mainly to configure the socket read-write timeout, the configuration entry principle is as follows:
+void vio_timeout (vio *vio, timeout_type which, uint milliseconds)
{
#ifdef __WIN__
-ULONG wait_timeout= (ULONG) timeout * 1000;
-(void) setsockopt (VIO->SD, Sol_socket,
-Which? So_sndtimeo:so_rcvtimeo, (char*) &wait_timeout,
-sizeof (wait_timeout));
+ ULONG Wait_timeout;
+ #else
+ struct Timeval TV;
+ #endif/* __win__ */
+ int optname;
+ switch (which)
+ {
+ Case Vio_read_timeout:
+ optname = So_rcvtimeo;
+ Break;
+ Case Vio_write_timeout:
+ optname = So_sndtimeo;
+ Break;
+ Default:
+ return;
+ }
+ #ifdef __WIN__
+ wait_timeout= (ULONG) milliseconds;
+ setsockopt (VIO->SD, Sol_socket, optname, (char*) &wait_timeout,
+ sizeof (wait_timeout));
+ #else
+ tv.tv_sec = milliseconds/1000;
+ tv.tv_usec = milliseconds% 1000 * 1000;
+ setsockopt (VIO->SD, Sol_socket, optname, &TV, sizeof (TV));
#endif/* __win__ */
}