標籤:request failed failure mysql nested
在部分12580項目中,我引入了haproxy做為軟負載。主要為一些非關鍵性的業務提供服務。架構比較簡單。兩台web前段,通過haproxy調用後端的兩台mysql(slave),主要用於搜尋。不寫入資料。
自從上線後,開發人員就偶爾反饋說,有時開啟web頁面查詢資料,報錯如下:
HTTP Status 500 - Request processing failed; nested exception is org.springframework.transaction.CannotCreateTransactionException: Could not open JDBC Connection for transaction; nested exception is com.mysql.jdbc.CommunicationsException: Communications link failure due to underlying exception:
type Exception report
message Request processing failed; nested exception is org.springframework.transaction.CannotCreateTransactionException: Could not open JDBC Connection for transaction; nested exception is com.mysql.jdbc.CommunicationsException: Communications link failure due to underlying exception:
description The server encountered an internal error that prevented it from fulfilling this request.
exception
org.springframework.web.util.NestedServletException: Request processing failed; nested exception is org.springframework.transaction.CannotCreateTransactionException: Could not open JDBC Connection for transaction; nested exception is com.mysql.jdbc.CommunicationsException: Communications link failure due to underlying exception:
一開始以為是前端web伺服器的spring組件問題。後來跳過haproxy直接連接mysql資料。沒有問題。說明通過haproxy才出現此問題。
下面是haproxy的配置,其實非常簡單:
frontend acdb_ms3306
bind 192.100.2.247:3306
mode tcp
option tcplog
maxconn 20000
default_backend backacdb_ms3306
backend backacdb_ms3306
mode tcp
balance roundrobin
server web01 172.200.2.3:3306 check inter 2000 fall 3
server web02 172.200.2.5:3306 check inter 2000 fall 3
百思不得其解。只能查看日誌。內容大致都是如下的各式:
192.100.2.43:50597 [30/Jun/2015:21:57:40.094] acdb_ms3306 backacdb_ms3306/web01 1/1/83378 1350324 cD 2/2/2/1/0 0/0
192.100.2.41:56074 [30/Jun/2015:21:57:40.118] acdb_ms3306 backacdb_ms3306/web02 1/1/69986 10387 cD 3/3/3/1/0 0/0
沒有發現特別的地方。後來翻牆google了好多地方。發現一片文章說,修改參數tmeout client。haproxy的日誌有個標示就改變了。回頭在看看我的,和他一樣。在上面的日誌中。存在cD標誌。但不知道什麼意思。搜尋haproxy配置文旦,說明如下:
cD The client did not send nor acknowledge any data for as long as the
"timeout client" delay. This is often caused by network failures on
the client side, or the client simply leaving the net uncleanly.
那就設定長一點吧。官方解釋如下:
timeout client <timeout>
Set the maximum inactivity time on the client side.
May be used in sections : defaults | frontend | listen | backend
yes | yes | yes | no
Arguments :
<timeout> is the timeout value specified in milliseconds by default, but
can be in any other unit if the number is suffixed by the unit,
as explained at the top of this document.
這裡注意,預設單位是毫秒。
我在default位置修改。
timeout client 30s (預設配置為timeout client 1m,這裡單位應該是1毫秒。)
重啟haproxy。再看日誌。發現如下:
192.100.2.41:40963 [30/Jun/2015:22:24:02.904] acdb_ms3306 backacdb_ms3306/web01 1/0/20509 9224 sD 2/1/1/0/0 0/0
192.100.2.41:41893 [30/Jun/2015:22:23:56.246] acdb_ms3306 backacdb_ms3306/web02 1/0/30066 11172 sD 2/1/1/0/0 0/0
標示改為sD,啥意思:
sD The server did not send nor acknowledge any data for as long as the
"timeout server" setting during the data phase. This is often caused
by too short timeouts on L4 equipments before the server (firewalls,
load-balancers, ...), as well as keep-alive sessions maintained
between the client and the server expiring first on haproxy.
好吧,再把timeout server 改長一點
timeout server 30s (預設配置為timeout server 1m,這裡單位應該是1毫秒。)
再看日誌。哈哈。都是ok的。運行了一段時間。再也沒有報串連打不開的情況了。
172.200.1.9:26387 [30/Jun/2015:21:59:40.275] acdb_ms3306 backacdb_ms3306/web01 1/0/4 160 -- 0/0/0/0/0 0/0
172.200.1.9:46303 [30/Jun/2015:21:59:40.280] acdb_ms3306 backacdb_ms3306/web02 1/0/11 6094 -- 0/0/0/0/0 0/0
Haproxy: web前段調用後面的mysql資料,不週期性報錯