Symptom: a background subscription thread quits for no apparent reason, leaving statistics and monitoring stale for up to 5 hours.
Log:
2015-04-13 05:00:00.256 ERROR [Message SubScribe monitor][SubScribeManager.java:127] - subscription thread quits for no reason
com.lingyu.common.core.ServiceException: redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
    at com.lingyu.common.db.Redis.subscribe(Redis.java:1439) ~[Redis.class:?]
    at com.lingyu.common.db.SubScribeManager.run(SubScribeManager.java:125) ~[SubScribeManager.class:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.7.0_65]
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
    at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:198) ~[jedis-2.6.2.jar:?]
    at redis.clients.util.RedisInputStream.read(RedisInputStream.java:180) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.Protocol.processBulkReply(Protocol.java:158) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.Protocol.process(Protocol.java:132) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.Protocol.processMultiBulkReply(Protocol.java:183) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.Protocol.process(Protocol.java:134) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.Protocol.read(Protocol.java:192) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:282) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.Connection.getRawObjectMultiBulkReply(Connection.java:227) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.JedisPubSub.process(JedisPubSub.java:108) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.JedisPubSub.proceed(JedisPubSub.java:102) ~[jedis-2.6.2.jar:?]
    at redis.clients.jedis.Jedis.subscribe(Jedis.java:2496) ~[jedis-2.6.2.jar:?]
    at com.lingyu.common.db.Redis.subscribe(Redis.java:1435) ~[Redis.class:?]
    ... 2 more
The subscribe call was already wrapped in try { } catch (Exception e) { }, yet the thread still exited, which was very puzzling.
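Roughly, the subscriber loop needs to look something like the sketch below (host/port and channel name are placeholders, not our real ones). The key point is that Jedis.subscribe() blocks until the server drops the connection and then returns or throws, so the surrounding loop has to go back and subscribe again instead of letting the thread fall through and die:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;
import redis.clients.jedis.exceptions.JedisConnectionException;

public class ResilientSubscriber implements Runnable {

    // placeholder host/port and channel name; substitute the real ones
    private static final String HOST = "127.0.0.1";
    private static final int PORT = 6380;
    private static final String CHANNEL = "activity";

    @Override
    public void run() {
        JedisPubSub listener = new JedisPubSub() {
            @Override
            public void onMessage(String channel, String message) {
                // handle the published message here
            }
        };
        while (!Thread.currentThread().isInterrupted()) {
            Jedis jedis = new Jedis(HOST, PORT);
            try {
                // subscribe() blocks until the connection is closed by the server
                jedis.subscribe(listener, CHANNEL);
            } catch (JedisConnectionException e) {
                // the server dropped the connection (e.g. output buffer limit hit);
                // log it, back off briefly, then loop around and subscribe again
                try {
                    Thread.sleep(1000L);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            } finally {
                jedis.disconnect();
            }
        }
    }
}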
Found this article:
https://github.com/xetorthio/jedis/issues/932
client-output-buffer-limit was the cause: redis-server closed the connection, which produced the exception above.
client-output-buffer-limit
Client output-buffer control. For each connection between a client and the server, Redis keeps a buffer that queues the replies waiting to be read by the client. If the client cannot consume replies fast enough, the buffer keeps growing and puts memory pressure on the server; once the backlog in the buffer reaches the configured threshold, the connection is closed and the buffer is freed.
There are three buffer classes: normal for ordinary client connections, slave for replication connections, and pubsub for pub/sub connections. The pubsub class is the one that most often triggers this problem, because the publisher can emit messages densely while the subscriber may not consume them fast enough.
Directive format: client-output-buffer-limit <class> <hard limit> <soft limit> <soft seconds>. The hard limit closes the connection as soon as the buffer exceeds it. The soft limit is the "tolerable" value and works together with the seconds value: if the buffer stays above the soft limit for the full duration, the connection is closed; if it drops back below the soft limit before the time is up, the connection is kept.
Setting both hard and soft to 0 disables buffer control. Normally the hard value is larger than the soft value.
Adjusting the parameter in production (change it both on the running instance and in the configuration file so they stay in sync):
127.0.0.1:6380> CONFIG GET client-output-buffer-limit
1) "client-output-buffer-limit"
2) "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60"
127.0.0.1:6380> config set client-output-buffer-limit "normal 0 0 0 slave 268435456 67108864 60 pubsub 0 0 0"
redis.conf:
client-output-buffer-limit pubsub 0 0 0
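To check whether a subscriber really is falling behind (before or after loosening the limit), the omem field of CLIENT LIST shows how much reply data is queued per connection. A rough sketch with Jedis, assuming clientList() is available in the Jedis version in use (the address is a placeholder):

import redis.clients.jedis.Jedis;

public class PubSubBufferCheck {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6380);   // placeholder address
        try {
            // CLIENT LIST returns one line per connection; pub/sub clients have
            // sub= / psub= counts greater than 0, and omem= shows how many bytes
            // of replies are queued in that connection's output buffer
            for (String line : jedis.clientList().split("\n")) {
                System.out.println(line);
            }
        } finally {
            jedis.disconnect();
        }
    }
}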
===========================================================
Yesterday I ran into a problem where the back-office activity upload for the Russian region kept failing. Our back-office activities are pushed out via Redis publish, but publish was failing: publishing small payloads worked fine, large payloads failed, and even info could not be viewed.
It was later confirmed that the cause was a limit on packet size in the LAN: packets over a certain size could not get through, which caused the connection to fail. Once the limit was lifted, the problem was resolved.
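For what it's worth, a quick way to reproduce the "small payloads succeed, large payloads fail" behaviour is to publish progressively larger messages until publish breaks; a rough sketch (the address and channel name are made up):

import java.util.Arrays;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.exceptions.JedisConnectionException;

public class PublishSizeProbe {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6380);   // placeholder address
        try {
            // double the payload size until publish starts failing
            for (int size = 1024; size <= 16 * 1024 * 1024; size *= 2) {
                char[] payload = new char[size];
                Arrays.fill(payload, 'x');
                try {
                    jedis.publish("activity.test", new String(payload));   // hypothetical channel
                    System.out.println("ok at " + size + " bytes");
                } catch (JedisConnectionException e) {
                    System.out.println("failed at " + size + " bytes: " + e.getMessage());
                    break;
                }
            }
        } finally {
            jedis.disconnect();
        }
    }
}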
=============================================================
6,470,672 users.
On average, each user occupies about 887 bytes of memory, 608 bytes in the AOF file, and 344 bytes in the RDB file.
The file cache needs roughly an extra half of the memory footprint on top of that, so the total footprint is probably around 10 GB.
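As a rough check on those averages: 6,470,672 × 887 bytes ≈ 5.7 GB of resident memory, and adding half again for the file cache gives about 8.6 GB, which is on the order of the ~10 GB estimate above.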
=============================================================
Slow query log: SLOWLOG GET
127.0.0.1:6379> slowlog get
1) 1) (integer)
   2) (integer) 1417531320
   3) (integer) 24623
   4) 1) "info"
The four fields of each entry are:
1. A unique, progressive identifier for every slow log entry (the slow log serial number).
2. The Unix timestamp at which the logged command was processed.
3. The amount of time needed for its execution, in microseconds (note: microseconds, not milliseconds).
4. The array composing the arguments of the command.
SLOWLOG LEN returns the total number of entries in the slow log.
SLOWLOG GET <n> returns the most recent n entries.
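The same data can also be read from client code; a rough sketch using the Jedis slowlog helpers (assuming slowlogLen/slowlogGet are present in the Jedis version in use, placeholder address):

import java.util.List;

import redis.clients.jedis.Jedis;
import redis.clients.util.Slowlog;

public class SlowlogDump {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);   // placeholder address
        try {
            System.out.println("slow log entries: " + jedis.slowlogLen());
            // fetch the 10 most recent slow commands
            List<Slowlog> entries = jedis.slowlogGet(10);
            for (Slowlog entry : entries) {
                System.out.println(entry.getId() + " "
                        + entry.getTimeStamp() + " "
                        + entry.getExecutionTime() + "us "
                        + entry.getArgs());
            }
        } finally {
            jedis.disconnect();
        }
    }
}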
=======================================
The commandstats section records execution statistics for each kind of command: the number of calls, the total CPU time consumed by those calls in microseconds (usec), the average CPU time per call in microseconds (usec_per_call), and so on. For each command, the section adds one line in the following format:
cmdstat_XXX:calls=XXX,usec=XXX,usec_per_call=XXX
10.104.5.98:6379> info commandstats
# Commandstats
cmdstat_get:calls=180608685,usec=470928529,usec_per_call=2.61
cmdstat_set:calls=147550519,usec=562225572,usec_per_call=3.81
cmdstat_del:calls=177224,usec=1643815,usec_per_call=9.28
cmdstat_exists:calls=14130110,usec=31402378,usec_per_call=2.22
cmdstat_incr:calls=1017,usec=3261,usec_per_call=3.21
cmdstat_mget:calls=666034,usec=18069595,usec_per_call=27.13
cmdstat_lpush:calls=103077132,usec=181583996,usec_per_call=1.76
cmdstat_lrange:calls=38777511,usec=138617427,usec_per_call=3.57
cmdstat_ltrim:calls=2056,usec=7622,usec_per_call=3.71
cmdstat_lrem:calls=103075076,usec=579401111,usec_per_call=5.62
cmdstat_zadd:calls=15900133,usec=56515414,usec_per_call=3.55
cmdstat_zincrby:calls=11747959,usec=196212310,usec_per_call=16.70
cmdstat_zrem:calls=257783,usec=1053833,usec_per_call=4.09
cmdstat_zrange:calls=7141527,usec=41950470,usec_per_call=5.87
cmdstat_zrevrangebyscore:calls=10,usec=51489,usec_per_call=5148.90
cmdstat_zcount:calls=16104028,usec=112221789,usec_per_call=6.97
cmdstat_zrevrange:calls=27497771,usec=582807534,usec_per_call=21.19
cmdstat_zscore:calls=8663683,usec=44001575,usec_per_call=5.08
cmdstat_zrank:calls=3,usec=43,usec_per_call=14.33
cmdstat_zrevrank:calls=15906400,usec=68891802,usec_per_call=4.33
cmdstat_hset:calls=10236125,usec=37507245,usec_per_call=3.66
cmdstat_hget:calls=1618802100,usec=2755577270,usec_per_call=1.70
cmdstat_hmset:calls=369619411,usec=4843444966,usec_per_call=13.10
cmdstat_hmget:calls=56015,usec=344231,usec_per_call=6.15
cmdstat_hincrby:calls=170633471,usec=884820311,usec_per_call=5.19
cmdstat_hdel:calls=44233,usec=201881,usec_per_call=4.56
cmdstat_hlen:calls=21724,usec=39834,usec_per_call=1.83
cmdstat_hgetall:calls=311374011,usec=3269118749,usec_per_call=10.50
cmdstat_hexists:calls=70864759,usec=285319509,usec_per_call=4.03
cmdstat_incrby:calls=2942269,usec=42251052,usec_per_call=14.36
cmdstat_decrby:calls=2050,usec=3616,usec_per_call=1.76
cmdstat_rename:calls=6472,usec=33326,usec_per_call=5.15
cmdstat_keys:calls=3636,usec=1974535725,usec_per_call=543051.62
cmdstat_dbsize:calls=9,usec=15,usec_per_call=1.67
cmdstat_ping:calls=46747,usec=61691,usec_per_call=1.32
cmdstat_type:calls=1,usec=3,usec_per_call=3.00
cmdstat_psync:calls=1,usec=3164,usec_per_call=3164.00
cmdstat_replconf:calls=21643928,usec=25568830,usec_per_call=1.18
cmdstat_info:calls=4,usec=3669,usec_per_call=917.25
cmdstat_config:calls=2,usec=37,usec_per_call=18.50
cmdstat_subscribe:calls=45505,usec=476993,usec_per_call=10.48
cmdstat_publish:calls=34572782,usec=262298295,usec_per_call=7.59
cmdstat_client:calls=3,usec=47628,usec_per_call=15876.00
cmdstat_eval:calls=2050,usec=76432,usec_per_call=37.28
cmdstat_slowlog:calls=1,usec=30,usec_per_call=30.00
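As one rough way to make the dump above easier to read, the section can be parsed and sorted by usec_per_call so the most expensive commands float to the top; a sketch (reusing the address of the instance shown above):

import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

import redis.clients.jedis.Jedis;

public class CommandStatsReport {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("10.104.5.98", 6379);
        try {
            // sort commands by average cost per call, most expensive first
            // (commands with an identical average overwrite each other; good
            // enough for a quick look)
            Map<Double, String> byCost = new TreeMap<Double, String>(Collections.reverseOrder());
            for (String line : jedis.info("commandstats").split("\r?\n")) {
                if (!line.startsWith("cmdstat_")) {
                    continue;   // skip the "# Commandstats" header line
                }
                // line format: cmdstat_get:calls=...,usec=...,usec_per_call=2.61
                double perCall = Double.parseDouble(line.substring(line.lastIndexOf('=') + 1));
                byCost.put(perCall, line);
            }
            for (String line : byCost.values()) {
                System.out.println(line);
            }
        } finally {
            jedis.disconnect();
        }
    }
}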
Deploying Redis 2.8.23 produces two warnings:
[32555] 12:06:37.804 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
[32555] 12:06:37.804 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
Just follow the prompts:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo 511 > /proc/sys/net/core/somaxconn
and add both lines to /etc/rc.local so the settings survive a reboot.
==============================================
Redis "Misconf Redis is configured to save RDB snapshots, but was currently not able-persist on disk"
Possible causes:
1. vm.overcommit_memory=1 is not configured.
2. Insufficient disk space.
3. The RDB file or its directory was deleted; workaround:
CONFIG SET dir /data/redis/
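A rough sketch of checking and working around the error from client code, using the standard INFO/CONFIG/BGSAVE calls (the directory is just the example path from above, the address is a placeholder):

import redis.clients.jedis.Jedis;

public class MisconfWorkaround {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("127.0.0.1", 6379);   // placeholder address
        try {
            // rdb_last_bgsave_status:err in this section confirms a failed background save
            System.out.println(jedis.info("persistence"));

            // point Redis at a directory that exists and is writable again
            jedis.configSet("dir", "/data/redis/");

            // trigger a new snapshot; once it succeeds, writes are accepted again
            jedis.bgsave();
        } finally {
            jedis.disconnect();
        }
    }
}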