Redis Timeout Problem Analysis
Redis occupies an increasingly important position in distributed applications: in just a few tens of thousands of lines of code it implements a high-performance data storage service. Recently, the CM8 cluster in the dump center experienced several Redis timeouts, yet the memory statistics on the Redis machines showed neither a shortage of memory nor any swapping. After reading the Redis source, we found several situations in which Redis can time out. The details follow.
1. Network. Redis's processing is closely tied to the network, so a transient network interruption can easily cause Redis timeouts. If you suspect this, first check the network bandwidth statistics of the Redis machine to see whether such an interruption occurred.
2. Memory. All Redis data lives in memory. When physical memory runs short, the Linux OS falls back to swap, and any Redis command that arrives while pages are being swapped can time out. You can adjust the /proc/sys/vm/swappiness parameter to control how strongly the kernel prefers physical memory over swap.
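As a rough illustration of checking the current setting from a program, here is a minimal sketch that reads and parses the swappiness file. The function names parse_swappiness and read_swappiness are hypothetical helpers invented for this example and are not part of Redis.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse the text of a swappiness file such as "60\n".
 * Returns the non-negative value, or -1 if the text is not a valid number. */
int parse_swappiness(const char *text) {
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (errno != 0 || end == text || v < 0 || v > INT_MAX) return -1;
    return (int)v;
}

/* Hypothetical helper: read the first line of the given file and parse it.
 * Returns -1 if the file cannot be opened or read. */
int read_swappiness(const char *path) {
    char buf[32];
    char *ok;
    FILE *f = fopen(path, "r");

    if (f == NULL) return -1;
    ok = fgets(buf, sizeof buf, f);
    fclose(f);
    return ok != NULL ? parse_swappiness(buf) : -1;
}
```

Calling read_swappiness("/proc/sys/vm/swappiness") on a Linux machine returns the current value. A lower value makes the kernel less eager to swap.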
int rdbSaveBackground(char *filename) {
    pid_t childpid;
    long long start;

    if (server.rdb_child_pid != -1) return REDIS_ERR;

    server.dirty_before_bgsave = server.dirty;
    server.lastbgsave_try = time(NULL);

    start = ustime();
    if ((childpid = fork()) == 0) {
        int retval;

        /* Child */
        if (server.ipfd > 0) close(server.ipfd);
        if (server.sofd > 0) close(server.sofd);
        retval = rdbSave(filename);
        if (retval == REDIS_OK) {
            size_t private_dirty = zmalloc_get_private_dirty();

            if (private_dirty) {
                redisLog(REDIS_NOTICE,
                    "RDB: %zu MB of memory used by copy-on-write",
                    private_dirty/(1024*1024));
            }
        }
        exitFromChild((retval == REDIS_OK) ? 0 : 1);
    } else {
        /* Parent */
        server.stat_fork_time = ustime()-start;
        if (childpid == -1) {
            server.lastbgsave_status = REDIS_ERR;
            redisLog(REDIS_WARNING,"Can't save in background: fork: %s",
                strerror(errno));
            return REDIS_ERR;
        }
        redisLog(REDIS_NOTICE,"Background saving started by pid %d",childpid);
        server.rdb_save_time_start = time(NULL);
        server.rdb_child_pid = childpid;
        updateDictResizePolicy();
        return REDIS_OK;
    }
    return REDIS_OK; /* unreached */
}
Program 1
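The fork in Program 1 relies on the child keeping its own view of memory while the parent continues to modify data. The following standalone sketch, which is not Redis code, shows that behavior in isolation. The function name cow_snapshot_demo and the pipe-based handshake are assumptions made for this illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the child still sees the pre-fork data
 * after the parent overwrote its own copy, 0 if it does not, -1 on error. */
int cow_snapshot_demo(void) {
    int go[2];          /* pipe used to tell the child "parent has written" */
    int status, wrote;
    pid_t pid;
    char *data = malloc(16);

    if (data == NULL) return -1;
    if (pipe(go) == -1) { free(data); return -1; }
    strcpy(data, "before");

    pid = fork();
    if (pid == -1) {
        close(go[0]); close(go[1]); free(data);
        return -1;
    }
    if (pid == 0) {
        /* Child: wait for the parent's write, then inspect the snapshot. */
        char c;
        close(go[1]);
        if (read(go[0], &c, 1) != 1) _exit(2);
        _exit(strcmp(data, "before") == 0 ? 0 : 1);
    }

    /* Parent: this write touches a shared page, so the kernel gives the
     * parent a private copy and the child's page stays unchanged. */
    close(go[0]);
    strcpy(data, "after");
    wrote = (write(go[1], "x", 1) == 1);
    close(go[1]);
    free(data);

    if (waitpid(pid, &status, 0) == -1) return -1;
    if (!wrote || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Every page the parent writes to during the dump is duplicated in this way, which is why a write-heavy workload increases memory use while the child is running.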
RDB persistence is another situation that can trigger swap, because it can leave the machine short of physical memory. (AOF persistence only keeps appending the ongoing changes, so it is less likely to cause swap.) As Program 1 shows, when RDB persistence runs, the main process forks a child process to dump all the data in Redis while the main process keeps serving clients. At this point the parent and child share the same memory pages, and the Linux kernel uses copy-on-write to keep the child's view of the data consistent. Under copy-on-write, when a client sends a write request, the kernel copies the affected page to a new page, marks the copy writable, and applies the write there. So if requests keep arriving during an RDB dump, Redis uses more memory and is more likely to swap. If your scenario allows quick recovery, you can make the conditions that trigger an RDB dump stricter. You can also choose AOF, though AOF has shortcomings of its own. Another option is the master-slave structure available after 2.6: separating reads from writes means no single server process has to handle both.
3. Single-process command handling. Redis accepts connections over TCP and Unix domain sockets. A Redis client sends the server a message containing a Redis command, and the server performs the corresponding operation once it has received the message. Redis processes commands serially, in the following sequence. First, the server establishes the listening socket as shown in Program 2: after socket, bind, and listen, it gets back a file descriptor:
server.ipfd = anetTcpServer(server.neterr,server.port,server.bindaddr);
Program 2
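The socket, bind, listen sequence behind that call can be sketched with plain POSIX calls. This is a simplified illustration under stated assumptions, not the actual anetTcpServer implementation: the name tcp_listen_loopback is hypothetical, and the sketch binds only to the loopback address.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: create a TCP socket listening on 127.0.0.1:port.
 * Port 0 lets the kernel choose a free port.
 * Returns the listening file descriptor, or -1 on error. */
int tcp_listen_loopback(unsigned short port, int backlog) {
    struct sockaddr_in sa;
    int yes = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1) return -1;

    /* Allow quick restarts on the same port. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) == -1) {
        close(fd);
        return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(fd, (struct sockaddr *)&sa, sizeof sa) == -1 ||
        listen(fd, backlog) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned descriptor plays the role of server.ipfd: it is what later gets registered with the event loop.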
Redis has to handle thousands of connections (up to 655350), so it uses I/O multiplexing to serve many connections at once. Redis provides epoll, select, and kqueue implementations, with epoll used by default (see ae.c). After obtaining the file descriptor fd returned by the listen step, Redis registers the fd together with its handler, acceptTcpHandler, in the event loop, as shown in Program 3. Under the hood, as Program 4 shows, registering the event means the event driver adds the socket's file descriptor to the set of events that epoll monitors.
if (server.ipfd > 0 && aeCreateFileEvent(server.el,server.ipfd,AE_READABLE,
    acceptTcpHandler,NULL) == AE_ERR)
    redisPanic("Unrecoverable error creating server.ipfd file event.");

int aeCreateFileEvent(aeEventLoop *eventLoop, int fd, int mask,
        aeFileProc *proc, void *clientData)
{
    if (fd >= eventLoop->setsize) {
        errno = ERANGE;
        return AE_ERR;
    }
    aeFileEvent *fe = &eventLoop->events[fd];

    if (aeApiAddEvent(eventLoop, fd, mask) == -1)
        return AE_ERR;
    fe->mask |= mask;
    if (mask & AE_READABLE) fe->rfileProc = proc;
    if (mask & AE_WRITABLE) fe->wfileProc = proc;
    fe->clientData = clientData;
    if (fd > eventLoop->maxfd)
        eventLoop->maxfd = fd;
    return AE_OK;
}
Program 3
static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask) {
    aeApiState *state = eventLoop->apidata;
    struct epoll_event ee;
    /* If the fd was already monitored for some event, we need a MOD
     * operation. Otherwise we need an ADD operation. */
    int op = eventLoop->events[fd].mask == AE_NONE ?
            EPOLL_CTL_ADD : EPOLL_CTL_MOD;

    ee.events = 0;
    mask |= eventLoop->events[fd].mask; /* Merge old events */
    if (mask & AE_READABLE) ee.events |= EPOLLIN;
    if (mask & AE_WRITABLE) ee.events |= EPOLLOUT;
    ee.data.u64 = 0; /* avoid valgrind warning */
    ee.data.fd = fd;
    if (epoll_ctl(state->epfd,op,fd,&ee) == -1) return -1;
    return 0;
}
Program 4
After all event drivers are initialized, the main process, as shown in Program 5, calls numevents = aeApiPoll(eventLoop, tvp) to obtain the I/O-ready file descriptors and their handlers, then processes each fd in turn. The rough flow is accept() -> createClient() -> readQueryFromClient(). readQueryFromClient() reads the Redis command from the connection, and processInputBuffer() -> call() then executes the command.
void aeMain(aeEventLoop *eventLoop) {
    eventLoop->stop = 0;
    while (!eventLoop->stop) {
        if (eventLoop->beforesleep != NULL)
            eventLoop->beforesleep(eventLoop);
        aeProcessEvents(eventLoop, AE_ALL_EVENTS);
    }
}

int aeProcessEvents(aeEventLoop *eventLoop, int flags)
{
    /* ... (code omitted in the original) ... */
    numevents = aeApiPoll(eventLoop, tvp);
    for (j = 0; j < numevents; j++) {
        aeFileEvent *fe = &eventLoop->events[eventLoop->fired[j].fd];
        int mask = eventLoop->fired[j].mask;
        int fd = eventLoop->fired[j].fd;
        int rfired = 0;

        /* note the fe->mask & mask & ... code: maybe an already processed
         * event removed an element that fired and we still didn't
         * processed, so we check if the event is still valid. */
        if (fe->mask & mask & AE_READABLE) {
            rfired = 1;
            fe->rfileProc(eventLoop,fd,fe->clientData,mask);
        }
        if (fe->mask & mask & AE_WRITABLE) {
            if (!rfired || fe->wfileProc != fe->rfileProc)
                fe->wfileProc(eventLoop,fd,fe->clientData,mask);
        }
        processed++;
    }
}
Program 5
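To see the register, wait, dispatch cycle outside Redis, here is a minimal Linux-only sketch that uses epoll directly on a pipe. It is an illustration under stated assumptions rather than Redis's ae.c: the names drain_handler and epoll_dispatch_once are hypothetical, and a pipe stands in for a client socket.

```c
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical read handler: consume whatever is readable on fd.
 * Returns the number of bytes read, or -1 on error. */
static int drain_handler(int fd) {
    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);
    return n < 0 ? -1 : (int)n;
}

/* Hypothetical demo: register the read end of a pipe with epoll, make it
 * readable, wait once, and run the handler for each ready descriptor one
 * after another. Returns the total bytes consumed, or -1 on error. */
int epoll_dispatch_once(void) {
    struct epoll_event ee, fired[8];
    int p[2], ep, n, j, total = 0;

    if (pipe(p) == -1) return -1;
    ep = epoll_create1(0);
    if (ep == -1) { close(p[0]); close(p[1]); return -1; }

    memset(&ee, 0, sizeof ee);
    ee.events = EPOLLIN;          /* interested in readability */
    ee.data.fd = p[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ee) == -1) goto fail;

    if (write(p[1], "PING\n", 5) != 5) goto fail;

    n = epoll_wait(ep, fired, 8, 1000);   /* wait up to one second */
    if (n == -1) goto fail;

    for (j = 0; j < n; j++) {             /* handlers run serially */
        if (fired[j].events & EPOLLIN) {
            int got = drain_handler(fired[j].data.fd);
            if (got < 0) goto fail;
            total += got;
        }
    }
    close(ep); close(p[0]); close(p[1]);
    return total;

fail:
    close(ep); close(p[0]); close(p[1]);
    return -1;
}
```

The for loop is the important part: each ready descriptor is handled to completion before the next one starts, just as in Program 5.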
The code above shows that Redis combines the ae event driver with epoll multiplexing and processes commands serially. As a result, slow commands such as SORT, HGETALL, union operations, and MGET lengthen the processing time of a single command, which can cause the commands queued behind them to time out. There are two remedies. First, avoid slow commands at the business level, for example by replacing a hash with a plain key-value entry that the application parses itself. Second, increase the number of Redis instances so that each Redis server runs as few slow commands as possible.
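The effect of one slow command on the commands behind it can be shown with simple arithmetic. The sketch below is a simplified model, not a measurement of Redis: it assumes all commands are already queued at time zero and are executed strictly one after another. The function name count_timeouts is hypothetical.

```c
#include <stddef.h>

/* Hypothetical model of serial execution: cost_ms[i] is the processing time
 * of command i, and all commands are assumed to be queued at time 0.
 * Returns how many commands finish later than timeout_ms. */
size_t count_timeouts(const long *cost_ms, size_t n, long timeout_ms) {
    long clock_ms = 0;
    size_t late = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Command i can only finish after every earlier command is done. */
        clock_ms += cost_ms[i];
        if (clock_ms > timeout_ms) late++;
    }
    return late;
}
```

With costs of 1, 500, 1, and 1 ms and a 100 ms client timeout, the commands finish at 1, 501, 502, and 503 ms, so the slow command and both fast commands behind it time out. With four 1 ms commands, none do.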
Article from: http://www.searchtb.com/2014/02/redis-timeout.html