Redis AE Asynchronous Event Module (Distributed Storage)

Source: Internet
Author: User
Tags cas epoll event listener goto memcached valgrind
Start with a question: why is Redis faster than memcached?
The common assumption is that memcached works purely in memory while Redis also persists data, so even with asynchronous persistence Redis cannot be faster than memcached.
In actual tests, however, Redis basically comes out well ahead.


There are two possible reasons:
1. libevent: memcached uses it and Redis chose not to. To stay general-purpose, libevent carries a large code base and gives up some performance on any particular platform. Redis has always favored a small design with few library dependencies.
2. CAS: CAS (check-and-set) is memcached's convenient way to keep concurrent writers from overwriting each other's changes. Implementing it requires a hidden CAS token for every cache key. The token acts as a version number for the value and is incremented on every set.

This costs both CPU and memory. The overhead per key is small, but on a single instance with 10 GB+ of cache and tens of thousands of QPS it adds up to subtle performance differences.
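To make the CAS cost concrete, here is a minimal sketch of a versioned item. It is an illustrative model and not memcached's code; the names `item`, `item_set`, and `item_cas` are hypothetical. Each item carries an extra 64-bit token (the memory cost), and every write updates and compares it (the CPU cost).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache item: the value plus a hidden version token. */
typedef struct {
    int value;
    uint64_t cas;   /* incremented on every successful write */
} item;

/* Unconditional write: always bumps the token. */
void item_set(item *it, int value) {
    assert(it != 0);
    it->value = value;
    it->cas++;
}

/* Conditional write: succeeds only if the caller saw the latest token.
 * Returns 1 on success, 0 if another writer got there first. */
int item_cas(item *it, int value, uint64_t expected) {
    assert(it != 0);
    if (it->cas != expected) return 0;
    item_set(it, value);
    return 1;
}
```

A client that read the item, remembered its token, and then lost a race with another `item_set` gets 0 back from `item_cas` and has to re-read before retrying.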
Redis wraps its event handling in the reactor pattern and adds support for timed events.
Redis processes events in a single process with a single thread, and the classic reactor pattern handles events serially.
That is, if one event handler blocks for too long, the whole Redis server is blocked.


The following is a brief analysis of the Redis AE event-handling model.


The code shows that it supports epoll, select, kqueue, and Solaris event ports.
It drives two main kinds of events:
1. I/O events (file events), covering read events and write events.
2. Timer events, covering one-shot timers and periodic timers.
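In ae, the difference between a one-shot and a periodic timer comes from the handler's return value: returning `AE_NOMORE` (-1) retires the timer, and any other value is the number of milliseconds until it fires again. The sketch below models that convention in isolation; the `timer` struct, `fire_timer`, and both callbacks are hypothetical names for illustration, not Redis code.

```c
#include <assert.h>
#include <stddef.h>

#define TIMER_NOMORE -1   /* mirrors the AE_NOMORE convention */

typedef int timer_proc(long long id, void *client_data);

typedef struct timer {
    long long id;
    long long when_ms;    /* absolute due time in milliseconds */
    timer_proc *proc;
    void *client_data;
    int active;           /* 0 once the timer has been retired */
} timer;

/* Run one due timer at time now_ms.
 * Returns 1 if it was rescheduled, 0 if it was retired. */
int fire_timer(timer *t, long long now_ms) {
    assert(t != NULL && t->proc != NULL);
    int next = t->proc(t->id, t->client_data);
    if (next == TIMER_NOMORE) {
        t->active = 0;
        return 0;
    }
    t->when_ms = now_ms + next;
    return 1;
}

/* One-shot: count the call, then ask to be removed. */
int one_shot(long long id, void *client_data) {
    (void)id;
    ++*(int *)client_data;
    return TIMER_NOMORE;
}

/* Periodic: count the call, then ask to run again in 100 ms. */
int every_100ms(long long id, void *client_data) {
    (void)id;
    ++*(int *)client_data;
    return 100;
}
```

The event loop itself does not need to know which kind of timer it is running; the callback decides on each firing.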

Basic data structures (ae.h):


Event handler interfaces (function types, used through pointers):

    /* File event handler */
    typedef void aeFileProc(struct aeEventLoop *eventLoop, int fd, void *clientData, int mask);

    /* Time event handler; the return value is the interval until the timer fires again */
    typedef int aeTimeProc(struct aeEventLoop *eventLoop, long long id, void *clientData);
    typedef void aeEventFinalizerProc(struct aeEventLoop *eventLoop, void *clientData);

    /* Called from aeMain before each round of event processing */
    typedef void aeBeforeSleepProc(struct aeEventLoop *eventLoop);

    /* File event structure */
    typedef struct aeFileEvent {
        /* Read and/or write; also marks whether this slot is in use */
        int mask; /* one of AE_(READABLE|WRITABLE) */
        /* Handler for read events */
        aeFileProc *rfileProc;
        /* Handler for write events */
        aeFileProc *wfileProc;
        /* Data passed to the two handlers above */
        void *clientData;
    } aeFileEvent;

    /* Time event structure */
    typedef struct aeTimeEvent {
        /* Unique identifier of the time event, also used to delete it */
        long long id;
        long when_sec; /* seconds */
        long when_ms;  /* milliseconds */
        /* Handler for the time event */
        aeTimeProc *timeProc;
        /* Finalizer; if set, it is called when the time event is deleted */
        aeEventFinalizerProc *finalizerProc;
        void *clientData;
        struct aeTimeEvent *next;
    } aeTimeEvent;

    /* Holds an event that has fired */
    typedef struct aeFiredEvent {
        int fd;
        int mask;
    } aeFiredEvent;
    /* State of an event based program */
    typedef struct aeEventLoop {
        /* Highest file descriptor value */
        int maxfd;   /* highest file descriptor currently registered */
        /* Maximum number of file descriptors that can be watched */
        int setsize; /* max number of file descriptors tracked */
        long long timeEventNextId;
        /* Used to detect system time changes (the test is now < lastTime) */
        time_t lastTime;     /* Used to detect system clock skew */
        /* Registered file events; the dispatch table is indexed directly
         * by fd, which is how events are demultiplexed */
        aeFileEvent *events; /* Registered events */
        /* Events that have fired */
        aeFiredEvent *fired; /* Fired events */
        aeTimeEvent *timeEventHead;
        /* Stop flag, 1 means stop */
        int stop;
        /* Backend-specific data; for epoll this struct holds the epoll fd
         * and the epoll_event array */
        void *apidata; /* This is used for polling API specific data */
        /* Called before each call to aeProcessEvents (that is, before
         * sleeping when no event is pending) */
        aeBeforeSleepProc *beforesleep;
    } aeEventLoop;


1. aeCreateEventLoop

The underlying epoll multiplexer is initialized and then stored in the void * apidata field of aeEventLoop, which hides the backend implementation.

    typedef struct aeApiState {
        int epfd;
        struct epoll_event *events;
    } aeApiState;

    /* Create and initialize the backend data for ae */
    static int aeApiCreate(aeEventLoop *eventLoop) {
        aeApiState *state = zmalloc(sizeof(aeApiState));

        if (!state) return -1;
        /* Allocate setsize epoll_event entries */
        state->events = zmalloc(sizeof(struct epoll_event)*eventLoop->setsize);
        if (!state->events) {
            zfree(state);
            return -1;
        }
        state->epfd = epoll_create(1024); /* 1024 is just a hint for the kernel */
        if (state->epfd == -1) {
            zfree(state->events);
            zfree(state);
            return -1;
        }
        eventLoop->apidata = state;
        return 0;
    }

    /* Create the event loop; setsize is the maximum number of events and,
     * for epoll, the number of epoll_event entries */
    aeEventLoop *aeCreateEventLoop(int setsize) {
        aeEventLoop *eventLoop;
        int i;

        /* Allocate memory for the struct */
        if ((eventLoop = zmalloc(sizeof(*eventLoop))) == NULL) goto err;
        eventLoop->events = zmalloc(sizeof(aeFileEvent)*setsize);
        eventLoop->fired = zmalloc(sizeof(aeFiredEvent)*setsize);
        if (eventLoop->events == NULL || eventLoop->fired == NULL) goto err;
        /* At most setsize events are tracked */
        eventLoop->setsize = setsize;
        eventLoop->lastTime = time(NULL);
        eventLoop->timeEventHead = NULL;
        eventLoop->timeEventNextId = 0;
        eventLoop->stop = 0;
        eventLoop->maxfd = -1;
        eventLoop->beforesleep = NULL;
        /* Create the backend I/O data; for epoll, the epoll_event array
         * and the epoll fd */
        if (aeApiCreate(eventLoop) == -1) goto err;
        /* Events with mask == AE_NONE are not set. So let's initialize the
         * vector with it. */
        for (i = 0; i < setsize; i++)
            eventLoop->events[i].mask = AE_NONE;
        return eventLoop;

    err:
        if (eventLoop) {
            zfree(eventLoop->events);
            zfree(eventLoop->fired);
            zfree(eventLoop);
        }
        return NULL;
    }

Here setsize, the maximum number of file descriptors, determines the size of both eventLoop->events and eventLoop->fired, so that a file descriptor can be used directly as an index into events.
The memory cost is at most sizeof(aeFiredEvent) + sizeof(aeFileEvent) = 40 bytes (on a 64-bit build) multiplied by the number of fds.
Trading that memory for O(1) lookup is worthwhile.
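The 40-byte figure can be checked with a small sketch. The structs below mirror the field layout of aeFileEvent and aeFiredEvent but are local stand-ins with hypothetical names (`file_event`, `fired_event`); the exact size depends on the platform's pointer size and padding, so 40 bytes is an assumption that holds on typical 64-bit builds.

```c
#include <assert.h>
#include <stddef.h>

typedef void handler_fn(void);   /* placeholder handler type */

/* Same field layout as aeFileEvent: one int plus three pointers. */
typedef struct {
    int mask;
    handler_fn *rproc;
    handler_fn *wproc;
    void *client_data;
} file_event;

/* Same field layout as aeFiredEvent: two ints. */
typedef struct {
    int fd;
    int mask;
} fired_event;

/* Bytes needed per tracked file descriptor. */
size_t per_fd_bytes(void) {
    return sizeof(file_event) + sizeof(fired_event);
}

/* Total bytes for both arrays at a given setsize. */
size_t table_bytes(size_t setsize) {
    assert(setsize > 0);
    return setsize * per_fd_bytes();
}
```

With 8-byte pointers and 4-byte ints, `per_fd_bytes()` is 32 + 8 = 40, so a setsize of 10,000 costs about 400,000 bytes for constant-time lookup by fd.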


2. aeCreateFileEvent

Creating a file event requires passing in a handler for it; that callback is invoked when the event occurs.
The aeFileEvent structure is designed to tie together the event source (the fd), the event types, and the event handlers.

    /* Add events to watch; if the fd already has events registered,
     * merge the old events in and modify the registration */
    static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask) {
        aeApiState *state = eventLoop->apidata;
        struct epoll_event ee = {0}; /* avoid valgrind warning */
        /* If the fd was already monitored for some event, we need a MOD
         * operation. Otherwise we need an ADD operation. */
        /* Check whether the fd already has a registration */
        int op = eventLoop->events[fd].mask == AE_NONE ?
                EPOLL_CTL_ADD : EPOLL_CTL_MOD;

        ee.events = 0;
        mask |= eventLoop->events[fd].mask; /* Merge old events */
        if (mask & AE_READABLE) ee.events |= EPOLLIN;
        if (mask & AE_WRITABLE) ee.events |= EPOLLOUT;
        ee.data.fd = fd;
        if (epoll_ctl(state->epfd,op,fd,&ee) == -1) return -1;
        return 0;
    }

    /* Stop watching the specified events */
    static void aeApiDelEvent(aeEventLoop *eventLoop, int fd, int delmask) {
        aeApiState *state = eventLoop->apidata;
        struct epoll_event ee = {0}; /* avoid valgrind warning */
        int mask = eventLoop->events[fd].mask & (~delmask);

        ee.events = 0;
        if (mask & AE_READABLE) ee.events |= EPOLLIN;
        if (mask & AE_WRITABLE) ee.events |= EPOLLOUT;
        ee.data.fd = fd;
        if (mask != AE_NONE) {
            epoll_ctl(state->epfd,EPOLL_CTL_MOD,fd,&ee);
        } else {
            /* Note, Kernel < 2.6.9 requires a non null event pointer even for
             * EPOLL_CTL_DEL. */
            epoll_ctl(state->epfd,EPOLL_CTL_DEL,fd,&ee);
        }
    }

    /* Create a file event and register it in the event loop */
    int aeCreateFileEvent(aeEventLoop *eventLoop, int fd, int mask,
            aeFileProc *proc, void *clientData)
    {
        if (fd >= eventLoop->setsize) {
            errno = ERANGE;
            return AE_ERR;
        }
        /* Index by fd to get the file event; the same direct index is
         * used later when demultiplexing events */
        aeFileEvent *fe = &eventLoop->events[fd];

        /* Add the event to the loop, or modify the existing registration
         * (keeping the old events) */
        if (aeApiAddEvent(eventLoop, fd, mask) == -1)
            return AE_ERR;
        fe->mask |= mask;
        /* Store the handler in the matching slot */
        if (mask & AE_READABLE) fe->rfileProc = proc;
        if (mask & AE_WRITABLE) fe->wfileProc = proc;
        /* Data that will be passed to the handler */
        fe->clientData = clientData;
        if (fd > eventLoop->maxfd)
            eventLoop->maxfd = fd;
        return AE_OK;
    }
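The choice between ADD, MOD, and DEL in the two backend functions depends only on the masks involved, so it can be sketched without touching epoll at all. The constants and the `plan_add` / `plan_del` helpers below are hypothetical names for illustration; they model the decision, not the actual epoll_ctl call.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for AE_NONE, AE_READABLE, AE_WRITABLE. */
enum { EV_NONE = 0, EV_READABLE = 1, EV_WRITABLE = 2 };

typedef enum { OP_ADD, OP_MOD, OP_DEL } reg_op;

/* Registering: ADD when nothing is registered for the fd yet,
 * otherwise MOD with the old and new masks merged. */
reg_op plan_add(int old_mask, int add_mask, int *merged) {
    assert(merged != NULL);
    *merged = old_mask | add_mask;
    return old_mask == EV_NONE ? OP_ADD : OP_MOD;
}

/* Unregistering: MOD while some interest remains,
 * DEL once the mask drops to nothing. */
reg_op plan_del(int old_mask, int del_mask, int *remaining) {
    assert(remaining != NULL);
    *remaining = old_mask & ~del_mask;
    return *remaining == EV_NONE ? OP_DEL : OP_MOD;
}
```

Merging matters because epoll keeps one registration per fd: registering write interest on an fd that is already watched for reads must not drop the read interest.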

3. aeProcessEvents

This is the core part: epoll_wait demultiplexes the ready events and saves them in fired. The statement
    aeFileEvent *fe = &eventLoop->events[eventLoop->fired[j].fd];
maps the fd that triggered the event directly to the structure associated with it, which implements event dispatch.
The core of a reactor is exactly this demultiplexing and dispatching of events.
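The demultiplex-then-dispatch idea can be tried in isolation. The following is a minimal Linux-only sketch, not Redis code: it registers the read end of a pipe with epoll, writes four bytes, and dispatches through a table indexed by fd. The names `slot`, `table`, `count_bytes`, and `demo_dispatch` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

#define SETSIZE 64

typedef void handler_fn(int fd, void *client_data);

typedef struct {
    handler_fn *on_readable;
    void *client_data;
} slot;

static slot table[SETSIZE];   /* dispatch table indexed directly by fd */

/* Read handler: adds the number of bytes read to *client_data. */
void count_bytes(int fd, void *client_data) {
    char buf[16];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0) *(int *)client_data += (int)n;
}

/* Returns the number of fired events, or -1 on error. */
int demo_dispatch(int *bytes_seen) {
    assert(bytes_seen != NULL);
    struct epoll_event ee = {0};
    struct epoll_event fired[SETSIZE];
    int p[2], epfd = -1, n = -1;

    if (pipe(p) == -1) return -1;
    if (p[0] >= SETSIZE) goto out;            /* fd must fit the table */
    if ((epfd = epoll_create(1024)) == -1) goto out;

    table[p[0]].on_readable = count_bytes;
    table[p[0]].client_data = bytes_seen;

    ee.events = EPOLLIN;
    ee.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ee) == -1) goto out;
    if (write(p[1], "ping", 4) != 4) goto out;

    /* Demultiplex: ask the kernel which fds are ready (1 s timeout). */
    n = epoll_wait(epfd, fired, SETSIZE, 1000);
    /* Dispatch: the fd is the index of its handler slot. */
    for (int j = 0; j < n; j++) {
        int fd = fired[j].data.fd;
        slot *s = &table[fd];
        if ((fired[j].events & EPOLLIN) && s->on_readable)
            s->on_readable(fd, s->client_data);
    }
out:
    if (epfd != -1) close(epfd);
    close(p[0]);
    close(p[1]);
    return n;
}
```

Because the handlers run one after another inside the loop, a slow `on_readable` would delay every other ready fd, which is the serial behavior described earlier.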

    static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) {
        aeApiState *state = eventLoop->apidata;
        int retval, numevents = 0;

        /* Wait for events to arrive */
        retval = epoll_wait(state->epfd,state->events,eventLoop->setsize,
                tvp ? (tvp->tv_sec*1000 + tvp->tv_usec/1000) : -1);
        if (retval > 0) {
            int j;

            numevents = retval;
            for (j = 0; j < numevents; j++) {
                int mask = 0;
                struct epoll_event *e = state->events+j;

                if (e->events & EPOLLIN) mask |= AE_READABLE;
                if (e->events & EPOLLOUT) mask |= AE_WRITABLE;
                if (e->events & EPOLLERR) mask |= AE_WRITABLE;
                if (e->events & EPOLLHUP) mask |= AE_WRITABLE;
                /* Record the triggered events in the fired array */
                eventLoop->fired[j].fd = e->data.fd;
                eventLoop->fired[j].mask = mask;
            }
        }
        return numevents;
    }

    /* Event processing */
    int aeProcessEvents(aeEventLoop *eventLoop, int flags)
    {
        int processed = 0, numevents;

        /* If no flag is set, return immediately */
        /* Nothing to do? return ASAP */
        if (!(flags & AE_TIME_EVENTS) && !(flags & AE_FILE_EVENTS)) return 0;

        /* Enter if there are file events, or if time events are requested
         * and the AE_DONT_WAIT flag is not set */
        /* Note that we want call select() even if there are no
         * file events to process as long as we want to process time
         * events, in order to sleep until the next time event is ready
         * to fire. */
        if (eventLoop->maxfd != -1 ||
            ((flags & AE_TIME_EVENTS) && !(flags & AE_DONT_WAIT))) {
            int j;
            aeTimeEvent *shortest = NULL;
            struct timeval tv, *tvp;

            if (flags & AE_TIME_EVENTS && !(flags & AE_DONT_WAIT))
                /* Find the time event that is due first */
                shortest = aeSearchNearestTimer(eventLoop);
            if (shortest) {
                long now_sec, now_ms;

                aeGetTime(&now_sec, &now_ms);
                tvp = &tv;

                /* How many milliseconds we need to wait for the next
                 * time event to fire? */
                long long ms =
                    (shortest->when_sec - now_sec)*1000 +
                    shortest->when_ms - now_ms;

                /* The gap between the earliest time event and now is the
                 * epoll wait time */
                if (ms > 0) {
                    tvp->tv_sec = ms/1000;
                    tvp->tv_usec = (ms % 1000)*1000;
                } else {
                    tvp->tv_sec = 0;
                    tvp->tv_usec = 0;
                }
            } else {
                /* If we have to check for events but need to return
                 * ASAP because of AE_DONT_WAIT we need to set the timeout
                 * to zero */
                if (flags & AE_DONT_WAIT) {
                    tv.tv_sec = tv.tv_usec = 0;
                    tvp = &tv;
                } else {
                    /* With no time event we can block; but if a timer event
                     * is added in the meantime, when do we wake up? */
                    /* Otherwise we can block */
                    tvp = NULL; /* wait forever */
                }
            }

            numevents = aeApiPoll(eventLoop, tvp);
            for (j = 0; j < numevents; j++) {
                aeFileEvent *fe = &eventLoop->events[eventLoop->fired[j].fd];
                int mask = eventLoop->fired[j].mask;
                int fd = eventLoop->fired[j].fd;
                int rfired = 0;

                /* note the fe->mask & mask & ... code: maybe an already processed
                 * event removed an element that fired and we still didn't
                 * processed, so we check if the event is still valid. */
                if (fe->mask & mask & AE_READABLE) {
                    rfired = 1;
                    fe->rfileProc(eventLoop,fd,fe->clientData,mask);
                }
                if (fe->mask & mask & AE_WRITABLE) {
                    /* This check avoids calling the same handler twice */
                    if (!rfired || fe->wfileProc != fe->rfileProc)
                        fe->wfileProc(eventLoop,fd,fe->clientData,mask);
                }
                processed++;
            }
        }
        /* Check time events */
        if (flags & AE_TIME_EVENTS)
            processed += processTimeEvents(eventLoop);

        return processed; /* return the number of processed file/time events */
    }
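The timeout arithmetic in aeProcessEvents and aeApiPoll is easy to get wrong by a factor of 1000, so here is the same calculation as a standalone sketch. The helper names `timeout_until` and `timeval_to_epoll_ms` are hypothetical; the formulas follow the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Time left until a timer due at (when_sec, when_ms), seen from
 * (now_sec, now_ms). Overdue timers give a zero timeout. */
struct timeval timeout_until(long when_sec, long when_ms,
                             long now_sec, long now_ms) {
    assert(when_ms >= 0 && when_ms < 1000);
    assert(now_ms >= 0 && now_ms < 1000);
    struct timeval tv;
    long long ms = (long long)(when_sec - now_sec) * 1000 + when_ms - now_ms;

    if (ms > 0) {
        tv.tv_sec = ms / 1000;
        tv.tv_usec = (ms % 1000) * 1000;   /* milliseconds to microseconds */
    } else {
        tv.tv_sec = 0;
        tv.tv_usec = 0;
    }
    return tv;
}

/* epoll_wait takes milliseconds; a NULL timeval means block forever (-1). */
int timeval_to_epoll_ms(const struct timeval *tvp) {
    return tvp ? (int)(tvp->tv_sec * 1000 + tvp->tv_usec / 1000) : -1;
}
```

For a timer due at 12 s + 250 ms when the clock reads 10 s + 900 ms, the gap is 2000 + 250 - 900 = 1350 ms, which becomes tv_sec = 1 and tv_usec = 350000 and is handed to epoll_wait as 1350.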

In summary:
1. Reactor pattern; events are processed serially.
2. Timed events are supported, but there should not be too many of them: they are kept in a linked list, so finding the nearest one has O(N) complexity.
3. For each fired fd, the read event is processed before the write event.
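The O(N) cost in point 2 comes from scanning an unsorted linked list for the earliest due time. A minimal sketch of such a scan, with hypothetical names (`tnode`, `push_front`, `nearest_timer`) rather than the actual ae functions:

```c
#include <assert.h>
#include <stddef.h>

typedef struct tnode {
    long when_sec;
    long when_ms;
    struct tnode *next;
} tnode;

/* Insert at the head: O(1), but the list stays unsorted. */
tnode *push_front(tnode *head, tnode *node) {
    assert(node != NULL);
    node->next = head;
    return node;
}

/* Walk the whole list and keep the earliest timer: O(N). */
tnode *nearest_timer(tnode *head) {
    tnode *nearest = NULL;
    for (tnode *t = head; t != NULL; t = t->next) {
        if (nearest == NULL ||
            t->when_sec < nearest->when_sec ||
            (t->when_sec == nearest->when_sec &&
             t->when_ms < nearest->when_ms))
            nearest = t;
    }
    return nearest;
}
```

Since this scan runs on every pass through the event loop, its cost grows with the number of timers, which is why the list approach only suits a handful of them.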

