Redis Source Code: The Event Library


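I have read many walkthroughs of the Redis event library online and studied it several times myself. My understanding is still limited, but writing it down is a way to make progress.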

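The network event library wraps the epoll operations (epoll being the I/O multiplexing mechanism on Linux) and also implements a timer. Timers are a cornerstone of server programs, and many problems are solved with them.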

(1) Data structures plus algorithms make up a complete program, so a look at the Redis network library should start with its data structures.

1. The whole event loop is described by a single top-level data structure, aeEventLoop

/* State of an event based program */
typedef struct aeEventLoop {
    int maxfd;   /* highest file descriptor currently registered */
    int setsize; /* max number of file descriptors tracked */
    long long timeEventNextId;
    time_t lastTime;     /* Used to detect system clock skew */
    aeFileEvent *events; /* Registered events */
    aeFiredEvent *fired; /* Fired events */
    aeTimeEvent *timeEventHead;
    int stop;
    void *apidata; /* This is used for polling API specific data */
    aeBeforeSleepProc *beforesleep;
} aeEventLoop;

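maxfd: the highest file descriptor currently registered.

setsize: the maximum number of file descriptors tracked. This is also the size of both the file event array and the fired event array, and every operation on an fd must be bounds-checked against it.

timeEventNextId: every time event that is added gets an ID. The time event list is not ordered, but the ID only ever increases; this field holds the next ID to hand out.

lastTime: the time of the previous time-event pass, used to detect system clock skew (the clock jumping backwards).

events and fired: the registered file events and the fired (ready) events.

stop: the flag that ends the main loop.

apidata: backend-specific data. In the epoll backend it is this structure: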

typedef struct aeApiState {
    int epfd;
    struct epoll_event *events;
} aeApiState;

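beforesleep: a callback run on every iteration of the main loop, before the loop blocks waiting for events. It does a good deal of work, which will come up later.

2. File events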

/* File event structure */
typedef struct aeFileEvent {
    int mask; /* one of AE_(READABLE|WRITABLE) */
    aeFileProc *rfileProc;
    aeFileProc *wfileProc;
    void *clientData;
} aeFileEvent;

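As you can see, a file event holds the callbacks and the event mask for its fd. You might wonder why the fd itself is not stored: the events array is indexed by fd, and the fd of each ready event is recorded in the fired event structure.

3. Fired events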

/* A fired event */
typedef struct aeFiredEvent {
    int fd;
    int mask;
} aeFiredEvent;

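Each entry of the fired array records the fd that became ready. That fd is the index into the events array, so the matching callbacks can be found from it. Here is the relevant code first: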

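/* Simplified excerpt: the full loop in aeProcessEvents checks the mask before each call */
int fd = eventLoop->fired[j].fd;
int mask = eventLoop->fired[j].mask;
aeFileEvent *fe = &eventLoop->events[fd];
fe->rfileProc(eventLoop,fd,fe->clientData,mask);
fe->wfileProc(eventLoop,fd,fe->clientData,mask);
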
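The fd-indexed lookup can be tried in isolation. The following is a minimal sketch, not Redis code: every name prefixed with demo_ or DEMO_ is hypothetical, and the table is a fixed-size global purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_READABLE 1
#define DEMO_WRITABLE 2
#define DEMO_SETSIZE 16

typedef void demo_proc(int fd, int mask, void *client_data);

typedef struct {
    int mask;
    demo_proc *rproc;
    demo_proc *wproc;
    void *client_data;
} demo_file_event;

typedef struct {
    int fd;
    int mask;
} demo_fired_event;

/* Registered events, indexed by fd (hypothetical stand-in for eventLoop->events) */
demo_file_event demo_events[DEMO_SETSIZE];

/* Example callback: stores the fd it was called with into *client_data */
void demo_record_fd(int fd, int mask, void *client_data) {
    (void)mask;
    *(int *)client_data = fd;
}

/* Register a read handler; returns -1 when fd is outside the table */
int demo_register_read(int fd, demo_proc *proc, void *client_data) {
    if (fd < 0 || fd >= DEMO_SETSIZE) return -1;
    demo_events[fd].mask |= DEMO_READABLE;
    demo_events[fd].rproc = proc;
    demo_events[fd].client_data = client_data;
    return 0;
}

/* Dispatch one fired entry; returns 1 if a read handler ran, 0 otherwise */
int demo_dispatch(const demo_fired_event *fired) {
    assert(fired->fd >= 0 && fired->fd < DEMO_SETSIZE);
    demo_file_event *fe = &demo_events[fired->fd]; /* the fd is the index */
    if (fe->mask & fired->mask & DEMO_READABLE) {
        fe->rproc(fired->fd, fired->mask, fe->client_data);
        return 1;
    }
    return 0;
}
```

Because the fired entry carries the fd and the table is indexed by fd, finding the callbacks is a single array access with no searching.
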
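4. Time events

I find the Redis implementation here less than ideal, and the comments in the source say as much: time events are kept in a linked list, so finding the nearest timer is O(n). I have heard that a min-heap could be used instead. For now, a brief look: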

/* Time event structure */
typedef struct aeTimeEvent {
    long long id; /* time event identifier. */
    long when_sec; /* seconds */
    long when_ms; /* milliseconds */
    aeTimeProc *timeProc;
    aeEventFinalizerProc *finalizerProc;
    void *clientData;
    struct aeTimeEvent *next;
} aeTimeEvent;
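To make the O(n) cost concrete, here is a minimal sketch of a nearest-timer search over an unsorted singly linked list. It is written for illustration only; demo_timer and both functions are hypothetical names, and the real search in Redis may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_timer {
    long when_sec;
    long when_ms;
    struct demo_timer *next;
} demo_timer;

/* Push node at the front of the list; an unsorted list makes insertion O(1) */
demo_timer *demo_timer_push(demo_timer *head, demo_timer *node) {
    assert(node != NULL);
    node->next = head;
    return node;
}

/* Return the timer that fires first, or NULL for an empty list.
 * Every node has to be visited, hence O(n). */
demo_timer *demo_nearest_timer(demo_timer *head) {
    demo_timer *nearest = NULL;
    for (demo_timer *te = head; te != NULL; te = te->next) {
        if (nearest == NULL ||
            te->when_sec < nearest->when_sec ||
            (te->when_sec == nearest->when_sec &&
             te->when_ms < nearest->when_ms)) {
            nearest = te;
        }
    }
    return nearest;
}
```

The trade-off is cheap insertion against a full scan on every loop iteration. A min-heap would make the nearest timer available at the root at the price of O(log n) insertion and removal.
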

That covers the data structures. Next comes the implementation, starting with the main loop:

void aeMain(aeEventLoop *eventLoop) {
    eventLoop->stop = 0;
    while (!eventLoop->stop) {
        if (eventLoop->beforesleep != NULL)
            eventLoop->beforesleep(eventLoop);
        aeProcessEvents(eventLoop, AE_ALL_EVENTS);
    }
}

All the logic is in aeProcessEvents:

int aeProcessEvents(aeEventLoop *eventLoop, int flags)
{
    int processed = 0, numevents;

    /* Nothing to do? return ASAP */
    if (!(flags & AE_TIME_EVENTS) && !(flags & AE_FILE_EVENTS)) return 0;

    /* Note that we want call select() even if there are no
     * file events to process as long as we want to process time
     * events, in order to sleep until the next time event is ready
     * to fire. */
    if (eventLoop->maxfd != -1 ||
        ((flags & AE_TIME_EVENTS) && !(flags & AE_DONT_WAIT))) {
        int j;
        aeTimeEvent *shortest = NULL;
        struct timeval tv, *tvp;

        if (flags & AE_TIME_EVENTS && !(flags & AE_DONT_WAIT))
            shortest = aeSearchNearestTimer(eventLoop);
        if (shortest) {
            long now_sec, now_ms;

            /* Calculate the time missing for the nearest
             * timer to fire. */
            aeGetTime(&now_sec, &now_ms);
            tvp = &tv;
            tvp->tv_sec = shortest->when_sec - now_sec;
            if (shortest->when_ms < now_ms) {
                tvp->tv_usec = ((shortest->when_ms+1000) - now_ms)*1000;
                tvp->tv_sec --;
            } else {
                tvp->tv_usec = (shortest->when_ms - now_ms)*1000;
            }
            if (tvp->tv_sec < 0) tvp->tv_sec = 0;
            if (tvp->tv_usec < 0) tvp->tv_usec = 0;
        } else {
            /* If we have to check for events but need to return
             * ASAP because of AE_DONT_WAIT we need to set the timeout
             * to zero */
            if (flags & AE_DONT_WAIT) {
                tv.tv_sec = tv.tv_usec = 0;
                tvp = &tv;
            } else {
                /* Otherwise we can block */
                tvp = NULL; /* wait forever */
            }
        }

        numevents = aeApiPoll(eventLoop, tvp);
        for (j = 0; j < numevents; j++) {
            aeFileEvent *fe = &eventLoop->events[eventLoop->fired[j].fd];
            int mask = eventLoop->fired[j].mask;
            int fd = eventLoop->fired[j].fd;
            int rfired = 0;

            /* note the fe->mask & mask & ... code: maybe an already processed
             * event removed an element that fired and we still didn't
             * processed, so we check if the event is still valid. */
            if (fe->mask & mask & AE_READABLE) {
                rfired = 1;
                fe->rfileProc(eventLoop,fd,fe->clientData,mask);
            }
            if (fe->mask & mask & AE_WRITABLE) {
                if (!rfired || fe->wfileProc != fe->rfileProc)
                    fe->wfileProc(eventLoop,fd,fe->clientData,mask);
            }
            processed++;
        }
    }
    /* Check time events */
    if (flags & AE_TIME_EVENTS)
        processed += processTimeEvents(eventLoop);

    return processed; /* return the number of processed file/time events */
}

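The code is a bit long, but it boils down to three steps:

1. Work out how long epoll_wait should wait, based on flags. There are several cases:

If time events are being processed and waiting is allowed, find the time event that expires soonest and wait at most until then. This is a neat strategy.

If AE_DONT_WAIT is set, return immediately.

Otherwise (there is no time event to wait for), block indefinitely until a file event fires.

2. Wait for events and handle them through their callbacks.

3. Process time events.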
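The timeout arithmetic in step 1 can be isolated into a small pure function. This is a sketch under assumptions: demo_time_until and demo_timeval_to_ms are hypothetical names, the current time is passed in rather than read from the clock, and unlike the listing this version zeroes both fields once the deadline has passed instead of clamping each field separately.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Time left until (when_sec, when_ms), measured from (now_sec, now_ms).
 * Returns zero when the deadline is already in the past. */
struct timeval demo_time_until(long when_sec, long when_ms,
                               long now_sec, long now_ms) {
    assert(when_ms >= 0 && when_ms < 1000);
    assert(now_ms >= 0 && now_ms < 1000);
    struct timeval tv;
    tv.tv_sec = when_sec - now_sec;
    if (when_ms < now_ms) {
        /* Borrow one second so the millisecond difference stays positive */
        tv.tv_usec = ((when_ms + 1000) - now_ms) * 1000;
        tv.tv_sec--;
    } else {
        tv.tv_usec = (when_ms - now_ms) * 1000;
    }
    if (tv.tv_sec < 0) {
        tv.tv_sec = 0;
        tv.tv_usec = 0;
    }
    return tv;
}

/* Convert to the millisecond timeout epoll_wait expects; NULL means block (-1) */
int demo_timeval_to_ms(const struct timeval *tvp) {
    return tvp ? (int)(tvp->tv_sec * 1000 + tvp->tv_usec / 1000) : -1;
}
```

For a timer due at 12.300 s and a current time of 10.800 s, the function returns 1 s and 500000 µs, which converts to a 1500 ms timeout.
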

aeApiPoll is just a wrapper around epoll_wait:

static int aeApiPoll(aeEventLoop *eventLoop, struct timeval *tvp) {
    aeApiState *state = eventLoop->apidata;
    int retval, numevents = 0;

    retval = epoll_wait(state->epfd,state->events,eventLoop->setsize,
            tvp ? (tvp->tv_sec*1000 + tvp->tv_usec/1000) : -1);
    if (retval > 0) {
        int j;

        numevents = retval;
        for (j = 0; j < numevents; j++) {
            int mask = 0;
            struct epoll_event *e = state->events+j;

            if (e->events & EPOLLIN) mask |= AE_READABLE;
            if (e->events & EPOLLOUT) mask |= AE_WRITABLE;
            if (e->events & EPOLLERR) mask |= AE_WRITABLE;
            if (e->events & EPOLLHUP) mask |= AE_WRITABLE;
            eventLoop->fired[j].fd = e->data.fd;
            eventLoop->fired[j].mask = mask;
        }
    }
    return numevents;
}
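The same poll-then-translate pattern can be exercised on a pipe. The sketch below is Linux-only and illustrative: the demo_ names and DEMO_ constants are hypothetical, and it registers just the read end of one pipe rather than a table of descriptors.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

#define DEMO_READABLE 1
#define DEMO_WRITABLE 2

/* Translate epoll event bits into a readable/writable mask, as the listing does */
int demo_epoll_to_mask(unsigned int events) {
    int mask = 0;
    if (events & EPOLLIN)  mask |= DEMO_READABLE;
    if (events & EPOLLOUT) mask |= DEMO_WRITABLE;
    if (events & EPOLLERR) mask |= DEMO_WRITABLE;
    if (events & EPOLLHUP) mask |= DEMO_WRITABLE;
    return mask;
}

/* Watch a pipe's read end and poll once without blocking.
 * Returns the translated mask (0 if nothing is ready) or -1 on error. */
int demo_poll_pipe(int write_first) {
    int fds[2], result = -1;
    int epfd = epoll_create(1024); /* the size is only a hint */
    if (epfd == -1) return -1;
    if (pipe(fds) == -1) { close(epfd); return -1; }

    struct epoll_event ee = {0}, fired;
    ee.events = EPOLLIN;
    ee.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ee) == 0 &&
        (!write_first || write(fds[1], "x", 1) == 1)) {
        int n = epoll_wait(epfd, &fired, 1, 0); /* timeout 0: return at once */
        if (n == 0) {
            result = 0;
        } else if (n == 1) {
            assert(fired.data.fd == fds[0]);
            result = demo_epoll_to_mask(fired.events);
        }
    }
    close(fds[0]);
    close(fds[1]);
    close(epfd);
    return result;
}
```

Mapping EPOLLERR and EPOLLHUP onto the writable bit means an error or hang-up still wakes a registered handler, which then discovers the problem when it tries the I/O.
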

Having seen the epoll_wait wrapper, the natural next step is the wrappers for epoll_ctl and epoll_create. The following creates the epoll instance:

static int aeApiCreate(aeEventLoop *eventLoop) {
    aeApiState *state = zmalloc(sizeof(aeApiState));

    if (!state) return -1;
    state->events = zmalloc(sizeof(struct epoll_event)*eventLoop->setsize);
    if (!state->events) {
        zfree(state);
        return -1;
    }
    state->epfd = epoll_create(1024); /* 1024 is just a hint for the kernel */
    if (state->epfd == -1) {
        zfree(state->events);
        zfree(state);
        return -1;
    }
    eventLoop->apidata = state;
    return 0;
}

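The epoll_ctl wrapper sits behind the interface the library exposes to callers, namely the functions that create and delete file events.

For example, creating a file event: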

int aeCreateFileEvent(aeEventLoop *eventLoop, int fd, int mask,
        aeFileProc *proc, void *clientData)
{
    if (fd >= eventLoop->setsize) {
        errno = ERANGE;
        return AE_ERR;
    }
    aeFileEvent *fe = &eventLoop->events[fd];

    if (aeApiAddEvent(eventLoop, fd, mask) == -1)
        return AE_ERR;
    fe->mask |= mask;
    if (mask & AE_READABLE) fe->rfileProc = proc;
    if (mask & AE_WRITABLE) fe->wfileProc = proc;
    fe->clientData = clientData;
    if (fd > eventLoop->maxfd)
        eventLoop->maxfd = fd;
    return AE_OK;
}

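This first checks fd against setsize, as mentioned earlier, then takes fe and calls aeApiAddEvent, which is where epoll_ctl is actually called, and then sets the fields of fe according to mask.

Finally it tidies up: maxfd may need to be updated.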

static int aeApiAddEvent(aeEventLoop *eventLoop, int fd, int mask) {
    aeApiState *state = eventLoop->apidata;
    struct epoll_event ee;
    /* If the fd was already monitored for some event, we need a MOD
     * operation. Otherwise we need an ADD operation. */
    int op = eventLoop->events[fd].mask == AE_NONE ?
            EPOLL_CTL_ADD : EPOLL_CTL_MOD;

    ee.events = 0;
    mask |= eventLoop->events[fd].mask; /* Merge old events */
    if (mask & AE_READABLE) ee.events |= EPOLLIN;
    if (mask & AE_WRITABLE) ee.events |= EPOLLOUT;
    ee.data.u64 = 0; /* avoid valgrind warning */
    ee.data.fd = fd;
    if (epoll_ctl(state->epfd,op,fd,&ee) == -1) return -1;
    return 0;
}

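It first checks whether fd is already registered: if not, it adds it; if so, it modifies the existing registration. All calls into the underlying system interface are made in the layer directly above it, ae_epoll, while the layer above that, ae, only maintains its own data structures. The layering is strict, which reflects well on the author.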
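The ADD-versus-MOD decision and the merging of old and new masks can be expressed as a pure function that is easy to check. This is a minimal sketch with hypothetical names (demo_ctl_request, demo_plan_add); it only computes the arguments and does not call epoll_ctl itself.

```c
#include <assert.h>
#include <sys/epoll.h>

#define DEMO_NONE 0
#define DEMO_READABLE 1
#define DEMO_WRITABLE 2

typedef struct {
    int op;              /* EPOLL_CTL_ADD or EPOLL_CTL_MOD */
    unsigned int events; /* epoll bits to register */
} demo_ctl_request;

/* Decide which epoll_ctl operation and event bits are needed when add_mask
 * is registered on an fd whose current mask is old_mask. */
demo_ctl_request demo_plan_add(int old_mask, int add_mask) {
    assert((add_mask & ~(DEMO_READABLE | DEMO_WRITABLE)) == 0);
    demo_ctl_request req;
    int merged = old_mask | add_mask; /* keep the events already watched */

    /* An fd epoll has never seen needs ADD; a known one needs MOD */
    req.op = (old_mask == DEMO_NONE) ? EPOLL_CTL_ADD : EPOLL_CTL_MOD;
    req.events = 0;
    if (merged & DEMO_READABLE) req.events |= EPOLLIN;
    if (merged & DEMO_WRITABLE) req.events |= EPOLLOUT;
    return req;
}
```

Merging matters because EPOLL_CTL_MOD replaces the whole event set for the fd, so a request that carried only the new bit would silently drop the old one.
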

Another example: deleting a file event

void aeDeleteFileEvent(aeEventLoop *eventLoop, int fd, int mask)
{
    if (fd >= eventLoop->setsize) return;
    aeFileEvent *fe = &eventLoop->events[fd];
    if (fe->mask == AE_NONE) return;

    aeApiDelEvent(eventLoop, fd, mask);
    fe->mask = fe->mask & (~mask);
    if (fd == eventLoop->maxfd && fe->mask == AE_NONE) {
        /* Update the max fd */
        int j;

        for (j = eventLoop->maxfd-1; j >= 0; j--)
            if (eventLoop->events[j].mask != AE_NONE) break;
        eventLoop->maxfd = j;
    }
}

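It first checks whether fd is registered and returns right away if it is not. Otherwise it removes the requested events from fd, updates fe, and finally may need to recompute maxfd.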
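The maxfd recomputation is a downward scan for the highest slot still in use. Here is a small sketch over a plain array of masks; demo_recompute_maxfd is a hypothetical helper, and it starts the scan at maxfd itself rather than maxfd-1, which gives the same result once the slot at maxfd has been cleared.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NONE 0

/* Return the highest fd <= maxfd whose mask is not DEMO_NONE,
 * or -1 when no fd is registered any more. */
int demo_recompute_maxfd(const int *masks, int maxfd) {
    assert(masks != NULL);
    int j;
    for (j = maxfd; j >= 0; j--)
        if (masks[j] != DEMO_NONE) break;
    return j; /* the loop leaves j at -1 when nothing was found */
}
```

The result of -1 for an empty table matches the maxfd != -1 test in aeProcessEvents, which uses that value to mean no file events are registered.
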

As for the implementation of aeApiDelEvent:

static void aeApiDelEvent(aeEventLoop *eventLoop, int fd, int delmask) {
    aeApiState *state = eventLoop->apidata;
    struct epoll_event ee;
    int mask = eventLoop->events[fd].mask & (~delmask);

    ee.events = 0;
    if (mask & AE_READABLE) ee.events |= EPOLLIN;
    if (mask & AE_WRITABLE) ee.events |= EPOLLOUT;
    ee.data.u64 = 0; /* avoid valgrind warning */
    ee.data.fd = fd;
    if (mask != AE_NONE) {
        epoll_ctl(state->epfd,EPOLL_CTL_MOD,fd,&ee);
    } else {
        /* Note, Kernel < 2.6.9 requires a non null event pointer even for
         * EPOLL_CTL_DEL. */
        epoll_ctl(state->epfd,EPOLL_CTL_DEL,fd,&ee);
    }
}

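As you can see, it either modifies the registration of fd or, when no events remain, removes fd from epoll altogether.


Processing time events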

static int processTimeEvents(aeEventLoop *eventLoop) {
    int processed = 0;
    aeTimeEvent *te;
    long long maxId;
    time_t now = time(NULL);

    if (now < eventLoop->lastTime) {
        te = eventLoop->timeEventHead;
        while(te) {
            te->when_sec = 0;
            te = te->next;
        }
    }
    eventLoop->lastTime = now;

    te = eventLoop->timeEventHead;
    maxId = eventLoop->timeEventNextId-1;
    while(te) {
        long now_sec, now_ms;
        long long id;

        if (te->id > maxId) {
            te = te->next;
            continue;
        }
        aeGetTime(&now_sec, &now_ms);
        if (now_sec > te->when_sec ||
            (now_sec == te->when_sec && now_ms >= te->when_ms))
        {
            int retval;

            id = te->id;
            retval = te->timeProc(eventLoop, id, te->clientData);
            processed++;
            if (retval != AE_NOMORE) {
                aeAddMillisecondsToNow(retval,&te->when_sec,&te->when_ms);
            } else {
                aeDeleteTimeEvent(eventLoop, id);
            }
            te = eventLoop->timeEventHead;
        } else {
            te = te->next;
        }
    }
    return processed;
}

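In essence it walks the list and runs every time event that is due. After the callback runs, the event is rescheduled if the callback returns an interval in milliseconds, or deleted if it returns AE_NOMORE. The deletion happens in aeDeleteTimeEvent, shown here: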
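The two pieces of arithmetic behind this step, testing whether a timer is due and pushing its deadline forward, can be sketched as pure functions. The names demo_is_due and demo_add_ms are hypothetical; the listing calls aeAddMillisecondsToNow, whose body is not shown here, so the rescheduling below is an assumption about the general idea, with the current time passed in explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* True once (now_sec, now_ms) has reached (when_sec, when_ms);
 * the same comparison processTimeEvents uses. */
int demo_is_due(long now_sec, long now_ms, long when_sec, long when_ms) {
    return now_sec > when_sec ||
           (now_sec == when_sec && now_ms >= when_ms);
}

/* Compute now + delta_ms as a (seconds, milliseconds) pair */
void demo_add_ms(long now_sec, long now_ms, long long delta_ms,
                 long *out_sec, long *out_ms) {
    assert(delta_ms >= 0);
    assert(now_ms >= 0 && now_ms < 1000);
    long when_sec = now_sec + (long)(delta_ms / 1000);
    long when_ms = now_ms + (long)(delta_ms % 1000);
    if (when_ms >= 1000) { /* carry the overflow into the seconds field */
        when_sec++;
        when_ms -= 1000;
    }
    *out_sec = when_sec;
    *out_ms = when_ms;
}
```

A callback that returns 1500 at time 10.800 s would thus be scheduled again for 12.300 s, and it stays in the list rather than being freed.
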

int aeDeleteTimeEvent(aeEventLoop *eventLoop, long long id)
{
    aeTimeEvent *te, *prev = NULL;

    te = eventLoop->timeEventHead;
    while(te) {
        if (te->id == id) {
            if (prev == NULL)
                eventLoop->timeEventHead = te->next;
            else
                prev->next = te->next;
            if (te->finalizerProc)
                te->finalizerProc(eventLoop, te->clientData);
            zfree(te);
            return AE_OK;
        }
        prev = te;
        te = te->next;
    }
    return AE_ERR; /* NO event with the specified ID found */
}

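That is as far as this analysis goes. A few questions remain for a later reading of the source:

What exactly does beforesleep do?

What is the flow of the whole program, and how is the network connection handling tied into epoll?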






