On a Linux system, even a single processor must juggle many threads, signals, events, and so on. This calls for a core algorithm to schedule processes, and that algorithm is CFS (the Completely Fair Scheduler). The book Linux Kernel Development sums up CFS process selection in one sentence:
Run the process represented by the leftmost node of the rbtree.
A node of the self-balancing binary search tree known as the red-black tree (rbtree) stores the data of the process that should run next. Here we see a perfect application of the binary search tree; for the data structure itself, see Introduction to Algorithms, pages 174-182.
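To make "leftmost node" concrete, here is a minimal sketch of that walk using the kernel's rbtree API. This is an illustration only, not the actual CFS code: the real scheduler caches the leftmost node in cfs_rq->rb_leftmost rather than re-walking the tree on every pick, and the kernel already provides rb_first() for exactly this traversal.

    #include <linux/rbtree.h>

    /* Walk to the leftmost node of an rbtree; in CFS terms, the entity
     * with the smallest vruntime, i.e. the process to run next. */
    static struct rb_node *pick_leftmost(struct rb_root *root)
    {
            struct rb_node *node = root->rb_node;

            if (!node)
                    return NULL;        /* empty tree: nothing runnable */
            while (node->rb_left)       /* smaller keys always sit to the left */
                    node = node->rb_left;
            return node;                /* smallest key == leftmost node */
    }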
The main entry point for process scheduling is the schedule() function, defined in kernel/sched.c.
Let's first look at an example that drives process scheduling from a wait queue:
    DEFINE_WAIT(wait);                      /* declare a wait queue entry */
    add_wait_queue(q, &wait);               /* put our entry on the wait queue head q */
    while (!condition) {                    /* the awaited event has not occurred yet */
            /* Move the task from TASK_RUNNING (or whatever state it was in)
             * to TASK_INTERRUPTIBLE: a non-runnable sleep state that signals
             * and events can still wake. */
            prepare_to_wait(q, &wait, TASK_INTERRUPTIBLE);
            if (signal_pending(current))    /* a signal is pending for this task */
                    processingsignal();     /* handle the signal */
            schedule();                     /* run the next process from the red-black tree */
    }
    finish_wait(q, &wait);                  /* back to TASK_RUNNING; leave the wait queue */
We can read this code as follows. A task must wait for an event before it can proceed; how is that implemented? Blocking plus polling. Done naively, though, such a loop would hog the CPU. To avoid that, the loop combines a wait queue with a call to schedule(), so that while this task sleeps, other threads keep running. The other half of the pattern, sketched below, is whichever code makes the event true and wakes the queue.
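For completeness, here is the producer side of the pattern (a minimal sketch; q and condition are the same illustrative names used in the example above):

    /* Whoever produces the event: make the condition true, then wake
     * every task sleeping on the wait queue head q. */
    condition = 1;
    wake_up(q);     /* woken tasks re-check condition in their while loop */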
Now let's look at the structure of the schedule() function:
    asmlinkage void __sched schedule(void)  /* asmlinkage: arguments are passed on the stack */
    {
            struct task_struct *prev, *next;
            unsigned long *switch_count;
            struct rq *rq;
            int cpu;

            /* At the end of this function, need_resched() is checked; if it
             * returns true, control jumps back to this label. */
    need_resched:
            /* After preempt_disable(), the current process cannot be preempted. */
            preempt_disable();
            cpu = smp_processor_id();
            rq = cpu_rq(cpu);
            /* Note an rcu_sched quiescent state on this cpu. */
            rcu_sched_qs(cpu);
            /* prev points to the current task_struct. */
            prev = rq->curr;
            /* Take the address of the current task's involuntary context
             * switch count (nivcsw). */
            switch_count = &prev->nivcsw;

            /* kernel_flag is "the big kernel lock". This spinlock is taken and
             * released recursively by lock_kernel() and unlock_kernel(). It is
             * transparently dropped and reacquired over schedule(). It is used
             * to protect legacy code that hasn't been migrated to a proper
             * locking design yet.
             * In task_struct there is a member lock_depth, initialized to -1,
             * meaning the task holds no kernel lock; lock_depth >= 0 means it
             * owns the kernel lock. A task being switched away must not keep
             * holding the kernel lock, so schedule() calls
             * release_kernel_lock() to drop it. */
            release_kernel_lock(prev);
    need_resched_nonpreemptible:

            schedule_debug(prev);

            if (sched_feat(HRTICK))
                    hrtick_clear(rq);

            /* Take this runqueue's spinlock with local interrupts disabled. */
            raw_spin_lock_irq(&rq->lock);
            /* Update rq's clock; this calls sched_clock_cpu(). */
            update_rq_clock(rq);
            /* Clear TIF_NEED_RESCHED in the task's thread_info flags so it is
             * not immediately rescheduled: it is about to give up the cpu. */
            clear_tsk_need_resched(prev);

            if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
                    if (unlikely(signal_pending_state(prev->state, prev)))
                            prev->state = TASK_RUNNING;
                    else
                            deactivate_task(rq, prev, 1);
                    /* The task blocked voluntarily: count it in nvcsw instead. */
                    switch_count = &prev->nvcsw;
            }

            /* For non-SMP builds, pre_schedule() is empty. */
            pre_schedule(rq, prev);

            if (unlikely(!rq->nr_running))
                    idle_balance(cpu, rq);

            put_prev_task(rq, prev);
            next = pick_next_task(rq);

            if (likely(prev != next)) {
                    sched_info_switch(prev, next);
                    perf_event_task_sched_out(prev, next);

                    rq->nr_switches++;
                    rq->curr = next;
                    ++*switch_count;

                    context_switch(rq, prev, next); /* unlocks the rq */
                    /*
                     * The context switch might have flipped the stack from under
                     * us, hence refresh the local variables.
                     */
                    cpu = smp_processor_id();
                    rq = cpu_rq(cpu);
            } else
                    raw_spin_unlock_irq(&rq->lock); /* the current task keeps the cpu */

            post_schedule(rq);

            if (unlikely(reacquire_kernel_lock(current) < 0)) {
                    prev = rq->curr;
                    switch_count = &prev->nivcsw;
                    goto need_resched_nonpreemptible;
            }

            preempt_enable_no_resched();
            if (need_resched())
                    goto need_resched;
    }
    EXPORT_SYMBOL(schedule);
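The heavy lifting inside context_switch() reduces to two steps: switch the memory context, then switch the CPU state. Below is a simplified skeleton of that function from the same era of kernel/sched.c; details such as the lazy-TLB handling of kernel threads (where next->mm is NULL) are deliberately omitted.

    static inline void
    context_switch(struct rq *rq, struct task_struct *prev,
                   struct task_struct *next)
    {
            prepare_task_switch(rq, prev, next);
            /* Switch the virtual memory context (page tables etc.)
             * from prev's address space to next's. */
            switch_mm(prev->active_mm, next->mm, next);
            /* Switch register state and the kernel stack; when prev is
             * scheduled back in, it resumes right after this line. */
            switch_to(prev, next, prev);
            barrier();
            finish_task_switch(this_rq(), prev);
    }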
The purpose of schedule() is to replace the currently running process with another one. Its main job is therefore to set the variable next so that it points to the descriptor of the process chosen to replace current. If no runnable process in the system has a higher priority than current, next ends up equal to current and no process switch takes place.
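Where does next come from? pick_next_task() asks each scheduling class, from highest priority down, for a candidate, with a fast path straight to the fair class when only CFS tasks are runnable. Roughly, paraphrased from kernel/sched.c of the same 2.6.3x era:

    static inline struct task_struct *pick_next_task(struct rq *rq)
    {
            const struct sched_class *class;
            struct task_struct *p;

            /* Fast path: if every runnable task on this rq is a CFS task,
             * ask the fair class directly -- this is where the rbtree's
             * leftmost node gets picked. */
            if (likely(rq->nr_running == rq->cfs.nr_running)) {
                    p = fair_sched_class.pick_next_task(rq);
                    if (likely(p))
                            return p;
            }

            /* Otherwise walk the classes from highest priority down
             * (rt, fair, idle); the idle class always returns a task,
             * so the loop terminates. */
            class = sched_class_highest;
            for (;;) {
                    p = class->pick_next_task(rq);
                    if (p)
                            return p;
                    class = class->next;
            }
    }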
References
[1] Understanding the Linux Kernel, p. 276.
[2] Linux Kernel Development, p. 52.
[3] http://hi.baidu.com/zengzhaonong/item/20d9e8207b04cb8f6e2cc323