(Repost) Understanding the Process Scheduling Function schedule()


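On Linux, even a single processor has to handle multiple threads, signals, events, and so on. This requires a core algorithm for process scheduling, and that algorithm is CFS (Completely Fair Scheduler). The book Linux Kernel Development sums up CFS process selection in one sentence: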

Run the process represented by the leftmost node in the rbtree.

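The nodes of the rbtree, a self-balancing binary search tree (red-black tree), hold the data for the process that should run next. This is a textbook application of binary search trees. For details, see Introduction to Algorithms, pp. 174-182.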
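The "leftmost node" rule can be sketched in user-space C. This is a minimal illustration, not kernel code: the `toy_entity` type and the `toy_insert` and `toy_pick_next` helpers are hypothetical names, and a plain unbalanced binary search tree stands in for the kernel's red-black tree (balancing changes the shape of the tree, not which node is leftmost). The sort key is assumed to be the virtual runtime that CFS tracks per task, a detail the text above does not spell out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a schedulable entity. */
struct toy_entity {
    unsigned long long vruntime;      /* sort key (assumed: virtual runtime) */
    int pid;
    struct toy_entity *left, *right;
};

/* Insert node into the tree rooted at root and return the new root.
 * No rebalancing is done; a real red-black tree would recolor and rotate. */
static struct toy_entity *toy_insert(struct toy_entity *root,
                                     struct toy_entity *node)
{
    struct toy_entity **link = &root;

    assert(node != NULL);
    node->left = node->right = NULL;
    while (*link)
        link = (node->vruntime < (*link)->vruntime) ? &(*link)->left
                                                    : &(*link)->right;
    *link = node;
    return root;
}

/* Return the leftmost node (smallest key), or NULL for an empty tree. */
static struct toy_entity *toy_pick_next(struct toy_entity *root)
{
    if (root == NULL)
        return NULL;
    while (root->left)
        root = root->left;
    return root;
}
```

Inserting entities with keys 30, 10, and 20 and then calling `toy_pick_next()` returns the entity whose key is 10, regardless of insertion order.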

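The main entry point for process scheduling is schedule(). It is defined in the file kernel/sched.c.

Let's first look at an example of process scheduling on a wait queue: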

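    /* q is the wait queue head (wait_queue_head_t *) we want to sleep on */
    DEFINE_WAIT(wait);                  /* declare a wait queue entry for the current process */

    add_wait_queue(q, &wait);           /* add our entry, wait, to the wait queue q */
    while (!condition) {                /* loop until the awaited event has occurred */
        /* Move the current process from TASK_RUNNING (or another state)
         * to TASK_INTERRUPTIBLE: a sleeping, non-runnable state from
         * which either a signal or the event can wake it. */
        prepare_to_wait(q, &wait, TASK_INTERRUPTIBLE);
        if (signal_pending(current)) {  /* a signal is pending for this process */
            processingsignal();         /* placeholder: handle the signal */
        }
        schedule();                     /* give up the CPU; the scheduler picks the next process */
    }
    finish_wait(q, &wait);              /* set the process to TASK_RUNNING and remove it from the wait queue */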

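One way to read this code: a task must wait for an event before it can continue. How can that be implemented? The naive answer is to block and poll the condition in a loop, but such a loop would monopolize the whole operating system. To solve this, a wait queue and a call to schedule() are added inside the polling loop, so that other threads are not held up.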
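The state changes behind this pattern can be modeled in plain user-space C. The sketch below is only a toy: every `wq_` name is hypothetical, there is no real sleeping or concurrency, and the kernel's own wait queue implementation differs (for example, this toy leaves woken tasks on the queue until they call `wq_finish_wait()`). It shows just the bookkeeping: a task marks itself as sleeping and joins the queue, a wakeup marks it runnable again, and a scheduler would skip it in between.

```c
#include <assert.h>
#include <stddef.h>

enum wq_state { WQ_RUNNING, WQ_INTERRUPTIBLE };

/* Hypothetical, simplified task and wait queue types. */
struct wq_task {
    enum wq_state state;
    struct wq_task *next_waiter;   /* link inside the wait queue */
    int queued;                    /* 1 while the task is on a queue */
};

struct wq_queue {
    struct wq_task *head;
};

/* Put the task on the queue (once) and mark it as sleeping. */
static void wq_prepare_to_wait(struct wq_queue *q, struct wq_task *t)
{
    assert(q != NULL && t != NULL);
    if (!t->queued) {
        t->next_waiter = q->head;
        q->head = t;
        t->queued = 1;
    }
    t->state = WQ_INTERRUPTIBLE;
}

/* Mark every queued task runnable again (toy model: tasks stay queued). */
static void wq_wake_up(struct wq_queue *q)
{
    for (struct wq_task *t = q->head; t != NULL; t = t->next_waiter)
        t->state = WQ_RUNNING;
}

/* Mark the task runnable and unlink it from the queue. */
static void wq_finish_wait(struct wq_queue *q, struct wq_task *t)
{
    struct wq_task **link = &q->head;

    t->state = WQ_RUNNING;
    while (*link != NULL && *link != t)
        link = &(*link)->next_waiter;
    if (*link != NULL) {
        *link = t->next_waiter;
        t->next_waiter = NULL;
        t->queued = 0;
    }
}

/* A scheduler would only consider tasks for which this returns nonzero. */
static int wq_is_runnable(const struct wq_task *t)
{
    return t->state == WQ_RUNNING;
}
```

Between `wq_prepare_to_wait()` and `wq_wake_up()` the task is not runnable, which is the property that lets other work proceed instead of a busy polling loop holding the CPU.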

Now let's look at the structure of the schedule() function:

    asmlinkage void __sched schedule(void)  /* asmlinkage: arguments are passed on the stack */
    {
        struct task_struct *prev, *next;
        unsigned long *switch_count;
        struct rq *rq;
        int cpu;

        /* If need_resched() returns true at the end of this function,
         * control jumps back to this label. */
    need_resched:
        /* The current process cannot be preempted after preempt_disable(). */
        preempt_disable();
        cpu = smp_processor_id();
        rq = cpu_rq(cpu);
        /* Report a quiescent state for RCU-sched on this CPU. */
        rcu_sched_qs(cpu);
        /* prev points to the current task_struct. */
        prev = rq->curr;
        /* Start with the task's involuntary context switch counter. */
        switch_count = &prev->nivcsw;

        /* kernel_flag is "the big kernel lock".
         * This spinlock is taken and released recursively by lock_kernel()
         * and unlock_kernel(). It is transparently dropped and reacquired
         * over schedule(). It is used to protect legacy code that hasn't
         * been migrated to a proper locking design yet.
         * task_struct has a member lock_depth, initialized to -1, which
         * indicates that the task does not hold the kernel lock.
         * lock_depth >= 0 indicates that the task holds the kernel lock.
         * A task that is switched away must not keep holding the kernel
         * lock, so schedule() calls release_kernel_lock() to drop it.
         */
        release_kernel_lock(prev);
    need_resched_nonpreemptible:

        schedule_debug(prev);

        if (sched_feat(HRTICK))
            hrtick_clear(rq);

        /* Take this run queue's spinlock, with interrupts disabled. */
        raw_spin_lock_irq(&rq->lock);
        /* Update the run queue's clock; this calls sched_clock_cpu(). */
        update_rq_clock(rq);
        /* Clear the TIF_NEED_RESCHED bit in the task's thread_info flags.
         * The task is about to give up the CPU, so it must not be marked
         * for rescheduling again.
         */
        clear_tsk_need_resched(prev);

        if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
            if (unlikely(signal_pending_state(prev->state, prev)))
                prev->state = TASK_RUNNING;
            else
                deactivate_task(rq, prev, 1);
            switch_count = &prev->nvcsw;
        }

        /* On non-SMP builds, pre_schedule() does nothing. */
        pre_schedule(rq, prev);

        if (unlikely(!rq->nr_running))
            idle_balance(cpu, rq);

        put_prev_task(rq, prev);
        next = pick_next_task(rq);

        if (likely(prev != next)) {
            sched_info_switch(prev, next);
            perf_event_task_sched_out(prev, next);

            rq->nr_switches++;
            rq->curr = next;
            ++*switch_count;

            context_switch(rq, prev, next); /* unlocks the rq */
            /*
             * the context switch might have flipped the stack from under
             * us, hence refresh the local variables.
             */
            cpu = smp_processor_id();
            rq = cpu_rq(cpu);
        } else
            raw_spin_unlock_irq(&rq->lock); /* the current task keeps the CPU */

        post_schedule(rq);

        if (unlikely(reacquire_kernel_lock(current) < 0)) {
            prev = rq->curr;
            switch_count = &prev->nivcsw;
            goto need_resched_nonpreemptible;
        }

        preempt_enable_no_resched();
        if (need_resched())
            goto need_resched;
    }
    EXPORT_SYMBOL(schedule);

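The purpose of schedule() is to replace the currently running process with another one. The key outcome of the function is therefore to set a variable named next so that it points to the descriptor of the process selected to replace current. If no runnable process in the system has a priority greater than that of current, next ends up equal to current and no process switch takes place.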
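The decision described here, together with the switch_count handling in the listing, can be imitated with a small user-space model. Everything below is a hedged sketch: the `sim_` names are hypothetical, the "first runnable task in the array" policy is a placeholder for the real pick_next_task(), and the caller is assumed to keep at least one task runnable (playing the role of the idle task).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task descriptor. */
struct sim_task {
    int pid;
    int runnable;            /* 0 means the task has gone to sleep */
    unsigned long nvcsw;     /* voluntary context switches */
    unsigned long nivcsw;    /* involuntary context switches */
};

/* Placeholder policy: the first runnable task in the array, or NULL. */
static struct sim_task *sim_pick_next(struct sim_task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tasks[i].runnable)
            return &tasks[i];
    return NULL;
}

/* Return the task that should run after prev. */
static struct sim_task *sim_schedule(struct sim_task *prev,
                                     struct sim_task *tasks, size_t n)
{
    unsigned long *switch_count;
    struct sim_task *next;

    assert(prev != NULL && tasks != NULL);

    /* A task that is still runnable is being preempted (involuntary);
     * a task that went to sleep is giving up the CPU (voluntary). */
    switch_count = prev->runnable ? &prev->nivcsw : &prev->nvcsw;

    next = sim_pick_next(tasks, n);
    assert(next != NULL);    /* assumption: something is always runnable */

    if (next == prev)
        return prev;         /* next equals current: no switch happens */

    ++*switch_count;         /* count the switch on the outgoing task */
    return next;
}
```

With two runnable tasks, calling `sim_schedule()` on the first one returns that same task and leaves both counters untouched; once that task is marked as sleeping, the call returns the second task and increments the first task's `nvcsw`.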

References

[1] Understanding the Linux Kernel, p. 276.

[2] Linux Kernel Development, p. 52.

[3] http://hi.baidu.com/zengzhaonong/item/20d9e8207b04cb8f6e2cc323
