On a Linux system, even a single processor must juggle many threads, signals, events, and so on. This calls for a core algorithm to schedule processes, and that algorithm is CFS (the Completely Fair Scheduler). The book Linux Kernel Development sums up CFS process selection in one sentence:
Run the process represented by the leftmost node of the rbtree.
A node of the self-balancing binary search tree known as the red-black tree (rbtree) stores the data of the process that should run next. Here we see a perfect application of the binary search tree; for the data structure itself, see Introduction to Algorithms, pages 174-182.
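To make "leftmost node" concrete, here is a minimal sketch of that walk using the kernel's rbtree API. This is an illustration only, not the actual CFS code: the real scheduler caches the leftmost node in cfs_rq->rb_leftmost rather than re-walking the tree on every pick, and the kernel already provides rb_first() for exactly this traversal.

    #include <linux/rbtree.h>

    /* Walk to the leftmost node of an rbtree; in CFS terms, the entity
     * with the smallest vruntime, i.e. the process to run next. */
    static struct rb_node *pick_leftmost(struct rb_root *root)
    {
            struct rb_node *node = root->rb_node;

            if (!node)
                    return NULL;        /* empty tree: nothing runnable */
            while (node->rb_left)       /* smaller keys always sit to the left */
                    node = node->rb_left;
            return node;                /* smallest key == leftmost node */
    }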
The main entry point for process scheduling is the schedule() function, defined in kernel/sched.c.
Let's first look at an example that drives process scheduling from a wait queue:
    DEFINE_WAIT(wait);                      /* declare a wait queue entry */
    add_wait_queue(q, &wait);               /* put our entry on the wait queue head q */
    while (!condition) {                    /* the awaited event has not occurred yet */
            /* Move the task from TASK_RUNNING (or whatever state it was in)
             * to TASK_INTERRUPTIBLE: a non-runnable sleep state that signals
             * and events can still wake. */
            prepare_to_wait(q, &wait, TASK_INTERRUPTIBLE);
            if (signal_pending(current))    /* a signal is pending for this task */
                    processingsignal();     /* handle the signal */
            schedule();                     /* run the next process from the red-black tree */
    }
    finish_wait(q, &wait);                  /* back to TASK_RUNNING; leave the wait queue */
We can read this code as follows. A task must wait for an event before it can proceed; how is that implemented? Blocking plus polling. Done naively, though, such a loop would hog the CPU. To avoid that, the loop combines a wait queue with a call to schedule(), so that while this task sleeps, other threads keep running. The other half of the pattern, sketched below, is whichever code makes the event true and wakes the queue.
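For completeness, here is the producer side of the pattern (a minimal sketch; q and condition are the same illustrative names used in the example above):

    /* Whoever produces the event: make the condition true, then wake
     * every task sleeping on the wait queue head q. */
    condition = 1;
    wake_up(q);     /* woken tasks re-check condition in their while loop */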
Now let's look at the structure of the schedule() function:
    asmlinkage void __sched schedule(void)  /* asmlinkage: arguments are passed on the stack */
    {
            struct task_struct *prev, *next;
            unsigned long *switch_count;
            struct rq *rq;
            int cpu;

            /* At the end of this function, need_resched() is checked; if it
             * returns true, control jumps back to this label. */
    need_resched:
            /* After preempt_disable(), the current process cannot be preempted. */
            preempt_disable();
            cpu = smp_processor_id();
            rq = cpu_rq(cpu);
            /* Note an rcu_sched quiescent state on this cpu. */
            rcu_sched_qs(cpu);
            /* prev points to the current task_struct. */
            prev = rq->curr;
            /* Take the address of the current task's involuntary context
             * switch count (nivcsw). */
            switch_count = &prev->nivcsw;

            /* kernel_flag is "the big kernel lock". This spinlock is taken and
             * released recursively by lock_kernel() and unlock_kernel(). It is
             * transparently dropped and reacquired over schedule(). It is used
             * to protect legacy code that hasn't been migrated to a proper
             * locking design yet.
             * In task_struct there is a member lock_depth, initialized to -1,
             * meaning the task holds no kernel lock; lock_depth >= 0 means it
             * owns the kernel lock. A task being switched away must not keep
             * holding the kernel lock, so schedule() calls
             * release_kernel_lock() to drop it. */
            release_kernel_lock(prev);
    need_resched_nonpreemptible:

            schedule_debug(prev);

            if (sched_feat(HRTICK))
                    hrtick_clear(rq);

            /* Take this runqueue's spinlock with local interrupts disabled. */
            raw_spin_lock_irq(&rq->lock);
            /* Update rq's clock; this calls sched_clock_cpu(). */
            update_rq_clock(rq);
            /* Clear TIF_NEED_RESCHED in the task's thread_info flags so it is
             * not immediately rescheduled: it is about to give up the cpu. */
            clear_tsk_need_resched(prev);

            if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
                    if (unlikely(signal_pending_state(prev->state, prev)))
                            prev->state = TASK_RUNNING;
                    else
                            deactivate_task(rq, prev, 1);
                    /* The task blocked voluntarily: count it in nvcsw instead. */
                    switch_count = &prev->nvcsw;
            }

            /* For non-SMP builds, pre_schedule() is empty. */
            pre_schedule(rq, prev);

            if (unlikely(!rq->nr_running))
                    idle_balance(cpu, rq);

            put_prev_task(rq, prev);
            next = pick_next_task(rq);

            if (likely(prev != next)) {
                    sched_info_switch(prev, next);
                    perf_event_task_sched_out(prev, next);

                    rq->nr_switches++;
                    rq->curr = next;
                    ++*switch_count;

                    context_switch(rq, prev, next); /* unlocks the rq */
                    /*
                     * The context switch might have flipped the stack from under
                     * us, hence refresh the local variables.
                     */
                    cpu = smp_processor_id();
                    rq = cpu_rq(cpu);
            } else
                    raw_spin_unlock_irq(&rq->lock); /* the current task keeps the cpu */

            post_schedule(rq);

            if (unlikely(reacquire_kernel_lock(current) < 0)) {
                    prev = rq->curr;
                    switch_count = &prev->nivcsw;
                    goto need_resched_nonpreemptible;
            }

            preempt_enable_no_resched();
            if (need_resched())
                    goto need_resched;
    }
    EXPORT_SYMBOL(schedule);
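The heavy lifting inside context_switch() reduces to two steps: switch the memory context, then switch the CPU state. Below is a simplified skeleton of that function from the same era of kernel/sched.c; details such as the lazy-TLB handling of kernel threads (where next->mm is NULL) are deliberately omitted.

    static inline void
    context_switch(struct rq *rq, struct task_struct *prev,
                   struct task_struct *next)
    {
            prepare_task_switch(rq, prev, next);
            /* Switch the virtual memory context (page tables etc.)
             * from prev's address space to next's. */
            switch_mm(prev->active_mm, next->mm, next);
            /* Switch register state and the kernel stack; when prev is
             * scheduled back in, it resumes right after this line. */
            switch_to(prev, next, prev);
            barrier();
            finish_task_switch(this_rq(), prev);
    }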
The purpose of schedule() is to replace the currently running process with another one. Its main job is therefore to set the variable next so that it points to the descriptor of the process chosen to replace current. If no runnable process in the system has a higher priority than current, next ends up equal to current and no process switch takes place.
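Where does next come from? pick_next_task() asks each scheduling class, from highest priority down, for a candidate, with a fast path straight to the fair class when only CFS tasks are runnable. Roughly, paraphrased from kernel/sched.c of the same 2.6.3x era:

    static inline struct task_struct *pick_next_task(struct rq *rq)
    {
            const struct sched_class *class;
            struct task_struct *p;

            /* Fast path: if every runnable task on this rq is a CFS task,
             * ask the fair class directly -- this is where the rbtree's
             * leftmost node gets picked. */
            if (likely(rq->nr_running == rq->cfs.nr_running)) {
                    p = fair_sched_class.pick_next_task(rq);
                    if (likely(p))
                            return p;
            }

            /* Otherwise walk the classes from highest priority down
             * (rt, fair, idle); the idle class always returns a task,
             * so the loop terminates. */
            class = sched_class_highest;
            for (;;) {
                    p = class->pick_next_task(rq);
                    if (p)
                            return p;
                    class = class->next;
            }
    }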
References
[1] Understanding the Linux Kernel, p. 276.
[2] Linux Kernel Development, p. 52.
[3] http://hi.baidu.com/zengzhaonong/item/20d9e8207b04cb8f6e2cc323