As multi-core computers become more and more common, it is annoying that ordinary loops cannot make use of the extra cores. Fortunately, the folks at Microsoft have solved this problem for us: they have made it possible for loops to use those extra cores. I will show this with some code. .NET 4.0 includes a new feature called the Task Parallel Library (TPL). With this library, it is very easy to write managed code that uses multiple cores. We can write code as parallel tasks, which run on the available processors at the same time. For CPU-bound loops whose iterations are independent, this can significantly speed up execution.
Here is the sample code, starting with the single-core version:
    using System;

    namespace ParallelForSample
    {
        public class SingleCore
        {
            public static void Calculate(int calcVal)
            {
                // Utility comes from the sample download; Start/Stop bracket the measured region.
                Utility util = new Utility();
                util.Start();

                int[,] G = new int[calcVal, calcVal];

                // Three nested sequential loops over the calcVal x calcVal array.
                for (int k = 0; k < calcVal; k++)
                    for (int i = 0; i < calcVal; i++)
                        for (int j = 0; j < calcVal; j++)
                            G[i, j] = Math.Min(G[i, j], G[i, k] + G[k, j]);

                util.Stop();
            }
        }
    }
As you can see, this is a fairly simple class: it uses three nested loops to fill the array. Now let's rewrite this code using the TPL (Task Parallel Library):
    using System;
    using System.Threading.Tasks;

    namespace ParallelForSample
    {
        public class MultiCore
        {
            public static void Calculate(int calcVal)
            {
                // Utility comes from the sample download; Start/Stop bracket the measured region.
                Utility util = new Utility();
                util.Start();

                int[,] G = new int[calcVal, calcVal];

                // The two outer loops become Parallel.For calls; each loop body is passed as a delegate.
                // G stays all zeros here, so the result does not depend on the order in which k runs;
                // with real data the k iterations depend on each other and should not run in parallel.
                Parallel.For(0, calcVal, delegate(int k)
                {
                    Parallel.For(0, calcVal, delegate(int i)
                    {
                        for (int j = 0; j < calcVal; j++)
                            G[i, j] = Math.Min(G[i, j], G[i, k] + G[k, j]);
                    });
                });

                util.Stop();
            }
        }
    }

As you can see, the syntax is slightly different. The for loop is replaced by Parallel.For, which requires a delegate. The lines that make up the loop body are what goes into this delegate. If you download and run this example, you can observe the difference in behavior and in loop execution time.
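To look at the delegate form on its own, here is a minimal sketch that is not part of the download. The class name ParallelForSketch and the method Squares are made up for illustration; the only library call is Parallel.For(int, int, Action<int>) from System.Threading.Tasks. Each iteration writes to its own array slot, so the iterations are independent and can safely run in any order.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical illustration class, not from the sample download.
public static class ParallelForSketch
{
    // Fills result[i] with i * i. Every iteration touches only its own slot,
    // so no two iterations share data.
    public static long[] Squares(int count)
    {
        long[] result = new long[count];

        // The loop body is passed as an anonymous delegate taking the index.
        Parallel.For(0, count, delegate(int i)
        {
            result[i] = (long)i * i;
        });

        return result;
    }

    public static void Main()
    {
        long[] squares = Squares(10);
        Console.WriteLine(string.Join(", ", squares));
        // prints: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81
    }
}
```

The iterations may run in any order, but the output is always the same because each result is stored by index rather than appended.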
However, you need a multi-core machine to see this gap. If it is a single-core machine, you cannot observe it.
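If you are not sure how many cores your machine reports, or you want to approximate the single-core case on a multi-core machine, something like the following sketch may help. It is an assumption-laden illustration, not the article's code: CoreCheckSketch, TimeWithCores and the dummy square-root workload are invented here, while Environment.ProcessorCount, ParallelOptions.MaxDegreeOfParallelism and Stopwatch are standard .NET members.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical illustration class, not from the sample download.
public static class CoreCheckSketch
{
    // Runs a CPU-bound dummy workload with at most maxCores concurrent
    // iterations and returns the elapsed time.
    public static TimeSpan TimeWithCores(int maxCores, int size)
    {
        ParallelOptions options = new ParallelOptions();
        options.MaxDegreeOfParallelism = maxCores;

        double[] sink = new double[size];
        Stopwatch watch = Stopwatch.StartNew();

        Parallel.For(0, size, options, delegate(int i)
        {
            // Arbitrary arithmetic, only there to keep the core busy.
            double acc = 0;
            for (int j = 1; j <= 20000; j++)
                acc += Math.Sqrt(i + j);
            sink[i] = acc;
        });

        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        int cores = Environment.ProcessorCount;
        Console.WriteLine("Logical processors: " + cores);
        Console.WriteLine("Limited to 1:   " + TimeWithCores(1, 2000));
        Console.WriteLine("Using all " + cores + ": " + TimeWithCores(cores, 2000));
    }
}
```

On a single-core machine the two timings should be roughly the same; on a multi-core machine the second is usually shorter, though the exact numbers depend on the hardware and on what else is running.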
Translator's note:
My computer configuration is as follows:
Running result:
We can see that the gap is close to 10 seconds.
Code: /files/zhuqil/parallelforsample.zip
Original article: Faster-faster-Loops