0.033 seconds of art --- for vs. foreach
For personal use only, do not reprint, do not use for any commercial purposes.
For a long time, the debate over the pros and cons of for and foreach in C# seems never to have stopped. Rumor and truth are so mixed together that it looks like a very complicated question. Fans of for believe that foreach allocates an enumerator internally, which generates garbage and hurts performance. Supporters of foreach believe that the compiler applies special optimizations to foreach and that it is also safer (this is what Effective C# says). So which of these is true?
In fact, the biggest rumor of all is that foreach always allocates an enumerator! Experiments by Cornflower Blue concluded that:
1. Collection<T> does allocate an enumerator.
2. Most types in the System.Collections.Generic namespace, such as List, Queue, and LinkedList, do not allocate an enumerator.
3. If a type from item 2 is converted to an interface, foreach does allocate an enumerator, for example foreach (var item in (IEnumerable)list).
If you have doubts about this, you can run a simple test with CLR Profiler. foreach is not as bad as we think.
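One quick way to see why the three cases differ, without a profiler, is to look at what GetEnumerator returns. The sketch below is not part of the original experiment, and the names EnumeratorKinds and ReturnsStructEnumerator are made up for illustration. It uses reflection to check whether a collection's public GetEnumerator method returns a value type (which foreach can use without a heap allocation) or an interface (which has to be a heap object).

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class EnumeratorKinds
{
    // Hypothetical helper: true when the public parameterless GetEnumerator()
    // of the given type returns a struct rather than an interface or class.
    public static bool ReturnsStructEnumerator(Type collectionType)
    {
        return collectionType
            .GetMethod("GetEnumerator", Type.EmptyTypes)
            .ReturnType.IsValueType;
    }

    public static void Main()
    {
        Console.WriteLine(ReturnsStructEnumerator(typeof(List<int>)));        // True
        Console.WriteLine(ReturnsStructEnumerator(typeof(Queue<int>)));       // True
        Console.WriteLine(ReturnsStructEnumerator(typeof(LinkedList<int>)));  // True
        Console.WriteLine(ReturnsStructEnumerator(typeof(Collection<int>)));  // False
        Console.WriteLine(ReturnsStructEnumerator(typeof(IEnumerable<int>))); // False
    }
}
```

The last line matches item 3: once a List is viewed through an interface, foreach can only see the interface's GetEnumerator, so the struct enumerator has to be boxed.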
Apart from the rumors about memory usage, what about performance? Which one is faster? Let's start testing. We should first understand, of course, that the relative performance of for and foreach differs from type to type. To keep the discussion simple, we will only compare arrays and List below. The test framework is as follows:
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;

    class Program
    {
        static void Main(string[] args)
        {
            int maxCount = 10000;
            List<int> list = new List<int>(maxCount);
            int[] arr = new int[maxCount];
            for (int i = 0; i < maxCount; i++)
            {
                arr[i] = i;
                list.Add(i);
            }
            int result = 0;
            Stopwatch timer = new Stopwatch();
            // Collect garbage up front so a GC pause does not distort the timing
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();
            GC.WaitForPendingFinalizers();
            timer.Start();
            for (int j = 0; j < 1000; j++)
            {
                // Run test case here
            }
            timer.Stop();
            Console.WriteLine(timer.ElapsedMilliseconds);
        }
    }
First, compare the array-based performance:
Test Case 1:
    for (int i = 0; i < arr.Length; i++)
        result = arr[i];
Test Case 2:
    foreach (int i in arr)
        result = i;
The two results are almost identical, with the for version slightly faster than the foreach version. Die-hard foreach fans may be unconvinced and blame measurement error, so let's analyze why this happens by disassembling the code with ildasm. The for version generates the following IL code:
    IL_005c: ldc.i4.0
    IL_005d: stloc.s V_7       // initialize the loop variable
    IL_005f: br.s IL_006d      // jump to the loop condition
    IL_0061: ldloc.2           // push the array onto the evaluation stack
    IL_0062: ldloc.s V_7       // push the current index
    IL_0064: ldelem.i4         // load the current element
    IL_0065: stloc.s result    // store it in result
    IL_0067: ldloc.s V_7
    IL_0069: ldc.i4.1
    IL_006a: add               // increment the index
    IL_006b: stloc.s V_7
    IL_006d: ldloc.s V_7
    IL_006f: ldloc.2
    IL_0070: ldlen             // push the array length onto the evaluation stack
    IL_0071: conv.i4
    IL_0072: blt.s IL_0061     // test whether the loop continues
Foreach version:
    IL_005f: ldc.i4.0
    IL_0060: stloc.s CS$7$0001
    IL_0062: br.s IL_0075
    IL_0064: ldloc.s CS$6$0000
    IL_0066: ldloc.s CS$7$0001
    IL_0068: ldelem.i4
    IL_0069: stloc.s V_7
    IL_006b: ldloc.s V_7
    IL_006d: stloc.s result
    IL_006f: ldloc.s CS$7$0001
    IL_0071: ldc.i4.1
    IL_0072: add
    IL_0073: stloc.s CS$7$0001
    IL_0075: ldloc.s CS$7$0001
    IL_0077: ldloc.s CS$6$0000
    IL_0079: ldlen
    IL_007a: conv.i4
    IL_007b: blt.s IL_0064
As we can see, the two listings are very similar, which debunks another rumor: the compiler does not apply any special optimization to foreach. The foreach version has two more instructions than the for version, 18 in total, and those two instructions are what make foreach slightly slower than for. I find the extra instructions rather strange: the code first copies the value read from the array into the temporary variable V_7 and only then assigns it to result, whereas the for version has no such step. At first I thought the .NET 2.0 compiler simply did not optimize well enough, but after switching to 3.5 SP1 I still got the same code. I am neither a compiler expert nor an IL expert, and I really don't understand why. Note that for complex value types such as Matrix, these two extra instructions make things worse: value types are always copied by value, so each assignment of a Matrix has to copy at least 16 int-sized values. In actual tests, after changing int to Matrix, foreach was three times slower than for. This result may come as a surprise! Of course, reference types have no such problem.
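Read back into C#, the foreach IL for the array corresponds roughly to the hand-written expansion below. This is a sketch reconstructed from the listing, not compiler output; the names ArrayForeachExpansion, copy, index, and item are illustrative stand-ins for the compiler-generated locals CS$6$0000, CS$7$0001, and V_7.

```csharp
using System;

public static class ArrayForeachExpansion
{
    // Roughly what `foreach (int i in arr) result = i;` looks like at the IL level.
    public static int LastViaExpandedForeach(int[] arr)
    {
        int result = 0;
        int[] copy = arr;                                  // CS$6$0000: array reference copied once
        for (int index = 0; index < copy.Length; index++)  // CS$7$0001: hidden index
        {
            int item = copy[index];                        // V_7: the extra temporary
            result = item;                                 // second store, absent in the for version
        }
        return result;
    }

    // The plain for loop from Test Case 1, for comparison.
    public static int LastViaFor(int[] arr)
    {
        int result = 0;
        for (int i = 0; i < arr.Length; i++)
            result = arr[i];                               // element stored directly into result
        return result;
    }

    public static void Main()
    {
        int[] arr = { 1, 2, 3 };
        Console.WriteLine(LastViaExpandedForeach(arr)); // 3
        Console.WriteLine(LastViaFor(arr));             // 3
    }
}
```

With int the extra store is cheap, but if the element type were a large struct, the IL would copy the whole value into item and then again into result, which is consistent with the Matrix slowdown described above. Whether the JIT removes the copy is outside what this sketch shows.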
Next, we will test the list:
Test Case 3:
    for (int i = 0; i < list.Count; i++)
        result = list[i];
Test Case 4:
    foreach (int i in list)
        result = i;
Compared with a simple array, the code that foreach generates for List is quite different. You can even see higher-level constructs such as try and finally:
    .try
    {
      IL_0064: br.s IL_0073
      IL_0066: ldloca.s CS$5$0000
      IL_0068: call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::get_Current()
      IL_006d: stloc.s V_7
      IL_006f: ldloc.s V_7
      IL_0071: stloc.s result
      IL_0073: ldloca.s CS$5$0000
      IL_0075: call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::MoveNext()
      IL_007a: brtrue.s IL_0066
      IL_007c: leave.s IL_008c
    } // end .try
    finally
    {
      IL_007e: ldloca.s CS$5$0000
      IL_0080: constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>
      IL_0086: callvirt instance void [mscorlib]System.IDisposable::Dispose()
      IL_008b: endfinally
    } // end handler
For List, foreach iterates through an enumerator rather than a simple index. You might object: didn't you say at the beginning that List does not allocate an enumerator? Indeed, although an enumerator is used here, no heap memory is allocated to create it, because List`1/Enumerator is a value type. If Microsoft optimized foreach anywhere, this is the place. Note that the value still goes through an extra temporary assignment before it reaches result.
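The try/finally listing for List can be read back into C# roughly as follows. Again this is a hand-written sketch based on the IL, with illustrative names (ListForeachExpansion, enumerator, item); the real compiler-generated local is CS$5$0000.

```csharp
using System;
using System.Collections.Generic;

public static class ListForeachExpansion
{
    // Roughly what `foreach (int i in list) result = i;` looks like at the IL level.
    public static int LastViaExpandedForeach(List<int> list)
    {
        int result = 0;
        // List<int>.Enumerator is a struct, so this local needs no heap allocation.
        List<int>.Enumerator enumerator = list.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int item = enumerator.Current; // V_7: the extra temporary
                result = item;
            }
        }
        finally
        {
            enumerator.Dispose();              // the Dispose call in the finally block
        }
        return result;
    }

    public static void Main()
    {
        List<int> list = new List<int> { 1, 2, 3 };
        Console.WriteLine(LastViaExpandedForeach(list)); // 3
    }
}
```

Each iteration costs two method calls (MoveNext and Current) instead of one indexer call, which fits the observation that the foreach listing is considerably longer than the for listing.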
For version:
    IL_005c: ldc.i4.0
    IL_005d: stloc.s V_7
    IL_005f: br.s IL_0071
    IL_0061: ldloc.1
    IL_0062: ldloc.s V_7
    IL_0064: callvirt instance !0 class [mscorlib]System.Collections.Generic.List`1<int32>::get_Item(int32)
    IL_0069: stloc.s result
    IL_006b: ldloc.s V_7
    IL_006d: ldc.i4.1
    IL_006e: add
    IL_006f: stloc.s V_7
    IL_0071: ldloc.s V_7
    IL_0073: ldloc.1
    IL_0074: callvirt instance int32 class [mscorlib]System.Collections.Generic.List`1<int32>::get_Count()
    IL_0079: blt.s IL_0061
The for version for List is similar to the one for the array. From the IL we can see that the for version is much simpler than the foreach version, although foreach is of course safer. In actual tests, for is about twice as fast as foreach. Also note that the for version's IL calls get_Count() on every iteration to obtain the current length of the List. Unlike Array.Length, List.Count corresponds to a method (get_Count, invoked here through callvirt), so it is not inlined into the loop initialization. If you think this is redundant, you can change Test Case 3 as follows:
Test Case 5:
    int count = list.Count;
    for (int i = 0; i < count; i++)
        result = list[i];
Of course, this also increases the possibility of an error in your program.
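The risk of caching Count, and the extra safety of foreach mentioned earlier, can be illustrated with a small sketch. It is not one of the original test cases, and the class and method names (CachedCountPitfall, CachedCountThrows, ForeachDetectsModification) are invented for this example. Both loops remove an element while iterating: the cached count goes stale and the indexer eventually runs past the end, while foreach notices the modification on its next step.

```csharp
using System;
using System.Collections.Generic;

public static class CachedCountPitfall
{
    // for loop with a cached count: the list shrinks, the cached value does not.
    public static bool CachedCountThrows()
    {
        List<int> list = new List<int> { 1, 2, 3, 4 };
        int count = list.Count;               // cached as in Test Case 5
        try
        {
            for (int i = 0; i < count; i++)
            {
                if (list[i] == 2)
                    list.RemoveAt(i);         // list now has 3 elements, count is still 4
            }
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;                      // list[3] no longer exists
        }
    }

    // foreach: the enumerator detects that the list changed during iteration.
    public static bool ForeachDetectsModification()
    {
        List<int> list = new List<int> { 1, 2, 3, 4 };
        try
        {
            foreach (int item in list)
            {
                if (item == 2)
                    list.Remove(item);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;                      // thrown by the next MoveNext call
        }
    }

    public static void Main()
    {
        Console.WriteLine(CachedCountThrows());          // True
        Console.WriteLine(ForeachDetectsModification()); // True
    }
}
```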
Finally, the test results also show that, with for, looping over an array is about 5 times faster than looping over a List :)