The iterator pattern is a behavioral design pattern, one of the patterns that simplify communication between objects. It is also one of the easiest patterns to understand and use. Simply put, the iterator pattern lets you access all the elements in a sequence without caring whether the sequence is an array, a list, a linked list, or some other structure. This makes it very effective for building data pipelines, in which data enters a processing channel, goes through a series of transformations or filters, and comes out as a result. In fact, this is the core pattern behind LINQ.
In .NET, the iterator pattern is encapsulated by the IEnumerator and IEnumerable interfaces and their generic counterparts. If a class implements IEnumerable, it can be iterated over; calling its GetEnumerator method returns an implementation of IEnumerator, which is the iterator itself. An iterator resembles a database cursor: it marks a position within a sequence of data. An iterator can only move forward, and several iterators can operate on the same sequence at the same time.
Support for iterators has been built into the language since C# 1 in the form of the foreach statement, which makes iterating over a collection simpler and more direct than a for loop. The compiler translates foreach into calls to the GetEnumerator and MoveNext methods and the Current property, and if the iterator implements IDisposable, it is disposed after the iteration completes. Implementing an iterator in C# 1, however, is relatively cumbersome. C# 2 makes the job much simpler and saves a lot of work.
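As a rough sketch of what foreach amounts to, the loop can be written out by hand as below. The exact expansion depends on the static type of the collection, and the names ForEachExpansion and PrintAll are made up for this illustration.

```csharp
using System;
using System.Collections;

public static class ForEachExpansion
{
    // Roughly what "foreach (object item in sequence) Console.WriteLine(item);"
    // turns into when the sequence is only known as a non-generic IEnumerable.
    // Returns the number of items visited.
    public static int PrintAll(IEnumerable sequence)
    {
        int count = 0;
        IEnumerator iterator = sequence.GetEnumerator();
        try
        {
            while (iterator.MoveNext())          // advance the cursor
            {
                object item = iterator.Current;  // read the element at the cursor
                Console.WriteLine(item);
                count++;
            }
        }
        finally
        {
            // Release the iterator if it holds resources.
            IDisposable disposable = iterator as IDisposable;
            if (disposable != null)
            {
                disposable.Dispose();
            }
        }
        return count;
    }

    public static void Main()
    {
        PrintAll(new string[] { "a", "b", "c" });
    }
}
```

The try/finally is the part that is easy to forget when iterating manually; foreach adds it for you.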
Next, let's look at how to implement an iterator by hand, how C# 2 simplifies the implementation, and then at several examples of iterators in real-world code.
1. C# 1: The tedium of implementing an iterator by hand
Suppose we need to implement a new collection type based on a ring buffer. We will implement the IEnumerable interface so that users can easily access all the elements in the collection. We ignore the other details and focus only on the iterator. The collection stores its values in an array and lets the starting point of the iteration be set. For example, if the collection has 5 elements and the starting point is 2, the iteration visits the indexes 2, 3, 4, 0 and finally 1.
To keep the demonstration simple, we provide a constructor that takes the values and the starting point, so that we can iterate over the collection like this:

    object[] values = { "A", "B", "C", "D", "E" };
    IterationSample collection = new IterationSample(values, 3);
    foreach (object x in collection)
    {
        Console.WriteLine(x);
    }

Since we set the starting point to 3, the output is D, E, A, B and C. Now let's look at how to implement an iterator for the IterationSample class:

    using System;
    using System.Collections;

    class IterationSample : IEnumerable
    {
        object[] values;
        Int32 startingPoint;

        public IterationSample(object[] values, Int32 startingPoint)
        {
            this.values = values;
            this.startingPoint = startingPoint;
        }

        public IEnumerator GetEnumerator()
        {
            throw new NotImplementedException();
        }
    }
We have not implemented GetEnumerator yet. How should its logic be written? The first question is where to keep the current state of the cursor. An important aspect of the iterator pattern is that it does not return all the data at once; the client requests one element at a time. This means we have to keep track of how far through the collection the client has got. (The C# 2 compiler does a lot of this state-tracking work for us.)
Now let's consider what state needs to be saved and where it should live. Imagine that we try to keep the state in the IterationSample collection itself, so that the class implements both IEnumerable and IEnumerator. At first glance this seems reasonable; after all, the data is in the right place, including the starting point, and GetEnumerator could simply return this. However, this approach has a serious problem: if GetEnumerator is called several times, it should return several independent iterators. For example, we should be able to use two nested foreach statements to get all possible pairs of values, and the two iterations must not interfere with each other. This means that each call to GetEnumerator must return a separate iterator object. We could still implement that inside the IterationSample class with some extra bookkeeping, but the class would then have several responsibilities, which violates the single responsibility principle.
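To make the nested-foreach requirement concrete, here is a small sketch that builds every pair from one sequence. The class and method names are invented for the example, and it uses an array, whose GetEnumerator already hands out independent iterators. If a collection returned this from GetEnumerator instead, the inner loop would advance the same cursor as the outer one and the pairs would come out wrong.

```csharp
using System;
using System.Collections.Generic;

public static class IndependentIterators
{
    // Builds every ordered pair from the same sequence with nested foreach loops.
    public static List<string> AllPairs(IEnumerable<string> values)
    {
        List<string> pairs = new List<string>();
        foreach (string outer in values)        // first iterator
        {
            foreach (string inner in values)    // second, independent iterator
            {
                pairs.Add(outer + inner);
            }
        }
        return pairs;
    }

    public static void Main()
    {
        foreach (string pair in AllPairs(new string[] { "a", "b" }))
        {
            Console.WriteLine(pair);
        }
    }
}
```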
So let's create another class to implement the iterator itself. We use a nested class in C# so that it can reach the private members of the collection. The code is as follows:

    class IterationSampleEnumerator : IEnumerator
    {
        IterationSample parent;   // the collection being iterated #1
        Int32 position;           // the current cursor position #2

        internal IterationSampleEnumerator(IterationSample parent)
        {
            this.parent = parent;
            // Array indexes start at 0, so the cursor starts at -1,
            // that is, before the first element #3
            position = -1;
        }

        public bool MoveNext()
        {
            // Only advance the cursor if we are not already past the end #4
            if (position != parent.values.Length)
            {
                position++;
            }
            return position < parent.values.Length;
        }

        public object Current
        {
            get
            {
                // Access before the first or after the last element is illegal #5
                if (position == -1 || position == parent.values.Length)
                {
                    throw new InvalidOperationException();
                }
                // Take the custom starting point into account #6
                Int32 index = position + parent.startingPoint;
                index = index % parent.values.Length;
                return parent.values[index];
            }
        }

        public void Reset()
        {
            position = -1;   // move the cursor back to before the first element #7
        }
    }
Even a simple iterator needs this much hand-written code. We have to remember the collection being iterated (#1) and the current cursor position (#2), and when returning an element we have to work out the array index from the current cursor and the collection's starting point (#6). On initialization the position is set before the first element (#3), so the first use of the iterator must call MoveNext before reading the Current property. The position is checked when the cursor is incremented (#4), so that the first call to MoveNext does not fail even when there are no elements to return. Resetting the iterator puts the cursor back before the first element (#7).
The code above is straightforward, but it is easy to get wrong the calculation that combines the current cursor position with the custom starting point. Now we only need to return our iterator class from the GetEnumerator method of IterationSample:

    public IEnumerator GetEnumerator()
    {
        return new IterationSampleEnumerator(this);
    }

Note that this is a relatively simple example: there is not much state to track, and we do not check whether the collection has changed during the iteration. Even so, a simple iterator takes this much code in C# 1. Using foreach with the framework's own collections that implement IEnumerable is convenient, but writing our own iterable collection requires all of this work.
In C# 1, implementing a simple iterator takes about 40 lines of code. Now let's see how C# 2 improves the process.
2. C# 2: Simplifying iterators with yield statements
2.1 Introducing iterator blocks and the yield return statement
C# 2 makes iterators much easier to implement: there is less code, and the code is more elegant. The following is the complete implementation of the GetEnumerator method in C# 2:

    public IEnumerator GetEnumerator()
    {
        for (int index = 0; index < this.values.Length; index++)
        {
            yield return values[(index + startingPoint) % values.Length];
        }
    }
These few lines fully replace the functionality of the IterationSampleEnumerator class. The method looks perfectly ordinary apart from the use of yield return. That statement tells the compiler that this is not a normal method but an iterator block, and that the method returns an IEnumerator object. An iterator block can be used to implement a method whose return type is IEnumerable, IEnumerator, or one of their generic equivalents. If the method returns a non-generic interface, the yield type of the iterator block is object; otherwise it is the type argument of the generic interface. For example, if the method returns IEnumerable<string>, the yield type is string.

A normal return statement is not allowed in an iterator block; only yield return (and yield break, covered in section 2.3) can be used. Every yield return statement in the block must return a value compatible with the yield type. For example, a method declared to return IEnumerable<string> cannot yield return 1.

One point deserves emphasis: although the method we write looks as if it executes sequentially, we are really asking the compiler to create a state machine for us. This is exactly the part of the code we wrote by hand in C# 1: the caller wants only one value at a time, so we have to remember where we were in the collection when we last returned a value. When the compiler encounters an iterator block, it creates a nested class that implements the state machine. That class remembers exactly where we are in the iterator block, together with the values of local variables, including parameters. It is somewhat like the class we wrote earlier, in that it keeps all the state it needs in instance variables. Here is what the state machine has to do in order to implement an iterator:
- It needs some initial state.
- When MoveNext is called, it has to execute code from the GetEnumerator method until the next value to return is ready.
- When the Current property is read, it has to return the most recently yielded value.
- It has to know when the iteration has finished, so that MoveNext can return false.
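The four requirements above can be sketched as a hand-written state machine. This is only an approximation for illustration, not the code the compiler really emits, and the class name TwoValueIterator is invented here. It mimics an iterator block containing just "yield return 10; yield return 20;".

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written approximation of a state machine for:
//     yield return 10;
//     yield return 20;
public sealed class TwoValueIterator : IEnumerator<int>
{
    // 0 = not started, 1 = paused after the first yield,
    // 2 = paused after the second yield, -1 = finished
    private int state = 0;
    private int current;   // the most recently yielded value

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0:                 // run up to the first yield
                current = 10;
                state = 1;
                return true;
            case 1:                 // resume and run up to the second yield
                current = 20;
                state = 2;
                return true;
            default:                // nothing left: the iteration has ended
                state = -1;
                return false;
        }
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}

public static class StateMachineDemo
{
    public static void Main()
    {
        using (TwoValueIterator iterator = new TwoValueIterator())
        {
            while (iterator.MoveNext())
            {
                Console.WriteLine(iterator.Current);
            }
        }
    }
}
```

Reading Current here runs no iteration logic at all; every step happens inside MoveNext, which matches the behavior described for compiler-generated iterators.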
Let's look at the order in which an iterator executes.
2.2 Execution flow of an iterator
The following code shows the execution flow of an iterator. The iterator yields 0, 1, 2 and -1, and then terminates.

    using System;
    using System.Collections.Generic;

    class Program
    {
        static readonly string Padding = new string(' ', 30);

        static IEnumerable<Int32> CreateEnumerable()
        {
            Console.WriteLine("{0}CreateEnumerable() method start", Padding);
            for (int i = 0; i < 3; i++)
            {
                Console.WriteLine("{0}About to yield {1}", Padding, i);
                yield return i;
                Console.WriteLine("{0}After yield", Padding);
            }
            Console.WriteLine("{0}Yielding last value", Padding);
            yield return -1;
            Console.WriteLine("{0}CreateEnumerable() method end", Padding);
        }

        static void Main(string[] args)
        {
            IEnumerable<Int32> iterable = CreateEnumerable();
            IEnumerator<Int32> iterator = iterable.GetEnumerator();
            Console.WriteLine("Start iteration");
            while (true)
            {
                Console.WriteLine("Calling MoveNext ...");
                Boolean result = iterator.MoveNext();
                Console.WriteLine("MoveNext returned {0}", result);
                if (!result)
                {
                    break;
                }
                Console.WriteLine("Getting Current ...");
                Console.WriteLine("Current is {0}", iterator.Current);
            }
            Console.ReadKey();
        }
    }

To show the details of the iteration, the code above uses a while loop; normally you would use foreach. Unlike last time, the iterator method here returns IEnumerable<Int32> rather than IEnumerator<Int32>. In general, you return an IEnumerator only when implementing the GetEnumerator method of IEnumerable; if you want a method to return a sequence of data, use IEnumerable.
From the output, you can see the following:
- None of the code in CreateEnumerable runs until the first call to MoveNext.
- All the work is done in the call to MoveNext; reading the Current property does not execute any of our code.
- Execution stops right after a yield return and resumes on the next call to MoveNext.
- A method can contain several yield return statements.
- The code does not finish after the last yield return. The method only ends when a further call to MoveNext returns false.
The first point is especially important: it means you cannot put code in an iterator block that has to run immediately when the method is called, such as argument validation. If you put argument validation in the iterator block, it will not work as intended. This is a common mistake, and one that is hard to spot.
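The pitfall with argument validation can be shown in a few lines. In this sketch (Repeat and ThrowsOnlyOnMoveNext are hypothetical names chosen for the example), the null check sits inside the iterator block, so calling the method with a bad argument does not fail until iteration actually starts.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredValidation
{
    // The null check lives inside the iterator block, so it is deferred.
    public static IEnumerable<string> Repeat(string text, int count)
    {
        if (text == null)
        {
            throw new ArgumentNullException("text");
        }
        for (int i = 0; i < count; i++)
        {
            yield return text;
        }
    }

    // Returns true if the exception appears only once iteration starts.
    public static bool ThrowsOnlyOnMoveNext()
    {
        IEnumerable<string> sequence = Repeat(null, 3);   // no exception here
        using (IEnumerator<string> iterator = sequence.GetEnumerator())   // still none
        {
            try
            {
                iterator.MoveNext();   // the check finally runs
                return false;
            }
            catch (ArgumentNullException)
            {
                return true;
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsOnlyOnMoveNext());
    }
}
```

The exception surfaces at the first MoveNext, possibly far from the call that passed the bad argument, which is why such bugs are hard to trace.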
Next, let's see how to stop an iteration, and the special way finally blocks execute.
2.3 Special execution flow of iterators
In a normal method, the return statement does two things: it gives the caller a result, and it ends the execution of the method, running any applicable finally blocks on the way out. In the example above, we saw that yield return only exits the method temporarily, and execution continues when MoveNext is called again. We did not write any finally blocks there. So how do we really exit the method, and how do finally blocks execute when we do? Let's start with a relatively simple construct: the yield break statement.
Ending an iteration with yield break
We usually try to give a method a single exit point, because multiple exit points tend to make code hard to read, especially when try/catch/finally blocks are used for resource cleanup and exception handling. The same concern applies to iterator blocks, but if you want to leave an iteration early, yield break does what you need. It ends the iteration immediately, so that the current call to MoveNext returns false.
The following code counts from 1 to 100 but stops early if a time limit is reached.

    static IEnumerable<Int32> CountWithTimeLimit(DateTime limit)
    {
        try
        {
            for (int i = 1; i <= 100; i++)
            {
                if (DateTime.Now >= limit)
                {
                    yield break;
                }
                yield return i;
            }
        }
        finally
        {
            Console.WriteLine("Stop iteration!");
            Console.ReadKey();
        }
    }

    static void Main(string[] args)
    {
        DateTime stop = DateTime.Now.AddSeconds(2);
        foreach (Int32 i in CountWithTimeLimit(stop))
        {
            Console.WriteLine("Return {0}", i);
            Thread.Sleep(300);   // pause between items (300 ms, matching the later example)
        }
    }

As the output shows, the iteration terminates normally: yield break behaves like a return statement in a normal method. Next, let's look at when and how the finally block is executed.
How finally blocks execute
Normally, a finally block executes when execution leaves its scope. A finally block in an iterator block behaves a little differently from one in a normal method. As we have seen, yield return pauses the method rather than exiting it, so at that point the finally block is not executed.
When a yield break statement is reached, however, the finally block does execute, just as it would for return in a normal method. finally blocks are commonly used in iterator blocks to release resources, typically through a using statement.
Let's see how the finally block executes in practice.
Whether the iteration reaches 100, stops because time ran out, or throws an exception, the finally block always executes. There are cases, though, in which we might expect the finally block not to run.
Code in an iterator block only runs when MoveNext is called. So what happens if we call MoveNext a few times and then stop calling it? Take a look at the following code:

    DateTime stop = DateTime.Now.AddSeconds(2);
    foreach (Int32 i in CountWithTimeLimit(stop))
    {
        if (i > 3)
        {
            Console.WriteLine("Returning");
            return;
        }
        Thread.Sleep(300);
    }

After the return statement inside the foreach loop, the finally block in CountWithTimeLimit still executes. This is because foreach calls the Dispose method of the iterator returned by GetEnumerator. When Dispose is called on an iterator created from an iterator block before the iteration has finished, the state machine executes every finally block that is in scope at the point where the code is currently paused. This is somewhat complicated, but the result is easy to state: as long as you iterate with foreach, the finally blocks in an iterator block execute as expected. The following code can be used to verify this:

    IEnumerable<Int32> iterable = CountWithTimeLimit(stop);
    IEnumerator<Int32> iterator = iterable.GetEnumerator();
    iterator.MoveNext();
    Console.WriteLine("Return {0}", iterator.Current);
    iterator.MoveNext();
    Console.WriteLine("Return {0}", iterator.Current);
    Console.ReadKey();

In the output of this code, "Stop iteration!" is never printed; it appears only if we call the iterator's Dispose method manually. It is rare to abandon an iterator before it has finished, and equally rare to iterate manually rather than with foreach. If you do iterate manually, remember to wrap the iterator in a using statement so that its Dispose method is called and the finally block runs.
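A minimal sketch of the using-statement advice, with invented names (CountTo, SumFirstTwo, FinallyRan) and a flag instead of console output so the effect can be checked:

```csharp
using System;
using System.Collections.Generic;

public static class ManualIteration
{
    public static bool FinallyRan;   // set by the iterator's finally block

    public static IEnumerable<int> CountTo(int max)
    {
        try
        {
            for (int i = 1; i <= max; i++)
            {
                yield return i;
            }
        }
        finally
        {
            FinallyRan = true;
        }
    }

    // Reads only the first two values, then abandons the iterator.
    public static int SumFirstTwo()
    {
        int sum = 0;
        using (IEnumerator<int> iterator = CountTo(100).GetEnumerator())
        {
            iterator.MoveNext();
            sum += iterator.Current;
            iterator.MoveNext();
            sum += iterator.Current;
        }   // Dispose runs here and triggers the pending finally block
        return sum;
    }

    public static void Main()
    {
        FinallyRan = false;
        Console.WriteLine(SumFirstTwo());
        Console.WriteLine(FinallyRan);
    }
}
```

Without the using statement, the iterator would stay paused at the second yield return and the finally block would never run.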
Now let's look at some quirks of Microsoft's implementation of iterators.
2.4 Special behavior in iterator execution
If you compile an iterator block with the C# 2 compiler and then inspect the generated IL with ildasm or Reflector, you will find that behind the scenes the compiler has generated a nested type for us. Viewed in ildasm, the generated IL shows the two static methods from our code alongside <CountWithTimeLimit>d__0, the class the compiler generated (the angle brackets are simply part of the class name and have nothing to do with generics). The IL also shows which interfaces the class implements and which methods and fields it has. Its structure is broadly similar to the iterator we implemented by hand.
The real logic is executed in the MoveNext method, which contains a large switch statement. Fortunately, as developers we do not need to understand these details, but a few aspects of how these iterators behave are worth noting:
- Before MoveNext is called for the first time, the Current property always returns the default value of the iterator's yield type. For example, for an IEnumerable<Int32> the default value is 0, so reading Current before calling MoveNext returns 0.
- After MoveNext has returned false, the Current property keeps returning the last value that was yielded.
- The Reset method throws an exception, whereas the hand-written iterator at the start of this article implemented Reset properly.
- The nested class generated by the compiler implements both the generic and non-generic versions of IEnumerator (and, where appropriate, the generic and non-generic versions of IEnumerable).
There is a reason why Reset is not implemented: the compiler cannot know what logic would be needed to set the iterator back to the start. Many people think Reset should never have been part of the interface; many collections do not support it, so callers should not rely on it.
Implementing the extra interfaces does no harm. When a method returns IEnumerable, the generated class implements five interfaces (including IDisposable), and as a developer you do not need to worry about this. It is unusual for one class to implement both IEnumerable and IEnumerator; the compiler takes care to make it behave correctly, and manages to create only a single instance of the nested type in the common case where the collection is iterated on the thread that created it.
The behavior of the Current property is a little odd: because it holds on to the last value the iterator yielded, it can keep that object from being garbage collected.
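Two of these quirks are easy to observe directly. The sketch below uses made-up names (IteratorQuirks, OneTwo) and relies on the behavior of Microsoft's compiler as described above, which is an implementation detail rather than something to depend on in production code.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorQuirks
{
    public static IEnumerable<int> OneTwo()
    {
        yield return 1;
        yield return 2;
    }

    // Reads Current before any call to MoveNext.
    public static int CurrentBeforeStart()
    {
        using (IEnumerator<int> iterator = OneTwo().GetEnumerator())
        {
            return iterator.Current;   // default(int) with the Microsoft compiler
        }
    }

    // Returns true if Reset is unsupported on the generated iterator.
    public static bool ResetThrows()
    {
        using (IEnumerator<int> iterator = OneTwo().GetEnumerator())
        {
            try
            {
                iterator.Reset();
                return false;
            }
            catch (NotSupportedException)
            {
                return true;
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(CurrentBeforeStart());
        Console.WriteLine(ResetThrows());
    }
}
```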
So automatically implemented iterators have a few minor flaws, but sensible developers will not run into problems with them, and using them saves a great deal of code, which makes iterators far more widely applicable than they were in C# 1. Next, let's see where iterators simplify code in real-world development.
3. Examples of iterators in real-world development
3.1 Iterating over the dates in a timetable
When working with a range of dates, we usually write a loop like this:

    for (DateTime day = timetable.StartDate; day <= timetable.EndDate; day = day.AddDays(1))
    {
        ...
    }

A loop like this is not as intuitive or expressive as an iteration. Here the intent is "each day in the time period", which is exactly what foreach is for. Written as an iteration, the code reads better:

    foreach (DateTime day in timetable.DateRange)
    {
        ...
    }

Achieving this in C# 1 takes a fair amount of work. In C# 2 it is easy. The Timetable class only needs one extra property:

    public IEnumerable<DateTime> DateRange
    {
        get
        {
            for (DateTime day = StartDate; day <= EndDate; day = day.AddDays(1))
            {
                yield return day;
            }
        }
    }

This just moves the loop inside the Timetable class, but the encapsulation is better as a result. The DateRange property iterates over each day in the period, returning one day at a time. If the logic needs to become more complicated, only one place has to change. This small change makes the code much more readable, and a natural next step is to generalize the idea into a generic Range<T> type.
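For reference, here is a self-contained sketch of the idea. The article does not show the rest of the Timetable class, so the constructor, the field names and the CountDays helper are assumptions made purely so the example compiles and runs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal Timetable; only DateRange comes from the article.
public class Timetable
{
    private readonly DateTime startDate;
    private readonly DateTime endDate;

    public Timetable(DateTime startDate, DateTime endDate)
    {
        this.startDate = startDate;
        this.endDate = endDate;
    }

    // Yields each day from startDate to endDate, inclusive.
    public IEnumerable<DateTime> DateRange
    {
        get
        {
            for (DateTime day = startDate; day <= endDate; day = day.AddDays(1))
            {
                yield return day;
            }
        }
    }
}

public static class TimetableDemo
{
    public static int CountDays(Timetable timetable)
    {
        int count = 0;
        foreach (DateTime day in timetable.DateRange)
        {
            count++;
        }
        return count;
    }

    public static void Main()
    {
        Timetable week = new Timetable(new DateTime(2024, 1, 1), new DateTime(2024, 1, 7));
        foreach (DateTime day in week.DateRange)
        {
            Console.WriteLine(day.ToString("yyyy-MM-dd"));
        }
    }
}
```

Callers no longer need to know how the range is stepped; they only see a sequence of dates.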
3.2 Iterating over the lines in a file
When reading a file, we often write code like this:

    using (TextReader reader = File.OpenText(fileName))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            ...
        }
    }

There are four separate concerns in this code:
- How the TextReader is obtained
- Managing the lifetime of the TextReader
- Iterating over all the lines returned by TextReader.ReadLine
- Processing each line
This can be improved in two ways. One is to use a delegate: write a helper method that takes a reader and a delegate as parameters, calls the delegate for each line, and closes the reader at the end. This is often used as an example of closures and delegates. The other improvement is more elegant and more in the spirit of LINQ. Instead of passing the logic in as a method parameter, we can use an iterator to return one line at a time, so that the caller can use a foreach statement. The code is as follows:

    static IEnumerable<string> ReadLines(string fileName)
    {
        using (TextReader reader = File.OpenText(fileName))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }

This lets us read a file with the following foreach loop:

    foreach (string line in ReadLines("test.txt"))
    {
        Console.WriteLine(line);
    }

The body of the method is the same as before, except that yield return is used to return each line as it is read. The only difference is in what happens at the end of the iteration. Previously the sequence was: open the file, read a line at a time, and close the reader when reading finishes. The notion of "when reading finishes" resembles the earlier code's use of using, but with an iterator the end point is more explicit: it is the moment the iteration ends.
This is why it matters that foreach calls the iterator's Dispose method when the loop ends: it ensures that the reader is released. The using statement in the iterator method acts like a try/finally block, and the finally part executes either when the end of the file is reached or when we explicitly call Dispose on the IEnumerator<string>. It is possible to leak the resource by calling ReadLines(...).GetEnumerator(), iterating manually over the returned IEnumerator<string>, and never calling Dispose. Since iteration is normally done with foreach, this problem rarely occurs, but it is worth being aware of.
The method encapsulates the first three of the four concerns, which may be too restrictive. Encapsulating the lifetime management and the iteration is fine, but suppose we want to read from a network stream, or use UTF-8 encoding. We would then need to expose the first concern, obtaining the reader, to the caller, so that the method takes the TextReader as a parameter, roughly ReadLines(TextReader reader).
This is a bad idea, however. We want to own the reader so that we can clean it up on the caller's behalf. The problem is that if something goes wrong before the first call to MoveNext(), we never get a chance to clean up the resource. IEnumerable<string> itself cannot be disposed, yet it would be holding state that needs cleaning up. Another problem arises if GetEnumerator is called twice: that should return two independent iterators, but both would use the same reader. We could change the return type to IEnumerator<string>, but then the result could not be used with foreach, and the resource would still not be cleaned up if execution never reached the first call to MoveNext.
Fortunately, there is a way to solve these problems. Just as the code does not have to run immediately, we do not need the reader immediately. What we need is a way of saying "if you need a TextReader, we can provide one". We could define an interface for that, but a single-method interface is usually better expressed as a delegate. .NET 3.5 has a delegate with the following signature:

    public delegate TResult Func<TResult>();

The delegate takes no parameters and returns a value of the type given by its type parameter. We want to obtain a TextReader, so we can use Func<TextReader>. The code is as follows:

    using (TextReader reader = provider())
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
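The fragment above only shows the body of the iterator. One way the surrounding method and some convenience overloads might look is sketched below; the LineReader class name and the two file-based overloads are assumptions for illustration, not code from the article.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineReader
{
    // The provider is only invoked when iteration starts, and once per iterator,
    // so each call to GetEnumerator gets its own reader.
    public static IEnumerable<string> ReadLines(Func<TextReader> provider)
    {
        using (TextReader reader = provider())
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }

    // Hypothetical convenience overloads built on the delegate version.
    public static IEnumerable<string> ReadLines(string fileName, Encoding encoding)
    {
        return ReadLines(delegate { return new StreamReader(fileName, encoding); });
    }

    public static IEnumerable<string> ReadLines(string fileName)
    {
        return ReadLines(fileName, Encoding.UTF8);
    }

    public static void Main()
    {
        // A StringReader stands in for a file so the example needs no I/O.
        IEnumerable<string> lines =
            ReadLines(delegate { return new StringReader("first\nsecond"); });
        foreach (string line in lines)
        {
            Console.WriteLine(line);
        }
    }
}
```

Because the reader is created inside the iterator block, it is opened no earlier than the first MoveNext and is always covered by the using statement, which addresses both problems described above.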
3.3 Lazily filtering a collection with an iterator block and a predicate
LINQ provides a simple yet powerful way to query many kinds of data source, such as in-memory collections or databases. C# 2 has no query expressions, lambda expressions or extension methods, but we can still achieve something similar.
One of the core features of LINQ is filtering data with the Where method. Given a collection and a predicate delegate, the results are matched lazily during iteration: each time an element satisfies the predicate, it is returned. This is a little like the List<T>.FindAll method, but LINQ supports lazy evaluation for any object that implements IEnumerable<T>. Although LINQ itself only arrived with C# 3, we can use what we already know to implement something like its Where method. The code is as follows:

    public static IEnumerable<T> Where<T>(IEnumerable<T> source, Predicate<T> predicate)
    {
        // Validate eagerly, in a normal method
        if (source == null || predicate == null)
        {
            throw new ArgumentNullException();
        }
        return WhereImpl(source, predicate);
    }

    private static IEnumerable<T> WhereImpl<T>(IEnumerable<T> source, Predicate<T> predicate)
    {
        // Filter lazily, in an iterator block
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }

    IEnumerable<string> lines = ReadLines("FakeLinq.cs");
    Predicate<string> predicate = delegate(string line)
    {
        return line.StartsWith("using");
    };
    // Assumed usage: print the lines that pass the filter
    foreach (string line in Where(lines, predicate))
    {
        Console.WriteLine(line);
    }

As the code shows, the implementation is split into two parts: argument validation and the actual logic. This may look strange, but it is necessary for error handling. If both parts were in a single method and the user called Where<string>(null, null), nothing would go wrong at that point; at least, the exception we expect would not be thrown. This is due to the lazy evaluation of iterator blocks: as we saw in section 2.2, none of the code in the method body runs until the first call to MoveNext when the user starts iterating. Argument checks should normally happen eagerly, because a delayed exception surfaces far from the call that caused it and makes bugs hard to track down. The standard practice, as in the code above, is to split the method in two: a normal method that validates the arguments, and an iterator block that lazily performs the main logic.
The body of the iterator block is straightforward: it applies the predicate delegate to each element of the collection and yields the element if the condition is met. If not, it moves on to the next element until one does match. Implementing this logic in C# 1 would be difficult, not least because C# 1 has no generics.
The usage code above reads data with the earlier ReadLines method and then uses our Where method to keep the lines that start with using. The biggest difference from implementing the same logic with File.ReadAllLines and Array.FindAll<string> is that our approach is completely lazy and streaming: only one line is held in memory and processed at a time. For a small file this makes no difference, but for a large one, such as a log file of several gigabytes, the advantage becomes clear.
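The streaming behavior can be observed without a large file. In this sketch (the LazyFilter class, the Numbers source and the ItemsRead counter are invented for the demonstration), the source counts how many items it has produced, and the consumer stops at the first match.

```csharp
using System;
using System.Collections.Generic;

public static class LazyFilter
{
    public static int ItemsRead;   // how many items the source has produced so far

    // A source of 1..1000 that records each item it hands out.
    public static IEnumerable<int> Numbers()
    {
        for (int i = 1; i <= 1000; i++)
        {
            ItemsRead++;
            yield return i;
        }
    }

    public static IEnumerable<T> Where<T>(IEnumerable<T> source, Predicate<T> predicate)
    {
        if (source == null || predicate == null)
        {
            throw new ArgumentNullException();
        }
        return WhereImpl(source, predicate);
    }

    private static IEnumerable<T> WhereImpl<T>(IEnumerable<T> source, Predicate<T> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }

    // Returns the first even number, reading only as much of the source as needed.
    public static int FirstEven()
    {
        ItemsRead = 0;
        foreach (int n in Where(Numbers(), delegate(int x) { return x % 2 == 0; }))
        {
            return n;
        }
        return -1;
    }

    public static void Main()
    {
        Console.WriteLine(FirstEven());
        Console.WriteLine(ItemsRead);
    }
}
```

Only two of the thousand items are ever produced, because each element is pulled through the filter on demand. An eager approach such as Array.FindAll would have to materialize the whole source first.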
4. Summary
C# supports many design patterns indirectly, through features that make them easy to implement; relatively few features directly implement a particular pattern. As foreach shows, C# 1 supported consuming iterators directly, but gave no real help with implementing them for your own collections. Implementing IEnumerable correctly for a collection was time-consuming, error-prone and tedious. In C# 2, the compiler does much of the work for us by generating a state machine that implements the iteration.
This article also showed a LINQ-like feature: filtering a collection. IEnumerable<T> is one of the most important interfaces in LINQ, and if you ever implement your own LINQ to Objects operators, you will come to appreciate both the power of this interface and the usefulness of the iterator blocks that C# provides.
This article also showed examples from practical projects in which iterator blocks make code more readable and better structured. Hopefully these examples help you understand iterators.