This article is translated from Jon Skeet's blog series "Edulinq".
Original article address:
http://msmvps.com/blogs/jon_skeet/archive/2010/12/27/reimplementing-linq-to-objects-part-9-selectmany.aspx
The next operator we implement is the most important one in LINQ: SelectMany. Most (perhaps all) of the other operators that return a sequence can be implemented by calling SelectMany. Let's implement it first.
What is SelectMany?
SelectMany has four overloads, each of which looks scarier than the last:
public static IEnumerable<TResult> SelectMany<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, IEnumerable<TResult>> selector)

public static IEnumerable<TResult> SelectMany<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, int, IEnumerable<TResult>> selector)

public static IEnumerable<TResult> SelectMany<TSource, TCollection, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, IEnumerable<TCollection>> collectionSelector,
    Func<TSource, TCollection, TResult> resultSelector)

public static IEnumerable<TResult> SelectMany<TSource, TCollection, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, int, IEnumerable<TCollection>> collectionSelector,
    Func<TSource, TCollection, TResult> resultSelector)
In fact, it is not too bad. These overloads are just different forms of the same operation.
Every overload takes an input sequence. A delegate is then applied to each element of the input sequence to generate a subsequence. This delegate may optionally also take the index of the element.
Then we either return the elements of each subsequence directly, or apply another delegate to them. That delegate takes the original element from the input sequence and an element from the corresponding subsequence.
In my experience, the two overloads that take an index are rarely used, while the other two (the first and third listed above) are used fairly often. Also, whenever the C# compiler encounters a from clause other than the first one in a query expression, it translates it into a call to the third overload above.
To make this concrete, suppose we have the following query expression:
var query = from file in Directory.GetFiles("logs")
            from line in File.ReadLines(file)
            select Path.GetFileName(file) + ":" + line;
The query expression above is translated into the following "normal" call:
var query = Directory.GetFiles("logs")
                     .SelectMany(file => File.ReadLines(file),
                                 (file, line) => Path.GetFileName(file) + ":" + line);
In this example, the compiler uses the select clause as the final projection. If the second from clause were followed by a where clause or any other clause, the compiler would instead wrap file and line in an anonymous type and pass that on to the following operations. This is the hardest part of query expression translation to understand, because it involves transparent identifiers. For now, we only look at the simple example given above.
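As a rough illustration of the anonymous-type case, here is a minimal sketch of a query with a where clause after the second from clause, next to an approximation of what the compiler produces. The in-memory "files" and "logs" data are hypothetical stand-ins for the file system, and the parameter name "pair" is only for illustration (the real transparent identifier has a compiler-generated name).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the log directory (not from the article).
string[] files = { "test1.log", "test2.log" };
var logs = new Dictionary<string, string[]>
{
    ["test1.log"] = new[] { "foo", "", "bar" },
    ["test2.log"] = new[] { "", "baz" },
};

// A where clause follows the second from clause, so "file" and "line"
// both have to stay in scope for the later clauses.
var viaQuery = from file in files
               from line in logs[file]
               where line.Length > 0
               select file + ":" + line;

// Approximately what the compiler generates: SelectMany packs the pair
// into an anonymous type, and the later operators unpack it again.
var viaMethods = files
    .SelectMany(file => logs[file], (file, line) => new { file, line })
    .Where(pair => pair.line.Length > 0)
    .Select(pair => pair.file + ":" + pair.line);

foreach (string entry in viaMethods)
{
    Console.WriteLine(entry);
}
```

Both queries produce the same sequence; the anonymous type exists only so that the where and select clauses can still see both range variables.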
In the example above, SelectMany takes three arguments:
- the input sequence, which is a sequence of strings (the file names returned by Directory.GetFiles)
- an initial projection, which converts a file name into the sequence of lines contained in that file
- a final projection, which converts a file name and one line of its content into a single string separated by a colon
The result of the expression is a sequence of strings containing every line of every log file, each prefixed with the file name. If you print the result, it will look something like this:
test1.log:foo
test1.log:bar
test1.log:baz
test2.log:Second log file
test2.log:Another line from the second log file
It can take a little effort to get your head around SelectMany, but it is a very important operator to understand.
Before we get to the tests, a few details of SelectMany's behavior need spelling out:
- Argument validation is performed immediately, and no argument may be null.
- Everything is streamed. Only one element is read from the input sequence at a time, and a subsequence is generated from it. Then only one element of that subsequence is returned at a time. After all the elements of the subsequence have been returned, the next element of the input sequence is read and used to generate the next subsequence.
- Each iterator is disposed when it has been finished with, as you would expect.
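The streaming behavior described above can be observed with a small sketch. It uses the framework's own System.Linq SelectMany rather than the Edulinq implementation, and the helper names Source and Children, as well as the log list, are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();

// Input sequence that records when each element is actually read.
IEnumerable<int> Source()
{
    for (int i = 1; i <= 2; i++)
    {
        log.Add("source yields " + i);
        yield return i;
    }
}

// Subsequence generator that records when iteration over it begins.
IEnumerable<string> Children(int n)
{
    log.Add("subsequence for " + n + " starts");
    yield return n + "a";
    yield return n + "b";
}

// Building the query reads nothing from the source.
var query = Source().SelectMany(n => Children(n));
log.Add("query created");

foreach (string item in query)
{
    log.Add("got " + item);
}

foreach (string entry in log)
{
    Console.WriteLine(entry);
}
```

The log starts with "query created", and the second source element is only read after both elements of the first subsequence have been returned.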
What should we test?
I have been a little lazy here: I have not written tests for null arguments. I have written one test for each overload of SelectMany. I found it hard to write these tests clearly, but here is one example, the test for the most complicated overload:
[Test]
public void FlattenWithProjectionAndIndex()
{
    int[] numbers = { 3, 5, 20, 15 };
    var query = numbers.SelectMany((x, index) => (x + index).ToString().ToCharArray(),
                                   (x, c) => x + ": " + c);
    // 3 => "3: 3"
    // 5 => "5: 6"
    // 20 => "20: 2", "20: 2"
    // 15 => "15: 1", "15: 8"
    query.AssertSequenceEqual("3: 3", "5: 6", "20: 2", "20: 2", "15: 1", "15: 8");
}
To explain this test:
- Each number is added to its index (3 + 0, 5 + 1, 20 + 2, 15 + 3).
- The result of the addition is converted into a string and then into a character array. (We did not actually need to call ToCharArray, because string already implements IEnumerable<char>, but it makes things clearer.)
- Each character of the subsequence is then combined with the original element, in the form "original element: subsequence character".
The comments show the output produced for each input element, and the last line of the test asserts the complete output sequence.
Clear as mud? I hope the step-by-step breakdown above makes it clearer. Now let's make the test pass.
Let's implement it!
We could implement SelectMany by implementing only the most complicated overload and having the other overloads call it, or by writing a single "Impl" method and having all four public overloads call it. For example, the simplest overload could be implemented like this:
public static IEnumerable<TResult> SelectMany<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, IEnumerable<TResult>> selector)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (selector == null)
    {
        throw new ArgumentNullException("selector");
    }
    return SelectManyImpl(source,
                          (value, index) => selector(value),
                          (originalElement, subsequenceElement) => subsequenceElement);
}
However, I have chosen to write a separate "SelectManyImpl" method for each overload, with the same signature as the public method. I think this makes it simpler to step through the code in a debugger later on, and it also lets us see the differences between the overloads. The code is as follows:
// Simplest overload
private static IEnumerable<TResult> SelectManyImpl<TSource, TResult>(
    IEnumerable<TSource> source,
    Func<TSource, IEnumerable<TResult>> selector)
{
    foreach (TSource item in source)
    {
        foreach (TResult result in selector(item))
        {
            yield return result;
        }
    }
}

// Most complicated overload:
// - Original projection takes index as well as value
// - There's a second projection for each original/subsequence element pair
private static IEnumerable<TResult> SelectManyImpl<TSource, TCollection, TResult>(
    IEnumerable<TSource> source,
    Func<TSource, int, IEnumerable<TCollection>> collectionSelector,
    Func<TSource, TCollection, TResult> resultSelector)
{
    int index = 0;
    foreach (TSource item in source)
    {
        foreach (TCollection collectionItem in collectionSelector(item, index++))
        {
            yield return resultSelector(item, collectionItem);
        }
    }
}
The similarity between the two methods is obvious, but I still think it is useful to keep the first form. With the simplest overload it is easy to see what SelectMany does, and from there it is not such a big jump to understand the remaining overloads. In a sense, the first overload is a conceptual stepping stone to understanding SelectMany.
There are two points to note:
- If C# had a "yield foreach selector(item)" expression, the first method above could be implemented slightly more simply. Using it in the second method would be harder and would probably involve a call to Select, which would likely make it more trouble than it is worth.
- In the second method I have not explicitly used a "checked" block, even though "index" could well overflow. I have not looked at what the BCL does, but I doubt it uses "checked". For the sake of consistency, I should probably either wrap every index operation in a "checked" block or turn "checked" on for the whole assembly.
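To make the overflow concern concrete, here is a small sketch of what happens when a counter like "index" is incremented past int.MaxValue, with and without a checked context. It is a standalone illustration, not part of the Edulinq code, and the variable names are hypothetical.

```csharp
using System;

int index = int.MaxValue;

// In an unchecked context the increment silently wraps around
// to the most negative int value.
int wrapped = index;
unchecked
{
    wrapped++;
}

// In a checked context the same increment throws OverflowException.
bool threw = false;
try
{
    int strict = index;
    checked
    {
        strict++;
    }
}
catch (OverflowException)
{
    threw = true;
}

Console.WriteLine(wrapped == int.MinValue);
Console.WriteLine(threw);
```

In practice the source would need more than int.MaxValue elements before this mattered, which is why the choice is mostly about consistency.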
Implementing other operators by calling SelectMany
I mentioned earlier that many LINQ operators can be implemented by calling SelectMany. The following code illustrates the point, with Select, Where and Concat implemented using SelectMany:
public static IEnumerable<TResult> Select<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TResult> selector)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (selector == null)
    {
        throw new ArgumentNullException("selector");
    }
    return source.SelectMany(x => Enumerable.Repeat(selector(x), 1));
}

public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (predicate == null)
    {
        throw new ArgumentNullException("predicate");
    }
    return source.SelectMany(x => Enumerable.Repeat(x, predicate(x) ? 1 : 0));
}

public static IEnumerable<TSource> Concat<TSource>(
    this IEnumerable<TSource> first,
    IEnumerable<TSource> second)
{
    if (first == null)
    {
        throw new ArgumentNullException("first");
    }
    if (second == null)
    {
        throw new ArgumentNullException("second");
    }
    return new[] { first, second }.SelectMany(x => x);
}
Select and Where use Enumerable.Repeat as a convenient way to create a sequence containing either a single element or no elements at all. You could also create an array instead of using Repeat. Concat uses an array directly: if you think of SelectMany as combining multiple sequences into one, the implementation of Concat looks natural. I suspect that Empty and Repeat could be implemented through recursion, although the performance would be poor.
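As a sketch of the array-based alternative mentioned above, the following uses the framework's System.Linq SelectMany directly instead of wrapping it in new operators. The sample data and variable names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] numbers = { 1, 2, 3, 4, 5, 6 };

// Select-like projection: every element maps to a one-element array.
IEnumerable<int> doubled = numbers.SelectMany(x => new[] { x * 2 });

// Where-like filter: matching elements map to a one-element array,
// the others to an empty array.
IEnumerable<int> evens = numbers.SelectMany(x => x % 2 == 0 ? new[] { x } : new int[0]);

Console.WriteLine(string.Join(", ", doubled));
Console.WriteLine(string.Join(", ", evens));
```

Allocating an array per element is not cheaper than Enumerable.Repeat, so this is a matter of taste rather than an optimization.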
For now, the code above lives in a conditional compilation block. If there is demand for more operators implemented via SelectMany, I may consider splitting them out into a separate project. However, I feel that the code above is enough to show how flexible SelectMany is, and reimplementing more operators with it would probably not make the point any better.
In a theoretical sense, SelectMany is also important because it is what gives LINQ its monadic nature. I do not want to say more on this topic here. You can read Wes Dyer's blog, or simply search for "bind monad SelectMany" to find plenty of articles by people smarter than me.
Conclusion
SelectMany is a core LINQ operator that seems daunting at first. But once you understand that SelectMany combines multiple sequences into one, it is easy to understand.
Next time we will discuss All and Any, two operators that can be explained together.