PLINQ Summary
LINQ (Language Integrated Query) makes it easy to query and process data from different data sources. PLINQ (Parallel LINQ) offers the same features as LINQ and adds interfaces for parallel operations, so it stays easy to use while improving efficiency.
More information: https://msdn.microsoft.com/zh-cn/library/dd460688(v=vs.110).aspx
A simple example
A simple example is sufficient to illustrate the use of PLINQ.
A simple explanation:
Part I: Search a document for words that contain the letter "a". The query is first run 10,000 times with LINQ, and then the same test is run with PLINQ to see how much efficiency improves. AsParallel() is the main interface used.
Part II: Calculate the sum of all the numbers in the range of the short type, again timing the same test with both LINQ and PLINQ.
Part III: Test PLINQ's search and sorting features. The interfaces used are AsOrdered() and orderby.
AsOrdered: Keeps the results in the same order as the data in the data source.
orderby: Sorts the results in a user-specified order, ascending or descending.
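The difference between the two ordering interfaces is easy to see on a small input. The following is a minimal sketch that reuses the word list from the sample program; the class name OrderingSketch and its method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingSketch
{
    // Same word list as the sample program.
    static readonly string[] Words = { "Day", "Car", "Land", "Road", "Sea", "Mountain", "River" };

    // AsOrdered(): the filtered results keep the order of the data source.
    public static List<string> FilterKeepingSourceOrder()
    {
        return Words.AsParallel().AsOrdered()
                    .Where(w => w.Contains('a'))
                    .ToList();
    }

    // orderby: the filtered results are sorted by the given key.
    public static List<string> FilterSorted()
    {
        return (from w in Words.AsParallel()
                where w.Contains('a')
                orderby w ascending
                select w).ToList();
    }

    public static void Main()
    {
        // Source order: Day, Car, Land, Road, Sea, Mountain
        Console.WriteLine(string.Join(", ", FilterKeepingSourceOrder()));
        // Sorted order: Car, Day, Land, Mountain, Road, Sea
        Console.WriteLine(string.Join(", ", FilterSorted()));
    }
}
```

Without AsOrdered() or orderby, a PLINQ query makes no promise about the order of its results, so one of the two is needed whenever the order matters.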
Sample program:
using System;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using System.Diagnostics;
using System.Linq;
using System.IO;

namespace Sample6_1_plink_basic
{
    class Program
    {
        // Sequential sum with LINQ.
        static int SumDefault(int[] array)
        {
            return array.Sum();
        }

        // Parallel sum with PLINQ.
        static int SumAsParallel(int[] array)
        {
            return array.AsParallel().Sum();
        }

        static string[] words = { "Day", "Car", "Land", "Road", "Sea", "Mountain", "River" };

        static void Main(string[] args)
        {
            var customers = System.IO.File.ReadAllLines(@"D:\testdir\CSParallel_Program\Sample6-1 plink basic\target.txt");
            int nCounter = 0;

            // Part I: find words containing 'a', 10,000 times with LINQ.
            Console.WriteLine("============================================================");
            Console.WriteLine("TEST NORMAL LINQ");
            Console.WriteLine("============================================================");
            var swatchPn = Stopwatch.StartNew();
            for (int i = 0; i < 10000; i++)
            {
                var normalKeyLetters = from line in customers
                                       let keys = line.Split(' ')
                                       from key in keys
                                       where (key.Contains('a'))
                                       select key;
                var normalKeyList = normalKeyLetters.ToList();
                nCounter = normalKeyList.Count();
            }
            swatchPn.Stop();
            Console.WriteLine("Word with letter a = {0}", nCounter);
            Console.WriteLine("LINQ use time: {0}", swatchPn.Elapsed);
            Console.WriteLine("\n");

            // Part I again, this time with PLINQ.
            Console.WriteLine("============================================================");
            Console.WriteLine("TEST PARALLEL LINQ");
            Console.WriteLine("============================================================");
            nCounter = 0;
            var swatchP = Stopwatch.StartNew();
            for (int i = 0; i < 10000; i++)
            {
                var keyLetters = from line in customers.AsParallel()
                                 let keys = line.Split(' ')
                                 from key in keys.AsParallel()
                                 where (key.Contains('a'))
                                 select key;
                var keyList = keyLetters.ToList();
                nCounter = keyList.Count();
            }
            swatchP.Stop();
            Console.WriteLine("Word with letter a = {0}", nCounter);
            Console.WriteLine("PLINQ use time: {0}", swatchP.Elapsed);

            // Part II: generate the array and time both sum methods.
            int[] array = Enumerable.Range(0, short.MaxValue).ToArray();
            const int m = 10000;
            var s1 = Stopwatch.StartNew();
            for (int i = 0; i < m; i++)
            {
                SumDefault(array);
            }
            s1.Stop();
            var s2 = Stopwatch.StartNew();
            for (int i = 0; i < m; i++)
            {
                SumAsParallel(array);
            }
            s2.Stop();
            Console.WriteLine("\n");
            Console.WriteLine("============================================================");
            Console.WriteLine("CALCULATE SUMMARY TEST");
            Console.WriteLine("============================================================");
            // Average time per call in nanoseconds.
            Console.WriteLine("Default Summary:" + ((double)(s1.Elapsed.TotalMilliseconds * 1000000) / m).ToString("0.00 ns"));
            Console.WriteLine("Parallel Summary:" + ((double)(s2.Elapsed.TotalMilliseconds * 1000000) / m).ToString("0.00 ns"));
            Console.WriteLine("\n");

            // Part III: AsOrdered keeps the order of the data source.
            Console.WriteLine("============================================================");
            Console.WriteLine("Parallel AsOrdered Test");
            Console.WriteLine("============================================================");
            var orderWords = from word in words.AsParallel().AsOrdered()
                             where (word.Contains('a'))
                             select word;
            var orderLetterList = orderWords.ToList();
            for (int i = 0; i < orderLetterList.Count; i++)
            {
                Console.WriteLine(orderLetterList[i]);
            }
            Console.WriteLine("\n");

            // Part III: orderby sorts the results in ascending order.
            Console.WriteLine("============================================================");
            Console.WriteLine("Parallel OrderBy Test");
            Console.WriteLine("============================================================");
            var orderByWords = from word in words.AsParallel()
                               where (word.Contains('a'))
                               orderby word ascending
                               select word;
            var orderByLetterList = orderByWords.ToList();
            for (int i = 0; i < orderByLetterList.Count; i++)
            {
                Console.WriteLine(orderByLetterList[i]);
            }
            Console.ReadKey();
        }
    }
}
Test results:
The test was run several times. The word search took 32-34 seconds with LINQ and about 16-19 seconds with PLINQ, so PLINQ cut the running time roughly in half.
About data partitioning in PLINQ
PLINQ uses multiple tasks to process the same data source and then merges their results, which is what makes it more efficient. Which part of the data each parallel task handles, and how the data is split and assigned to the tasks, is very important and can seriously affect efficiency. Developers are unlikely to have real control over which partitioning scheme PLINQ uses, but understanding how it works internally helps when optimizing program performance.
PLINQ partitions data in four ways:
- Range partitioning: Used for indexed data sources, such as arrays and lists. PLINQ looks for the IList interface on the data source, and if it is implemented, it uses range partitioning to split the data into as many partitions as there are available logical cores. PLINQ knows exactly how large the data is and can access any of its elements directly.
- Chunk partitioning: Can be used for any data source and is mainly used for non-indexed data. Each task takes one chunk of data at a time, and the chunks may differ in size.
- Striped partitioning: This approach optimizes the processing of data items at the head of the data source. It is used when the query includes SkipWhile or TakeWhile. Each task handles a small group of data items (called a stripe). A task can work out which stripes belong to it with a simple calculation, so no synchronization between tasks is required.
- Hash partitioning: Primarily optimized for data comparisons. It establishes a channel between the data and the tasks, and data items with the same hash code are sent to the same task. All possible matches therefore end up in the same partition, which simplifies comparison and reduces data sharing among tasks. Distributing data this way may cost more overhead than the other schemes, but it makes comparisons more efficient.
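The four schemes above can be sketched with a few small helpers. This is a conceptual illustration only, not PLINQ's internal code: the class PartitionSketch and all of its methods, parameters, and sizes are hypothetical names and values chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Conceptual sketch only: hypothetical helpers, not PLINQ's actual implementation.
public static class PartitionSketch
{
    // Range partitioning: the size is known up front, so [0, count) is split
    // into contiguous ranges, one per partition. End is exclusive.
    public static List<(int Start, int End)> RangePartitions(int count, int partitions)
    {
        var ranges = new List<(int Start, int End)>();
        int baseSize = count / partitions;
        int remainder = count % partitions;
        int start = 0;
        for (int p = 0; p < partitions; p++)
        {
            int size = baseSize + (p < remainder ? 1 : 0);
            ranges.Add((start, start + size));
            start += size;
        }
        return ranges;
    }

    // Chunk partitioning: tasks claim the next chunk from a shared cursor,
    // which needs synchronization. Returns the start index of the claimed chunk.
    public static int ClaimChunk(ref int cursor, int chunkSize)
    {
        return Interlocked.Add(ref cursor, chunkSize) - chunkSize;
    }

    // Striped partitioning: the owning task follows from the index alone,
    // so tasks do not need to coordinate.
    public static int StripeOwner(int index, int stripeSize, int partitions)
    {
        return (index / stripeSize) % partitions;
    }

    // Hash partitioning: items with equal hash codes land in the same partition.
    public static int HashOwner<T>(T key, int partitions)
    {
        int hash = key == null ? 0 : key.GetHashCode();
        return (hash & int.MaxValue) % partitions;
    }

    public static void Main()
    {
        foreach (var r in RangePartitions(10, 4))
            Console.WriteLine("range [{0}, {1})", r.Start, r.End);

        int cursor = 0;
        Console.WriteLine("chunk starts at {0}", ClaimChunk(ref cursor, 4));
        Console.WriteLine("chunk starts at {0}", ClaimChunk(ref cursor, 8));

        for (int i = 0; i < 8; i++)
            Console.WriteLine("item {0} -> stripe owner {1}", i, StripeOwner(i, 2, 3));

        Console.WriteLine("key 7 -> partition {0}", HashOwner(7, 4));
        Console.WriteLine("key 11 -> partition {0}", HashOwner(11, 4));
    }
}
```

The sketch shows the trade-off described above: range and striped ownership are pure calculations, chunk claiming goes through a shared counter, and hash ownership depends only on the key, which is why matching items always meet in one partition.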