Multithreaded Programming Learning Notes: Using Concurrent Collections (3)


Multithreaded Programming Learning Notes: Using Concurrent Collections (1)

Multithreaded Programming Learning Notes: Using Concurrent Collections (2)

 

 

4. Using ConcurrentBag to create a scalable web crawler

 

This example shows how to distribute a workload among multiple workers, each of which can produce and consume tasks independently.

 

1. The program code is as follows.

using System;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Threading.Tasks;

namespace ThreadCollectionDemo
{
    class Program
    {
        static Dictionary<string, string[]> contextItems = new Dictionary<string, string[]>();

        static void Main(string[] args)
        {
            Console.WriteLine("----- ConcurrentBag operation ----");
            CreateLinks();
            Task task = RunBag();
            task.Wait();
            Console.Read();
        }

        static async Task RunBag()
        {
            var taskBag = new ConcurrentBag<CrawlingTask>();
            string[] urls = new string[] { "http://www.163.com", "http://www.jd.com",
                "http://www.hexun.com", "http://www.tmall.com", "http://www.qq.com" };

            var crawlers = new Task[5];
            for (int i = 1; i <= 5; i++)
            {
                string crawlerName = "Crawler " + i.ToString();
                taskBag.Add(new CrawlingTask { UrlToCraw = urls[i - 1], ProductName = "root" });
                crawlers[i - 1] = Task.Run(() => Craw(taskBag, crawlerName));
            }
            await Task.WhenAll(crawlers);
        }

        static async Task Craw(ConcurrentBag<CrawlingTask> bag, string crawlerName)
        {
            CrawlingTask task;
            while (bag.TryTake(out task))
            {
                Console.WriteLine("{0} url retrieved from ConcurrentBag, previous node {1}, name {2}",
                    task.UrlToCraw, task.ProductName, crawlerName);
                IEnumerable<string> urls = await GetLinksFromContent(task);
                if (urls != null)
                {
                    foreach (var url in urls)
                    {
                        var t = new CrawlingTask { UrlToCraw = url, ProductName = crawlerName };
                        bag.Add(t);
                    }
                }
                if (task != null)
                {
                    Console.WriteLine("add url {0} to ConcurrentBag, previous node {1}, crawler name {2}",
                        task.UrlToCraw, task.ProductName, crawlerName);
                }
                else
                    Console.WriteLine("task is null");
            }
        }

        static async Task<IEnumerable<string>> GetLinksFromContent(CrawlingTask task)
        {
            await GetRandomDely();
            if (contextItems.ContainsKey(task.UrlToCraw))
                return contextItems[task.UrlToCraw];
            return null;
        }

        static void CreateLinks()
        {
            contextItems["http://www.163.com"] = new[] { "http://www.163.com/a.html", "http://www.163.com/b.html" };
            contextItems["http://www.jd.com"] = new[] { "http://www.jd.com/a.html", "http://www.jd.com/b.html" };
            contextItems["http://www.qq.com"] = new[] { "http://www.qq.com/1.html", "http://www.qq.com/2.html",
                "http://www.qq.com/3.html", "http://www.qq.com/4.html" };
            contextItems["http://www.tmall.com"] = new[] { "http://www.tmall.com/a.html", "http://www.tmall.com/b.html" };
            contextItems["http://www.hexun.com"] = new[] { "http://www.hexun.com/a.html", "http://www.hexun.com/b.html",
                "http://www.hexun.com/c.html", "http://www.hexun.com/d.html" };
        }

        static Task GetRandomDely()
        {
            int dely = new Random(DateTime.Now.Millisecond).Next(150, 600);
            return Task.Delay(dely);
        }
    }

    class CrawlingTask
    {
        public string UrlToCraw { get; set; }
        public string ProductName { get; set; }
    }
}

 

 

2. The running result of the program is shown below.

 

 

 

This program simulates web indexing with multiple web crawlers. At the start, we define a dictionary of web page URLs; this dictionary simulates web pages that contain links to other pages. The implementation is deliberately simple: it does not track which pages have already been indexed, but this simplicity lets us focus on the parallel workload.

 

Then, a concurrent bag is created and seeded with crawling tasks. We create five crawlers and give each of them a different root URL, then wait for all the crawlers to complete their work. Each crawler starts by checking the URL assigned to it; we wait a random amount of time to simulate network I/O. If the page contains more URLs, the crawler puts more tasks into the bag. The crawler then checks whether any crawling tasks remain in the bag; if not, the crawler finishes. The core of each crawler is a take-expand-add loop, shown as a condensed sketch below.
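The following condensed sketch isolates just that loop (it is not part of the original program; the expand delegate is a hypothetical stand-in for downloading a page and extracting its links):

// Requires: using System; using System.Collections.Generic; using System.Collections.Concurrent;
static void CrawlLoop(ConcurrentBag<string> bag, Func<string, IEnumerable<string>> expand)
{
    // Consume: take any pending URL; TryTake returns false when the bag is empty.
    string url;
    while (bag.TryTake(out url))
    {
        // Produce: push newly discovered links back into the shared bag.
        foreach (var child in expand(url))
            bag.Add(child);
    }
    // An empty bag means this worker is done, the same simplification the example makes.
}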

 

If you examine the first lines of output after the first four URLs, you will see that a task placed in the bag by crawler N is usually processed by that same crawler. The later lines, however, will differ. This is because ConcurrentBag is optimized for scenarios where the same thread both adds and removes elements. Internally, each thread keeps its own local queue of elements, so no lock is needed while a thread works with its own queue. Only when a thread's local queue is empty does it take a lock and try to "steal" work from another thread's local queue. This behavior helps distribute work among all workers while minimizing the use of locks.
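This locality is easy to observe directly. The following small demo (a sketch written for these notes, not from the original post) has three workers each add items tagged with their own name and then drain the bag; most lines show a worker taking back its own items, while items "stolen" from other workers tend to appear only after a worker's local queue runs dry. Because Task.Run schedules work on pool threads, the exact interleaving can vary from run to run.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class BagLocalityDemo
{
    static void Main()
    {
        var bag = new ConcurrentBag<string>();
        var workers = new Task[3];
        for (int i = 0; i < 3; i++)
        {
            string name = "worker" + i;
            workers[i] = Task.Run(() =>
            {
                // Adds go into this thread's local queue inside the bag.
                for (int j = 0; j < 4; j++)
                    bag.Add(name + "-item" + j);
                // Takes are served from the local queue first; stealing from
                // other threads' queues happens only when it is empty.
                string item;
                while (bag.TryTake(out item))
                    Console.WriteLine("{0} took {1}", name, item);
            });
        }
        Task.WaitAll(workers);
    }
}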

 

 

 

5. Asynchronous processing with BlockingCollection

 

This example shows how BlockingCollection simplifies implementing asynchronous processing of a workload.

 

1. The program code is as follows.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

namespace ThreadCollectionDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("----- BlockingCollection operation ----");
            Console.WriteLine("----- BlockingCollection operation queue ----");
            Task task = RunBlock();
            task.Wait();
            Console.WriteLine("----- BlockingCollection operation Stack ----");
            task = RunBlock(new ConcurrentStack<CustomTask>());
            task.Wait();
            Console.Read();
        }

        static async Task RunBlock(IProducerConsumerCollection<CustomTask> collection = null)
        {
            string name = "queue";
            var taskBlock = new BlockingCollection<CustomTask>();
            if (collection != null)
            {
                taskBlock = new BlockingCollection<CustomTask>(collection);
                name = "stack";
            }
            var taskSrc = Task.Run(() => TaskProduct(taskBlock));
            Task[] process = new Task[4];
            for (int i = 1; i <= 4; i++)
            {
                string processId = i.ToString();
                process[i - 1] = Task.Run(() => TaskProcess(taskBlock, name + processId));
            }
            await taskSrc;
            await Task.WhenAll(process);
        }

        static async Task TaskProduct(BlockingCollection<CustomTask> block)
        {
            for (int i = 0; i < 20; i++)
            {
                await Task.Delay(50);
                var workitem = new CustomTask { Id = i };
                block.Add(workitem);
                Console.WriteLine("add {0} element to BlockingCollection", workitem.Id);
            }
            block.CompleteAdding();
        }

        static async Task TaskProcess(BlockingCollection<CustomTask> collection, string name)
        {
            await GetRandomDely();
            foreach (var item in collection.GetConsumingEnumerable())
            {
                Console.WriteLine("--- Task {0} processing operation name: {1} ---", item.Id, name);
                await GetRandomDely();
            }
        }

        static Task GetRandomDely()
        {
            int dely = new Random(DateTime.Now.Millisecond).Next(1, 1000);
            return Task.Delay(dely);
        }
    }

    class CustomTask
    {
        public int Id { get; set; }
    }
}

 

 

2. The running result of the program is shown below.

 

 

 


 

Let's start with the first scenario, which uses the BlockingCollection class. This class brings many conveniences. First, we can change the way tasks are stored in the blocking collection. By default it uses a ConcurrentQueue container, but we can substitute any collection that implements the generic IProducerConsumerCollection<T> interface, as the sketch below shows.
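In other words, the ordering policy is pluggable through the constructor. A minimal sketch of three common choices:

using System.Collections.Concurrent;

class BackingCollections
{
    static void Main()
    {
        // Default: backed by ConcurrentQueue<T>, items come out FIFO.
        var fifo = new BlockingCollection<int>();

        // Backed by ConcurrentStack<T>, items come out LIFO.
        var lifo = new BlockingCollection<int>(new ConcurrentStack<int>());

        // Any IProducerConsumerCollection<T> is accepted,
        // e.g. ConcurrentBag<T> for unordered hand-off.
        var unordered = new BlockingCollection<int>(new ConcurrentBag<int>());
    }
}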

 

Each worker obtains work items by calling the GetConsumingEnumerable method and iterating over the result. If the collection has no elements, the iterator blocks the worker thread until an element is added. The iteration ends only after the producer calls CompleteAdding on the collection and every remaining item has been taken; this marks the completion of the work.

 

The producer inserts tasks into the BlockingCollection and then calls the CompleteAdding method, which allows all the workers to finish. The program output now shows two result sequences, demonstrating the difference between a blocking collection backed by a concurrent queue and one backed by a concurrent stack.
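Reduced to its essentials, the whole hand-off looks like the following sketch (assuming a single producer; with several producers you would need to coordinate who calls CompleteAdding):

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class BlockingHandOff
{
    static void Main()
    {
        var queue = new BlockingCollection<int>();

        var consumer = Task.Run(() =>
        {
            // Blocks while the collection is empty; the loop ends only after
            // CompleteAdding has been called and every item has been taken.
            foreach (var item in queue.GetConsumingEnumerable())
                Console.WriteLine("processed {0}", item);
        });

        for (int i = 0; i < 5; i++)
            queue.Add(i);        // producer side
        queue.CompleteAdding();  // signal that no more items will arrive

        consumer.Wait();
    }
}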

 

 

 
