A simple example of Java multithreading

Source: Internet
Author: User

The task: given a list of mobile phone numbers (200,000 entries) and a list of words (100,000 entries), determine which phone numbers never appear in the word list and which appear in it more than once.

The most direct approach is a two-level nested loop. It is a brute-force method, but I could not come up with anything better.

Start with a single-threaded version. The code was written quickly and has not been refactored; it is only meant as a simple illustration:

package tool;

import java.util.List;

public class SingleThread {

    public static void main(String[] args) {
        SingleThread st = new SingleThread();
        String userIdPath = "D:\\shell\\store_bak\\tool\\userid.txt";
        List<String> userIds = Util.readUserId(userIdPath);
        List<String> cdrItems = Util.readCdrItem();
        st.process(userIds, cdrItems);
    }

    /**
     * Scans every record once for each user ID line and prints the elapsed time in seconds.
     *
     * @param userIds  lines read from the user ID file
     * @param cdrItems records to search
     */
    private void process(List<String> userIds, List<String> cdrItems) {
        long startTime = System.currentTimeMillis();
        int count = 0;
        for (String key : userIds) {
            // The code expects two whitespace-separated keys per line.
            String[] uninKeys = key.split("\\s+");
            count = 0;
            for (String cdr : cdrItems) {
                // A record matches only when both keys appear as '|'-delimited fields.
                if (cdr.contains("|" + uninKeys[0] + "|")
                        && cdr.contains("|" + uninKeys[1] + "|")) {
                    count++;
                }
            }
            // count now holds the number of matching records for this line.
        }
        System.out.println((System.currentTimeMillis() - startTime) / 1000);
    }
}

The code in Util is not shown; it only performs simple file reads. The whole process is not fast, and most of the time is spent in the contains method. The first version used regular-expression matching rather than contains, but that turned out to be inefficient, so contains was used instead. The efficiency was still not ideal, so the next step is to use multithreading.
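The substring check described above can be looked at in isolation. The following is a minimal sketch, assuming (as the matching code suggests) that each record is a line of fields separated by `|`. The class name `DelimitedMatch`, both method names, and the sample record are made up for illustration. The regular-expression variant is only one possible formulation, not necessarily the one the author first tried.

```java
import java.util.regex.Pattern;

class DelimitedMatch {
    // True when `key` occurs as a complete '|'-delimited field inside `record`.
    static boolean containsField(String record, String key) {
        return record.contains("|" + key + "|");
    }

    // Regex formulation of the same check; Pattern.quote treats '|' and any
    // metacharacters in the key as literal text.
    static boolean containsFieldRegex(String record, String key) {
        return Pattern.compile(Pattern.quote("|" + key + "|")).matcher(record).find();
    }

    public static void main(String[] args) {
        String record = "|13800000000|20240101|13900000000|"; // hypothetical record
        System.out.println(containsField(record, "13800000000"));      // true: whole field
        System.out.println(containsField(record, "1380000"));          // false: only a prefix
        System.out.println(containsFieldRegex(record, "13900000000")); // true
    }
}
```

Wrapping the key in delimiters is what prevents a short number from matching inside a longer one. It also means the first and last fields only match if the record itself starts and ends with `|`.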

This differs from the traditional producer-consumer model: there are effectively only consumers here, because producing the raw data takes almost no time. The easiest approach is to define a shared index that the threads increment by 1 under mutual exclusion. The index is a shared variable and therefore needs synchronization. Using the AtomicInteger provided by the JDK directly, the code is as follows:

package tool;

import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class MutiThread {

    // Shared index into userIds; each worker claims the next entry atomically.
    private static AtomicInteger lock = new AtomicInteger(0);

    public static void main(String[] args) {
        MutiThread tool = new MutiThread();
        String userIdPath = "D:\\shell\\store_bak\\tool\\userid.txt";
        List<String> userIds = Util.readUserId(userIdPath);
        List<String> cdrItems = Util.readCdrItem();
        tool.work2(lock, userIds, cdrItems);
    }

    public void work2(AtomicInteger lock, List<String> userIds,
            List<String> cdrItems) {
        final long startTime = System.currentTimeMillis();
        // The barrier action runs once all 5 workers have arrived and prints the elapsed seconds.
        CyclicBarrier cb = new CyclicBarrier(5, new Runnable() {
            @Override
            public void run() {
                System.out.println((System.currentTimeMillis() - startTime) / 1000);
            }
        });
        for (int i = 0; i < 5; i++) {
            new Thread(new Worker(userIds, cdrItems, lock, cb)).start();
        }
    }

    class Worker implements Runnable {
        private List<String> userIds;
        private List<String> cdrItems;
        private AtomicInteger lock;
        private CyclicBarrier cb;

        public Worker(List<String> userIds, List<String> cdrItems,
                AtomicInteger lock, CyclicBarrier cb) {
            this.userIds = userIds;
            this.cdrItems = cdrItems;
            this.lock = lock;
            this.cb = cb;
        }

        @Override
        public void run() {
            while (true) {
                // Claim the next unprocessed index; stop when the list is exhausted.
                int index = lock.getAndIncrement();
                if (index >= userIds.size()) {
                    break;
                }
                String id = userIds.get(index);
                process1(id, cdrItems);
            }
            try {
                // Wait until every worker has finished.
                cb.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
            } catch (BrokenBarrierException e) {
                e.printStackTrace();
            }
        }
    }

    private void process1(String id, List<String> cdrItems) {
        String[] uninKeys = id.split("\\s+");
        int count = 0;
        for (String cdr : cdrItems) {
            if (cdr.contains("|" + uninKeys[0] + "|")
                    && cdr.contains("|" + uninKeys[1] + "|")) {
                count++;
            }
        }
    }
}
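Because the listing above depends on the Util class, which is not shown, it cannot be run as is. The sketch below isolates the shared-index pattern in a self-contained form. It is an illustration under assumptions, not the author's code: `SharedIndexDemo`, `processAll`, and the generated items are hypothetical, the real matching work is replaced by a counter, and `Thread.join` stands in for the CyclicBarrier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class SharedIndexDemo {
    // Processes every item exactly once using `threadCount` threads that claim
    // indices from a shared counter; returns how many items were processed.
    static int processAll(List<String> items, int threadCount) {
        AtomicInteger next = new AtomicInteger(0);      // shared index
        AtomicInteger processed = new AtomicInteger(0); // stands in for the real work
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread thread = new Thread(() -> {
                while (true) {
                    int index = next.getAndIncrement(); // atomically claim one index
                    if (index >= items.size()) {
                        break;
                    }
                    String item = items.get(index);     // the item this thread now owns
                    processed.incrementAndGet();
                }
            });
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait for all workers to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return processed.get();
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            items.add("item-" + i);
        }
        System.out.println(processAll(items, 5)); // prints 1000
    }
}
```

Since getAndIncrement returns each value to exactly one caller, no two threads ever process the same index, and faster threads simply claim more items.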

Multithreading really does improve efficiency considerably, especially with large data volumes: the speedup was at least twofold. More threads is not always better, though, because the JVM's thread scheduling also consumes resources.
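One way to act on the point about thread count is to cap it by the number of processors the JVM reports. This is a common rule of thumb for CPU-bound work such as string matching, not something measured in this article. The class `ThreadCount` and its `choose` method are hypothetical names.

```java
class ThreadCount {
    // Caps the requested thread count at the number of available processors
    // and at the number of tasks, but never returns less than 1.
    static int choose(int requested, int taskCount) {
        int cores = Runtime.getRuntime().availableProcessors();
        int capped = Math.min(requested, Math.min(cores, taskCount));
        return Math.max(1, capped);
    }

    public static void main(String[] args) {
        System.out.println("cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("threads for 200000 tasks: " + choose(10, 200_000));
    }
}
```

The best value still depends on the machine and the workload, so it is worth timing a few settings rather than trusting the heuristic.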

For this scenario, the implementation of ConcurrentHashMap offers a useful idea: resources can be processed in segments, which neatly avoids contention between threads. The list can therefore be divided into segments, each handed to a different thread. The code is as follows:

package tool;

import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class MutiSegmentMutiThread {

    private static AtomicInteger lock = new AtomicInteger(0);
    private static int threadNum = 10;

    public static void main(String[] args) {
        MutiSegmentMutiThread tool = new MutiSegmentMutiThread();
        String userIdPath = "D:\\shell\\store_bak\\tool\\userid.txt";
        List<String> userIds = Util.readUserId(userIdPath);
        List<String> cdrItems = Util.readCdrItem();
        tool.work2(lock, userIds, cdrItems);
    }

    // lock is no longer used here: each worker owns a fixed segment of the list.
    public void work2(AtomicInteger lock, List<String> userIds,
            List<String> cdrItems) {
        final long startTime = System.currentTimeMillis();
        // The barrier action runs once all workers have arrived and prints the elapsed seconds.
        CyclicBarrier cb = new CyclicBarrier(threadNum, new Runnable() {
            @Override
            public void run() {
                System.out.println((System.currentTimeMillis() - startTime) / 1000);
            }
        });
        int segmentSize = userIds.size() / threadNum;
        int start = 0;
        int end = 0;
        for (int i = 0; i < threadNum; i++) {
            start = i * segmentSize;
            if (i == threadNum - 1) {
                // The last segment also takes the remainder.
                end = userIds.size();
            } else {
                end = (i + 1) * segmentSize;
            }
            new Thread(new Worker(userIds, cdrItems, cb, start, end)).start();
        }
    }

    class Worker implements Runnable {
        private List<String> userIds;
        private List<String> cdrItems;
        private CyclicBarrier cb;
        private int start;
        private int end;

        public Worker(List<String> userIds, List<String> cdrItems,
                CyclicBarrier cb, int start, int end) {
            this.userIds = userIds;
            this.cdrItems = cdrItems;
            this.cb = cb;
            this.start = start;
            this.end = end;
        }

        @Override
        public void run() {
            // Process only the half-open range [start, end).
            for (int i = start; i < end; i++) {
                String id = userIds.get(i);
                process1(id, cdrItems);
            }
            try {
                cb.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
            } catch (BrokenBarrierException e) {
                e.printStackTrace();
            }
        }
    }

    private void process1(String id, List<String> cdrItems) {
        String[] uninKeys = id.split("\\s+");
        int count = 0;
        for (String cdr : cdrItems) {
            if (cdr.contains("|" + uninKeys[0] + "|")
                    && cdr.contains("|" + uninKeys[1] + "|")) {
                count++;
            }
        }
    }
}
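The segment arithmetic in the listing above can be checked on its own. The sketch below reproduces the same boundary calculation in a small helper; the class name `Segments` and the method `split` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Segments {
    // Splits the index range [0, size) into `parts` consecutive [start, end) pairs.
    // The last segment absorbs the remainder, as in the loop above.
    static List<int[]> split(int size, int parts) {
        List<int[]> result = new ArrayList<>();
        int segmentSize = size / parts;
        for (int i = 0; i < parts; i++) {
            int start = i * segmentSize;
            int end = (i == parts - 1) ? size : (i + 1) * segmentSize;
            result.add(new int[] {start, end});
        }
        return result;
    }

    public static void main(String[] args) {
        // For 23 items and 5 parts: 0-4, 4-8, 8-12, 12-16, 16-23.
        for (int[] seg : split(23, 5)) {
            System.out.println(seg[0] + "-" + seg[1]);
        }
    }
}
```

With this scheme the last thread gets up to `parts - 1` extra items, and a fixed split cannot rebalance if some segments turn out to be slower than others. The shared-index version adapts to that automatically, at the cost of one atomic operation per item.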

In actual testing the third approach was indeed faster than the second, but the improvement was not significant. The code above only offers one way to approach the problem and can presumably be optimized further. If the data volume is very large, distributed computing is worth considering.
