Big Data File Processing

Source: Internet
Author: User

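When processing large data files, a producer-consumer threading model can be used: one task reads the file line by line and submits each line to a thread pool, and the pool's worker threads take the lines from a bounded work queue and handle them. The code is implemented as follows: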

// FileProcessor.java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/**
 * File processing class.
 */
public class FileProcessor {
    /** Path of the file to read. */
    private String path = "";
    /** Default size of the work queue (2 << 12 = 8192). */
    public static final int maxWorkQueueSize = 2 << 12;
    /** Work queue for the worker threads. */
    private BlockingQueue<Runnable> workQueue = null;
    /** Data processing thread pool. */
    private ThreadPoolExecutor excutor = null;

    public FileProcessor(String file) {
        this.path = file;
        workQueue = new LinkedBlockingQueue<Runnable>(maxWorkQueueSize);
        // ThreadPoolExecutor needs both a core and a maximum pool size;
        // the source gives only "Ten", so 10 is assumed for both.
        excutor = new ThreadPoolExecutor(10, 10, 5 * 1000L, TimeUnit.MILLISECONDS, workQueue);
    }

    public FileProcessor(BlockingQueue<Runnable> workQueue, String file) {
        this.path = file;
        this.workQueue = workQueue;
        excutor = new ThreadPoolExecutor(10, 10, 5 * 1000L, TimeUnit.MILLISECONDS, workQueue);
    }

    public void process() {
        /* Start the file-reading task. */
        FileReaderProcessor fileReaderProcessor = new FileReaderProcessor(path, excutor);
        excutor.execute(fileReaderProcessor);
    }

    public static void main(String[] args) {
        FileProcessor proc = new FileProcessor("D://test");
        proc.process();
    }
}

/********************************************************/
// FileReaderProcessor.java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/**
 * File-reading thread (the producer).
 */
public class FileReaderProcessor implements Runnable {
    /** Path of the file to read. */
    private String path = "";
    private ThreadPoolExecutor excutor = null;

    public FileReaderProcessor(String file, ThreadPoolExecutor excutor) {
        this.path = file;
        this.excutor = excutor;
    }

    @Override
    public void run() {
        FileReader reader = null;
        BufferedReader br = null;
        int lineNumber = 0;
        try {
            reader = new FileReader(path);
            br = new BufferedReader(reader);
            String str = null;
            while ((str = br.readLine()) != null) {
                ++lineNumber;
                System.out.println("[" + Thread.currentThread().getName() + "] read " + lineNumber + " rows");
                /* Keep the reader from outrunning the workers: a full work queue
                   cannot accept tasks, so pause once it is more than 75% full. */
                if (excutor.getQueue().size() >= FileProcessor.maxWorkQueueSize * 0.75) {
                    System.out.println("[" + Thread.currentThread().getName() + "] sleep 5 seconds");
                    TimeUnit.SECONDS.sleep(5); /* sleep for five seconds */
                }
                excutor.submit(new DateHandlerProcessor(str));
            }
        } catch (FileNotFoundException e) {
            System.out.println("File not found error: " + e.getMessage());
        } catch (IOException e) {
            System.out.println("Read file IO error: " + e.getMessage());
        } catch (InterruptedException e) {
            System.out.println("Thread interrupt error: " + e.getMessage());
        } finally {
            /* Close resources. */
            this.close(br, reader, excutor);
        }
    }

    public void close(BufferedReader br, FileReader reader, ThreadPoolExecutor executor) {
        try {
            if (br != null) {
                br.close();
            }
            if (reader != null) {
                reader.close();
            }
            /* Close the thread pool once the work queue has drained. */
            while (executor.getQueue().size() != 0) {
                TimeUnit.SECONDS.sleep(1);
            }
            executor.shutdown();
            // This method itself runs on a pool thread, so the pool cannot finish
            // terminating while it waits here; after the 5-second timeout the
            // call falls through to shutdownNow().
            if (!executor.awaitTermination(5 * 1000L, TimeUnit.MILLISECONDS)) {
                executor.shutdownNow();
            }
        } catch (Exception e) {
            System.out.println("Close error: " + e.getMessage());
        }
    }
}

/*********************************************************/
// DateHandlerProcessor.java

/**
 * Data processing class (the consumer).
 */
public class DateHandlerProcessor implements Runnable {
    /** The file line to handle. */
    private String line = "";

    public DateHandlerProcessor(String lines) {
        this.line = lines;
    }

    @Override
    public void run() {
        try {
            System.out.println("Thread[" + Thread.currentThread().getName() + "] get line " + line);
        } catch (Exception e) {
            System.out.println("Thread[" + Thread.currentThread().getName() + "] interrupt: " + e.getMessage());
        }
    }
}
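
The listing above throttles the reader by checking the queue size and sleeping for five seconds when the queue is more than 75% full. As a point of comparison, here is a minimal sketch of a different way to get the same back-pressure, not the author's method: a bounded queue combined with the standard ThreadPoolExecutor.CallerRunsPolicy, so that when the queue is full the reading thread handles the line itself instead of sleeping. The class name BackpressureLineProcessor, the method processLines, and its parameters are hypothetical names chosen for this illustration, and the file is read on the calling thread rather than inside the pool.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical illustration class; not part of the original article.
public class BackpressureLineProcessor {

    /**
     * Reads every line from source, passes each one to handler on a pool
     * thread (or on the caller when the queue is full), and returns the
     * number of lines handled.
     */
    public static int processLines(Reader source, int workers, int queueCapacity,
                                   Consumer<String> handler)
            throws IOException, InterruptedException {
        AtomicInteger handled = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers, 5, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                // When the queue is full, run the task on the submitting thread.
                // This slows the reader down without polling or sleeping.
                new ThreadPoolExecutor.CallerRunsPolicy());
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                final String current = line; // lambdas need an effectively final copy
                pool.execute(() -> {
                    handler.accept(current);
                    handled.incrementAndGet();
                });
            }
        } finally {
            // Stop accepting new tasks; already queued tasks still run.
            pool.shutdown();
        }
        // Wait from the calling thread, which is not a pool worker.
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            pool.shutdownNow();
        }
        return handled.get();
    }

    public static void main(String[] args) throws Exception {
        // Build a small in-memory input so the sketch runs without a file.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("row ").append(i).append('\n');
        }
        int n = processLines(new StringReader(sb.toString()), 4, 16, line -> { });
        System.out.println("handled " + n + " lines");
    }
}
```

To process a real file, pass a java.io.FileReader for the path in place of the StringReader. The trade-off is that a slow line handled on the reading thread delays further reading, which is usually acceptable because that delay is exactly the throttling the sleep in the original listing is trying to achieve.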

  
