When Storm processes streaming data in real time, a common requirement is to process tuples in batches of a certain size rather than handling each tuple immediately as it arrives. This may be done for performance reasons, or because the business logic demands it.
For example, when querying or updating a database, generating and executing one SQL statement per tuple becomes very inefficient at high data volumes and hurts system throughput; batching the operations is much faster.
Of course, if you also want to use Storm's reliable processing mechanism, you should cache references to these tuples in an in-memory container until the whole batch has been processed, and only ack them afterwards.
The following is a simple example.
Suppose we already have a DbManager database access class with at least two methods:
(1) getConnection(): returns a java.sql.Connection object;
(2) getSql(tuple): generates a database operation statement from a tuple.
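The article does not show DbManager itself, so here is a minimal sketch of what such a helper might look like. The JDBC URL and the (table, columns, values) signature of getSql are assumptions for illustration; in a real bolt, getSql would take a Storm Tuple and extract the field values from it (e.g. tuple.getString(0)).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical sketch of the DbManager helper described above.
public class DbManager {

    // Assumed JDBC URL; a real deployment would read this from configuration.
    private static final String JDBC_URL =
            "jdbc:mysql://localhost:3306/test?user=storm&password=storm";

    // Returns a java.sql.Connection; in production this would come from a pool.
    public static Connection getConnection() {
        try {
            return DriverManager.getConnection(JDBC_URL);
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    // Builds an INSERT statement from a table name, column names and values.
    // The real getSql(tuple) would pull these values out of the Tuple instead.
    public static String getSql(String table, String[] columns, String[] values) {
        StringBuilder sb = new StringBuilder("INSERT INTO ").append(table).append(" (");
        sb.append(String.join(",", columns)).append(") VALUES (");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) sb.append(",");
            sb.append("'").append(values[i].replace("'", "''")).append("'"); // naive escaping
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println(getSql("users",
                new String[]{"id", "name"},
                new String[]{"1", "alice"}));
    }
}
```

Note that building SQL by string concatenation is only acceptable for trusted data; a PreparedStatement with addBatch() would be the safer choice.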
To cache a certain number of tuples in the bolt, an int n is passed to the bolt's constructor and assigned to the count member variable, specifying that every n tuples are processed as one batch.
To hold the tuples in memory, a ConcurrentLinkedQueue from java.util.concurrent is used to store them; whenever count tuples have accumulated, a batch is triggered.
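The queue-and-drain pattern can be illustrated on its own, with plain Strings standing in for Storm tuples. ConcurrentLinkedQueue is lock-free, so add() in execute() never blocks, and poll() returns null once the queue is empty, which makes draining straightforward:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Minimal illustration of the queue-and-drain pattern used by the bolt.
public class QueueDrainDemo {

    // Drains everything currently in the queue and returns the batch size.
    public static int drain(Queue<String> queue) {
        int n = 0;
        while (queue.poll() != null) { // poll() returns null when the queue is empty
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Queue<String> q = new ConcurrentLinkedQueue<String>();
        q.add("t1");
        q.add("t2");
        q.add("t3");
        System.out.println(drain(q));    // prints 3
        System.out.println(q.isEmpty()); // prints true
    }
}
```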
In addition, because the incoming data rate may be low (the queue might take a long time to fill up, or count may simply be set too large), a timer check is added to the bolt so that the queued tuples are flushed at least once per second.
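The two flush triggers can be expressed as a single pure predicate, which makes the logic easy to test in isolation. The names and the millisecond interval parameter below are illustrative:

```java
// A batch is flushed either when `count` tuples have accumulated or when
// more than `intervalMs` milliseconds have passed since the last flush.
public class FlushPolicy {

    public static boolean shouldFlush(int queueSize, int count,
                                      long nowMs, long lastFlushMs, long intervalMs) {
        return queueSize >= count || nowMs >= lastFlushMs + intervalMs;
    }

    public static void main(String[] args) {
        System.out.println(shouldFlush(100, 100, 0L, 0L, 1000L));  // size trigger: true
        System.out.println(shouldFlush(5, 100, 1500L, 0L, 1000L)); // time trigger: true
        System.out.println(shouldFlush(5, 100, 500L, 0L, 1000L));  // neither: false
    }
}
```

One caveat of checking the clock inside execute() is that the time trigger only fires when a new tuple arrives; if the stream stops entirely, the last partial batch stays queued until the next tuple shows up.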
The complete bolt code follows (for reference only):
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.IRichBolt;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Tuple;

public class BatchingBolt implements IRichBolt {

    private static final long serialVersionUID = 1L;
    private OutputCollector collector;
    private Queue<Tuple> tupleQueue = new ConcurrentLinkedQueue<Tuple>();
    private int count;
    private long lastTime;
    private Connection conn;

    public BatchingBolt(int n) {
        count = n; // number of tuples processed per batch
    }

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        conn = DbManager.getConnection();      // get a database connection through DbManager
        lastTime = System.currentTimeMillis(); // timestamp of the last batch
    }

    @Override
    public void execute(Tuple tuple) {
        tupleQueue.add(tuple);
        long currentTime = System.currentTimeMillis();
        // Flush every count tuples, or at least once per second
        if (tupleQueue.size() >= count || currentTime >= lastTime + 1000) {
            try {
                Statement stmt = conn.createStatement();
                conn.setAutoCommit(false);
                List<Tuple> batch = new ArrayList<Tuple>();
                Tuple tup;
                while ((tup = tupleQueue.poll()) != null) {
                    stmt.addBatch(DbManager.getSql(tup)); // generate and batch the SQL statement
                    batch.add(tup);
                }
                stmt.executeBatch(); // submit the batched SQL statements
                conn.commit();
                conn.setAutoCommit(true);
                stmt.close();
                for (Tuple t : batch) {
                    collector.ack(t); // ack only after the batch has been committed
                }
                System.out.println("Batch insert data into database, total records: " + batch.size());
                lastTime = currentTime;
            } catch (SQLException e) {
                collector.reportError(e);
            }
        }
    }

    @Override
    public void cleanup() {}

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {}

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}

Two details are worth noting. The database connection is obtained in prepare() rather than in the constructor, because the constructor runs on the machine that submits the topology and a Connection is not serializable. And the queue is drained with poll() until it returns null, rather than looping exactly count times, since a time-triggered flush may fire with fewer than count tuples queued; tuples are acked only after the batch commits, so Storm's reliability mechanism can replay them if the commit fails.