Storm [TopN-Sort]-RollingCountBolt

Source: Internet
Author: User
Tags: emit


Background:

1: You need a basic understanding of sliding windows.

2: You need to know how the chunks (buckets) of a sliding window are computed as the window slides. In particular, every time a chunk starts being used, it must first be cleared once.
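The chunk-clearing idea can be sketched without Storm. This is a minimal illustration, not the bolt itself: the names SlidingWindowSketch, add, wipe and total are made up for this example, and bucket indices are passed in explicitly instead of being derived from the clock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only: one long[] per key, one slot per bucket.
class SlidingWindowSketch {
    private final int numBuckets;
    private final Map<String, long[]> counts = new HashMap<>();

    SlidingWindowSketch(int numBuckets) {
        this.numBuckets = numBuckets;
    }

    // Record one occurrence of key in the given bucket.
    void add(String key, int bucket) {
        counts.computeIfAbsent(key, k -> new long[numBuckets])[bucket]++;
    }

    // Clear one bucket for every key, then drop keys whose window total is 0.
    void wipe(int bucket) {
        for (long[] buckets : counts.values()) {
            buckets[bucket] = 0;
        }
        counts.values().removeIf(buckets -> sum(buckets) == 0);
    }

    // Window total for a key: the sum over all of its buckets (0 if unknown).
    long total(String key) {
        long[] buckets = counts.get(key);
        return buckets == null ? 0 : sum(buckets);
    }

    private static long sum(long[] buckets) {
        long total = 0;
        for (long value : buckets) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        SlidingWindowSketch window = new SlidingWindowSketch(3);
        window.add("item-1", 0);
        window.add("item-1", 0);
        window.add("item-1", 1);
        System.out.println(window.total("item-1")); // 3
        window.wipe(0);                             // bucket 0 is about to be reused
        System.out.println(window.total("item-1")); // 1
    }
}
```

Clearing a bucket before it is reused is what makes the total "slide": counts older than the window disappear one bucket at a time.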

package com.cc.storm.bolt;

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.IRichBolt;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;
import backtype.storm.utils.Utils;

/**
 * Implements a sliding window. Note that as the window slides, the bucket
 * following the current one is cleared.
 *
 * @author Yin Shuai
 */
public class RollingCountBolt implements IRichBolt {

    private static final long serialVersionUID = 1765213339554264320L;

    // Per-object counts, one slot per bucket. Guarded by synchronized (_objectCounts).
    private HashMap<Object, long[]> _objectCounts = new HashMap<Object, long[]>();
    private int _numBuckets;
    private transient Thread cleaner;
    private OutputCollector _collector;

    /**
     * _trackMinutes is the size of the entire sliding window and determines
     * the time interval being measured. For example, with a 15-minute window,
     * the real-time ranking of product clicks reflects the page views of the
     * last 15 minutes.
     *
     * The length of a single bucket is _trackMinutes / _numBuckets. Example:
     * with a 30-minute window and 10 buckets, each bucket covers 3 minutes.
     *
     * [], [], [], [], [], [], ...: the counts in the buckets change constantly.
     */
    private int _trackMinutes;

    public RollingCountBolt(int numBuckets, int trackMinutes) {
        this._numBuckets = numBuckets;
        this._trackMinutes = trackMinutes;
    }

    /** Sums all buckets of obj, i.e. its count over the whole window. */
    public long totalObjects(Object obj) {
        long[] curr = _objectCounts.get(obj);
        long total = 0;
        for (long l : curr) {
            total += l;
        }
        return total;
    }

    public int currentBucket(int buckets) {
        return currentSecond() / secondsPerBucket(buckets) % buckets;
    }

    public int currentSecond() {
        return (int) (System.currentTimeMillis() / 1000);
    }

    /**
     * @param buckets the number of buckets you configured
     * @return the number of seconds covered by one bucket,
     *         i.e. _trackMinutes * 60 / buckets
     */
    public int secondsPerBucket(int buckets) {
        return _trackMinutes * 60 / buckets;
    }

    public long millisPerBucket(int buckets) {
        return (long) 1000 * secondsPerBucket(buckets);
    }

    @SuppressWarnings("rawtypes")
    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        _collector = collector;
        cleaner = new Thread(new Runnable() {
            @Override
            public void run() {
                int lastBucket = currentBucket(_numBuckets);
                while (true) {
                    int currBucket = currentBucket(_numBuckets);
                    p("thread while loop: the current bucket is: " + currBucket);
                    if (currBucket != lastBucket) {
                        p("thread while loop: the previous bucket was: " + lastBucket);
                        // Wipe the bucket that will be used next.
                        int bucketToWipe = (currBucket + 1) % _numBuckets;
                        p("thread while loop: the bucket to be wiped is: " + bucketToWipe);
                        synchronized (_objectCounts) {
                            // Iterate over a copy so entries can be removed inside the loop.
                            Set<Object> objs = new HashSet<Object>(_objectCounts.keySet());
                            for (Object obj : objs) {
                                long[] counts = _objectCounts.get(obj);
                                long currBucketVal = counts[bucketToWipe];
                                p("thread while loop: wiped value: " + currBucketVal);
                                counts[bucketToWipe] = 0;
                                long total = totalObjects(obj);
                                if (currBucketVal != 0) {
                                    // The total changed, so emit the updated count.
                                    p("thread while loop: wiped value is not 0, emitting obj:total " + obj + ":" + total);
                                    _collector.emit(new Values(obj, total));
                                }
                                if (total == 0) {
                                    p("thread while loop: total is 0, removing the object");
                                    _objectCounts.remove(obj);
                                }
                            }
                        }
                        lastBucket = currBucket;
                    }
                    // Sleep until the next bucket boundary.
                    long delta = millisPerBucket(_numBuckets)
                            - (System.currentTimeMillis() % millisPerBucket(_numBuckets));
                    Utils.sleep(delta);
                    p("\n");
                }
            }
        });
        cleaner.start();
    }

    @Override
    public void execute(Tuple input) {
        Object obj1 = input.getValue(0); // first field, unused here
        Object obj = input.getValue(1);  // the merchandise ID being counted
        int currentBucket = currentBucket(_numBuckets);
        p("execute method: current bucket: " + currentBucket);
        synchronized (_objectCounts) {
            long[] curr = _objectCounts.get(obj);
            if (curr == null) {
                curr = new long[_numBuckets];
                _objectCounts.put(obj, curr);
            }
            curr[currentBucket]++;
            System.err.print("execute method: received merchandise ID: " + obj.toString() + ", long array: ");
            for (long number : curr) {
                System.err.print(number + ":");
            }
            p("execute method: emitted data: " + obj + ":" + totalObjects(obj));
            /*
             * Note: we continuously emit each product ID together with its count
             * over the current sliding window, i.e. the metric for our time period.
             * Sorting downstream is keyed on the product ID, so what reaches the
             * subsequent sort bolts is per-ID information covering the time range.
             */
            // Every incoming tuple triggers one emit.
            _collector.emit(new Values(obj, totalObjects(obj)));
            _collector.ack(input);
        }
        p("\n");
    }

    @Override
    public void cleanup() {
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("merchandiseid", "Count"));
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }

    public void p(Object o) {
        System.err.println(o.toString());
    }
}
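The bucket arithmetic used by the bolt is easier to check with fixed inputs. The sketch below restates currentBucket, secondsPerBucket, the bucket-to-wipe rule and the sleep interval as static functions. The class name BucketMath and the explicit time parameters are assumptions made for this illustration (the bolt reads the system clock and stores the second count in an int).

```java
// Hypothetical stand-alone restatement of the bolt's time arithmetic.
class BucketMath {

    // Seconds covered by one bucket (integer division, as in the bolt).
    static int secondsPerBucket(int trackMinutes, int buckets) {
        return trackMinutes * 60 / buckets;
    }

    // Index of the bucket that a given epoch second falls into.
    static int currentBucket(long epochSecond, int trackMinutes, int buckets) {
        return (int) (epochSecond / secondsPerBucket(trackMinutes, buckets) % buckets);
    }

    // The cleaner wipes the bucket after the current one, wrapping around.
    static int bucketToWipe(int currBucket, int buckets) {
        return (currBucket + 1) % buckets;
    }

    // Milliseconds left until the next bucket boundary.
    static long millisUntilNextBucket(long nowMillis, int trackMinutes, int buckets) {
        long perBucket = 1000L * secondsPerBucket(trackMinutes, buckets);
        return perBucket - (nowMillis % perBucket);
    }

    public static void main(String[] args) {
        // 30-minute window, 10 buckets: each bucket covers 180 seconds.
        System.out.println(secondsPerBucket(30, 10));                  // 180
        System.out.println(currentBucket(1000L, 30, 10));              // 5
        System.out.println(bucketToWipe(9, 10));                       // 0
        System.out.println(millisUntilNextBucket(1_000_000L, 30, 10)); // 80000
    }
}
```

Because secondsPerBucket uses integer division, a window whose length in seconds is not a multiple of the bucket count is effectively shortened, and a bucket count larger than the number of seconds in the window makes the divisor 0.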

Note that each time the window slides, one bucket of data drops out of it. Every emit carries the count accumulated over the data currently inside the window.
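A downstream consumer of these (merchandiseid, Count) pairs only needs the latest count per ID to rank the top N. The following is a rough sketch under that assumption, not the sort bolt from this series: the class TopNSketch and its methods are hypothetical, and it is plain Java with no Storm API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical ranking helper: keeps the most recent window count per ID.
class TopNSketch {
    private final Map<String, Long> latest = new HashMap<>();

    // Store the newest count for an ID; a count of 0 means it left the window.
    void update(String id, long count) {
        if (count == 0) {
            latest.remove(id);
        } else {
            latest.put(id, count);
        }
    }

    // IDs ordered by count descending, ties broken by ID, limited to n entries.
    List<String> topN(int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(latest.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        TopNSketch ranking = new TopNSketch();
        ranking.update("a", 3);
        ranking.update("b", 5);
        ranking.update("c", 1);
        ranking.update("a", 7);                // newer count replaces the old one
        System.out.println(ranking.topN(2));   // [a, b]
    }
}
```

Overwriting by key is what lets the ranking follow the window: each new emit for an ID replaces its previous count, including the zero that the cleaner thread emits when an ID's last non-empty bucket is wiped.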

