Demo Run for Summingbird (Storm + Hadoop)

Last Update: 2015-12-28 Source: Internet

Author: User

Objective

Getting the Summingbird demo to run took me through many detours. Almost no material on it is available in China, so it took a long time to get the demo working. It was a painful process. If you are interested in studying Summingbird, read on as I walk through it step by step. As a rough first approximation, Summingbird can be understood as Storm + Hadoop.

First, a quick overview of big data processing

In the big data era, large-scale processing splits into two directions: batch processing and real-time processing. Batch processing has good fault tolerance: the data sits in local or distributed storage, so a computation can be re-run over it. Its drawback is latency, because the batch cannot start until all the data has been stored. Real-time processing is fast, since results are computed as the data arrives, but its fault tolerance is poor: data flows through memory, only the useful part is kept, and the rest never reaches disk, so earlier data cannot be re-run later because it is no longer available. Neither approach alone meets increasingly diverse needs, so the two have to be combined, keeping the fault tolerance of batch processing together with the low latency of real-time processing. That brings us to the subject of this article, Summingbird, which seamlessly integrates batch and real-time computation.
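The combination described above can be sketched in plain Scala. This is only an illustration of the idea and does not use the Summingbird library: the object name, the sample sentences, and the `merge` helper are all made up for this sketch. The same counting logic is applied to data already in storage (the batch side) and to data that has just arrived (the real-time side), and the two partial results are added together per key.

```scala
// Plain-Scala sketch of the hybrid idea (no Summingbird dependency).
// All names and sample data here are hypothetical.
object HybridWordCountSketch {
  type Counts = Map[String, Long]

  // Shared job logic: split sentences into words and count each word.
  def countWords(sentences: Seq[String]): Counts =
    sentences
      .flatMap(_.toLowerCase.split("\\s+").filter(_.nonEmpty))
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size.toLong }

  // Merge two partial results by adding the counts for each key.
  // Addition is associative, so partial results can be combined in any grouping.
  def merge(left: Counts, right: Counts): Counts =
    right.foldLeft(left) { case (acc, (word, n)) =>
      acc.updated(word, acc.getOrElse(word, 0L) + n)
    }

  def main(args: Array[String]): Unit = {
    val archived = Seq("storm and hadoop", "hadoop is batch") // already stored
    val live     = Seq("storm is realtime")                   // just arrived

    val batchView    = countWords(archived) // slow but repeatable
    val realtimeView = countWords(live)     // fast but not replayable
    val combined     = merge(batchView, realtimeView)

    combined.toSeq.sortBy(_._1).foreach { case (word, n) => println(s"$word: $n") }
  }
}
```

The point of the sketch is that the job logic is written once, and the per-key addition is what lets a batch result and a real-time result be joined into one answer.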

Second, the environment needed to learn Summingbird

My machine runs Linux. To run Summingbird, the following need to be set up on it:

1. ZooKeeper

2. Kafka

3. Memcached
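Before running the demo it can help to confirm that these three services are actually listening. The following is a minimal sketch, assuming each service runs on localhost with its default port (2181 for ZooKeeper, 9092 for Kafka, 11211 for Memcached); the object and method names are made up for this example, and the hosts and ports should be adjusted to match your own setup.

```scala
import java.net.{InetSocketAddress, Socket}

// Hypothetical helper: checks that the services the demo needs accept TCP connections.
object EnvCheck {
  // Default ports; change these if your installation differs.
  val services: Seq[(String, Int)] = Seq(
    "ZooKeeper" -> 2181,
    "Kafka"     -> 9092,
    "Memcached" -> 11211
  )

  // Returns true if a TCP connection to host:port succeeds within timeoutMs.
  def isReachable(host: String, port: Int, timeoutMs: Int = 1000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      case _: java.io.IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit =
    services.foreach { case (name, port) =>
      val status = if (isReachable("localhost", port)) "reachable" else "NOT reachable"
      println(s"$name on localhost:$port is $status")
    }
}
```

A successful connection only shows that something is listening on the port; it does not verify that the service is configured correctly for the demo.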

Second, the skills needed to learn Summingbird

1. Some understanding of SBT

2. Familiarity with the Scala language

3. A reasonable familiarity with how Storm and Hadoop work

Third, exploring the demo run

Interested readers can search for Summingbird on GitHub to get a general picture of it. You can also follow the official GitHub tutorial to run the demo; if that works for you, nothing more is needed. From inside China, however, it does not work: because of the GFW, the program cannot reach the Twitter stream that the official tutorial relies on. I tried it at the beginning anyway and it failed. I kept searching on Google and asking the project's initiators on Twitter, tried again, and failed again. Then I found an example on GitHub that combines Storm and Hadoop, happily followed it step by step, and that failed too. The error showed that some jar packages were unavailable: again because of the GFW, they could not be fetched from the Twitter Maven repository. Since I had not studied Maven or SBT before, I started learning both, not in great depth, just enough to master basic usage and to read SBT and Maven build files. After opening the project's SBT file, I found that the repositories it depended on were blocked, so I switched them to the Maven repository at OSChina.
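Switching the repository in an SBT build generally comes down to adding or replacing entries in `resolvers`. The fragment below is a sketch of that kind of change, not the exact edit made in this project: the resolver name and the URL are placeholders, and you would substitute the address of a mirror that is reachable from your network.

```scala
// build.sbt fragment (sketch). The name and URL are placeholders, not a real mirror.
resolvers ++= Seq(
  "reachable-mirror" at "https://maven.example.org/repository/public/"
)
```

After changing the resolvers, running `sbt update` shows whether the dependencies now resolve.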

Fourth, a successful run at last

The project code is hosted on GitHub. Follow the steps there and you should get the correct results. Suggestions are very welcome. My next step is to import data from a local database for processing.

Fifth, lessons learned

Learning big data involves a very broad range of knowledge and there is a lot to master, so it takes solid, steady study. I also have to say that China's firewall, the GFW, lives up to its bad reputation: while it protects the network, it causes developers unnecessary trouble. In the end, though, the demo ran successfully.

The GitHub path is as follows: https://github.com/leesf/summingbird-hybrid-example-china

Readers are welcome to fork the repository and star it.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.