Netflix open-sources its data flow manager Suro

Source: Internet
Author: User

Netflix recently released an open-source tool called Suro that collects event data from multiple application servers and delivers it in real time to target data platforms such as Hadoop and Elasticsearch. Netflix's innovation is expected to become a mainstream big data technology.

Netflix uses Suro to route data in real time from its sources to target hosts. Suro not only plays a key role in Netflix's data pipeline, but is also a leader among the many open-source data analysis tools that have emerged from large internet companies.

Netflix's applications generate tens of billions of events a day. Suro collects them before they are delivered, then sends part of the data through Amazon S3 to Hadoop for batch processing and the rest through Apache Kafka to Druid and Elasticsearch for real-time analysis. According to the Netflix blog, the company is also considering how to have Suro support a real-time processing engine such as Storm or Samza to perform machine learning on event data.
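The split between the batch path and the real-time path can be pictured with a small sketch. This is only an illustration of key-based routing under assumed names: the `ROUTES` table, the `routing_key` field, and the sink labels are hypothetical and are not taken from Suro's actual API or configuration format.

```python
from collections import defaultdict

# Hypothetical routing table: routing key -> list of sink names.
# Illustrative only; this is not Suro's configuration format.
ROUTES = {
    "batch": ["s3"],        # picked up later by Hadoop batch jobs
    "realtime": ["kafka"],  # consumed by Druid / Elasticsearch
}
DEFAULT_SINKS = ["s3"]      # assumed fallback for unknown routing keys


def route(events, routes=ROUTES, default_sinks=DEFAULT_SINKS):
    """Group events by destination sink based on each event's routing key."""
    by_sink = defaultdict(list)
    for event in events:
        for sink in routes.get(event.get("routing_key"), default_sinks):
            by_sink[sink].append(event)
    return dict(by_sink)


events = [
    {"routing_key": "batch", "payload": "playback_started"},
    {"routing_key": "realtime", "payload": "request_error"},
    {"routing_key": "unknown", "payload": "misc"},
]
delivered = route(events)
```

In this sketch an event with an unrecognized key falls back to the batch sink, so nothing is silently dropped; a real pipeline might choose a different default.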

People familiar with the big data field know that many technologies are tied to the companies behind them: Netflix created Suro, LinkedIn created Kafka and Samza, Twitter created Storm, and Metamarkets created Druid. The Suro blog post also acknowledges that Suro is based on the Apache Chukwa project and is similar to Apache Flume and Facebook's Scribe. Admittedly, Hadoop is the most notable of these projects.

Why companies build their own technology has long been a point of controversy. Generally they do so to meet their own needs, but as with many things in life, the answer depends on a detailed analysis of the specific case. Storm, for example, is becoming a very popular stream-processing tool, but LinkedIn felt it needed something different and so created Samza. Netflix created Suro instead of using an existing technology mainly because the company is a heavy cloud service user (mostly on AWS) but also runs some non-AWS components, including the Apache Cassandra database.

The ultimate winners of this technological innovation are the users who adopt these mainstream technologies: they benefit from open-source tools without having to recruit specialists in-house. For example, we have seen Hadoop vendors trying to bring the Storm and Spark frameworks to their enterprise customers. At the same time, we believe Hadoop will certainly not be the last such technology. AWS has a lot of users, after all, and they want the capabilities that a technology like Suro offers rather than being locked in to AWS.
