Real-time data analysis frameworks (stream processing systems)

Source: Internet
Author: User

My recent work involves designing a system that monitors system status in real time, such as the execution of Hadoop tasks and the health of servers. The system needs to process the information generated by the monitored objects as it arrives and deliver it to users.
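
The monitoring flow described above can be sketched roughly as follows. This is only a toy illustration in plain Python, not tied to any of the frameworks listed later: the `StatusEvent` type, the `alerts` function, the metric names, and the moving-average rule are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class StatusEvent:
    source: str    # hypothetical: a server name or a Hadoop job id
    metric: str    # hypothetical: e.g. "cpu"
    value: float


def alerts(events, threshold, window=3):
    """Yield a message when a source's moving average exceeds the threshold.

    Events are consumed one at a time, so results are produced as data
    arrives instead of after a batch has finished.
    """
    recent = {}  # (source, metric) -> last `window` values
    for event in events:
        key = (event.source, event.metric)
        values = recent.setdefault(key, deque(maxlen=window))
        values.append(event.value)
        average = sum(values) / len(values)
        # Only alert once a full window of readings is available.
        if len(values) == window and average > threshold:
            yield (f"{event.source}: {event.metric} averaged "
                   f"{average:.1f} over last {window} readings")


if __name__ == "__main__":
    stream = [StatusEvent("node-1", "cpu", v) for v in (50, 95, 99, 97)]
    for message in alerts(stream, threshold=90):
        print(message)
```

In a real deployment the `stream` list would be replaced by a consumer reading from one of the systems below, and the `print` call by whatever channel notifies the user.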

Such a system clearly needs the following properties:

  1. Reliability
  2. Big Data Processing
  3. Real-time

Naturally, this is a Hadoop-based project.

Kafka: Kafka is a messaging system that was originally developed at LinkedIn to serve as the foundation for LinkedIn's activity stream processing pipeline.

Nice talk

S4: S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data.

Hedwig: Hedwig is a publish-subscribe system designed to carry large amounts of data across the Internet in a guaranteed-delivery fashion from those who produce it (publishers) to those who are interested in it (subscribers).
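
To make the publish-subscribe and guaranteed-delivery idea concrete, here is a minimal in-memory sketch. It is not Hedwig's API: the `ToyBroker` class and its methods are hypothetical, and it assumes an at-least-once scheme in which a message stays pending for each subscriber until that subscriber acknowledges it.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory publish-subscribe hub that keeps messages until acknowledged."""

    def __init__(self):
        self.subscribers = defaultdict(set)   # topic -> subscriber names
        self.pending = defaultdict(dict)      # subscriber -> {msg_id: (topic, payload)}
        self.next_id = 0

    def subscribe(self, topic, subscriber):
        self.subscribers[topic].add(subscriber)

    def publish(self, topic, payload):
        """Queue the message for every current subscriber of the topic."""
        msg_id = self.next_id
        self.next_id += 1
        for subscriber in self.subscribers[topic]:
            self.pending[subscriber][msg_id] = (topic, payload)
        return msg_id

    def poll(self, subscriber):
        """Return all unacknowledged messages, oldest first.

        Anything not acknowledged is returned again on the next poll.
        """
        return sorted(self.pending[subscriber].items())

    def ack(self, subscriber, msg_id):
        """Mark a message as delivered so it is not sent again."""
        self.pending[subscriber].pop(msg_id, None)


if __name__ == "__main__":
    broker = ToyBroker()
    broker.subscribe("jobs", "dashboard")
    msg_id = broker.publish("jobs", "task failed")
    print(broker.poll("dashboard"))   # message is delivered
    print(broker.poll("dashboard"))   # and redelivered, since it was not acked
    broker.ack("dashboard", msg_id)
    print(broker.poll("dashboard"))   # nothing left after the ack
```

A real system would also persist pending messages so that they survive a broker crash; this sketch keeps everything in memory to stay short.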

Storm: Storm is a distributed, reliable, and fault-tolerant stream processing system. Its use cases are so broad that we consider it to be a fundamental new primitive for data processing.
Introduction slide

Flume: Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS.

Scribe: Scribe is a server for aggregating streaming log data. It is designed to scale to a very large number of nodes and be robust to network and node failures.

I will keep updating this list as the project progresses.
