Apache Top-Level Project Introduction 2: Kafka

Source: Internet
Author: User

This series introducing Apache top-level projects starts with Kafka. Why? It is popular, and the name is cool.

Kafka's official website is fairly simple. Visit it and you are told directly: "Kafka is a high-throughput distributed messaging system." Kafka originated at LinkedIn, where it served as the foundation for managing the pipeline of activity streams (page views, user behavior analysis, search) and operational data processing.

Because it is distributed and offers high throughput, it is widely used alongside Cloudera, Hadoop, Storm, Spark, and others.

First, as a messaging system, Kafka provides the basic functions: decoupling, ordering, asynchrony, and so on. Beyond that, its design aims at higher throughput: it provides persistence with O(1) time complexity even at data volumes beyond the TB/PB level, supports both offline and real-time processing (that is, integration with Hadoop and Storm), and scales out horizontally.

Architecture diagram:

[Figure: Kafka architecture diagram]

As the architecture shows, Kafka has a distributed design (of course, in the DT (data technology) era, a system that cannot scale out horizontally does not survive). On the front end, producers concurrently push messages (batching is supported) to a specific topic on the Kafka cluster's brokers. Each topic contains multiple partitions to facilitate horizontal scaling, and consumers, organized into consumer groups, pull messages from the brokers. Kafka uses ZooKeeper (ZK) to manage cluster configuration, elect leaders, and rebalance. The messaging pattern is push/pull: producers push, consumers pull.
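
To make the relationship between topics, partitions, and consumer groups concrete, here is a minimal Java sketch. It is not Kafka's actual partitioner or rebalance protocol: the class name `PartitionSketch`, the hash-modulo rule, and the round-robin assignment are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: Kafka's real partitioner and group rebalancing are more involved.
class PartitionSketch {
    // Map a message key to one of numPartitions partitions (assumed hash-modulo rule).
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Spread partitions 0..numPartitions-1 over the consumers of one group, round-robin.
    // Within a group, each partition ends up with exactly one consumer.
    static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        List<List<Integer>> assignment = new ArrayList<>();
        for (int c = 0; c < numConsumers; c++) {
            assignment.add(new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            assignment.get(p % numConsumers).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // The same key always lands in the same partition, which preserves per-key order.
        System.out.println(partitionFor("user-42", 4));
        // Four partitions shared by two consumers: prints [[0, 2], [1, 3]]
        System.out.println(assign(4, 2));
    }
}
```

Adding partitions raises the number of consumers in a group that can work in parallel, which is the sense in which partitions facilitate horizontal scaling.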

Next, we build a Kafka cluster service:

[Figure: Kafka cluster setup]

Send and consume messages via ZooKeeper (ZK):

[Figure: sending and consuming messages]

Use Java to produce and consume messages:

[Figure: Java producer and consumer code]

This is fairly straightforward. Note that messages can be sent in batches. Not all message middleware supports this, and batch sending is one of the reasons for Kafka's high throughput.
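
The effect of batching can be sketched without a broker. The Java class below is a hypothetical stand-in, not the Kafka producer API: `BatchingSender` buffers messages and hands them to a `transport` callback, which here represents one network request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of batching; not the Kafka client API.
class BatchingSender {
    private final int batchSize;
    private final Consumer<List<String>> transport; // stands in for one network request
    private final List<String> buffer = new ArrayList<>();

    BatchingSender(int batchSize, Consumer<List<String>> transport) {
        this.batchSize = batchSize;
        this.transport = transport;
    }

    // Buffer the message and ship the batch once it is full.
    void send(String message) {
        buffer.add(message);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Ship whatever is buffered, if anything, as a single request.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        transport.accept(new ArrayList<>(buffer));
        buffer.clear();
    }

    public static void main(String[] args) {
        List<List<String>> requests = new ArrayList<>();
        BatchingSender sender = new BatchingSender(3, requests::add);
        for (int i = 0; i < 7; i++) {
            sender.send("msg-" + i);
        }
        sender.flush(); // ship the final partial batch
        // Seven messages with a batch size of three need only three requests.
        System.out.println(requests.size() + " requests for 7 messages");
    }
}
```

Fewer, larger requests mean less per-message overhead, at the cost of a little extra latency while a batch fills.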

Here a stream is used to consume the payload. The message stream iterator never terminates; it behaves like a listener waiting for messages.

What makes Kafka efficient, and where it innovates:

1. Message deletion management. Typical message middleware deletes a message once it has been consumed, which makes consuming messages expensive. Kafka instead manages messages statelessly: it introduces message offsets and applies a time-based SLA retention policy, so messages are deleted after a set period. As the official website puts it, consuming Kafka messages is very lightweight: come and go. It sounds like takeout: grab it and go. Moreover, thanks to offsets, consumers are free to fetch messages from any position, including re-reading messages that have already been consumed.


2. Kafka uses the Linux sendfile system call to transfer file data directly within the kernel, avoiding extra copies through user space.


3. Kafka introduces ZK to manage distributed coordination, HA, and fault tolerance. ZK is used to manage Kafka brokers: when a broker is added or fails, the ZK service notifies producers and consumers.

4. Producer performance: optimized message structure and size, plus batch delivery.

5. Consumer performance: the optimized message structure and the stateless design make consumption inexpensive, with no need for a B+ tree index.
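
Point 1 above (offsets plus time-based retention) can be illustrated with a small in-memory log. This is a sketch under stated assumptions, not Kafka's storage code: the class `OffsetLog`, its method names, and the millisecond timestamps passed in by the caller are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory log: reads never delete, retention does.
class OffsetLog {
    private static final class Entry {
        final long timestampMs;
        final String value;

        Entry(long timestampMs, String value) {
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private long baseOffset = 0; // offset of the oldest entry still retained

    // Append a message and return the offset it was assigned.
    long append(String value, long nowMs) {
        entries.add(new Entry(nowMs, value));
        return baseOffset + entries.size() - 1;
    }

    // Read up to max messages starting at offset. The log keeps no per-consumer
    // state, so any retained offset can be read again at any time.
    List<String> read(long offset, int max) {
        List<String> out = new ArrayList<>();
        int start = (int) Math.max(0, offset - baseOffset);
        for (int i = start; i < entries.size() && out.size() < max; i++) {
            out.add(entries.get(i).value);
        }
        return out;
    }

    // Drop entries older than retentionMs, whether or not anyone consumed them.
    void enforceRetention(long nowMs, long retentionMs) {
        while (!entries.isEmpty() && nowMs - entries.get(0).timestampMs > retentionMs) {
            entries.remove(0);
            baseOffset++;
        }
    }

    public static void main(String[] args) {
        OffsetLog log = new OffsetLog();
        log.append("a", 0);
        log.append("b", 10);
        log.append("c", 20);
        System.out.println(log.read(1, 10)); // start from offset 1
        System.out.println(log.read(0, 10)); // replay from the beginning
        log.enforceRetention(25, 20);        // "a" is now older than 20 ms
        System.out.println(log.read(0, 10)); // only retained messages remain
    }
}
```

Because the consumer only tracks a number, the broker side does no bookkeeping per message, which is what makes consumption lightweight.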
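
Point 2 above mentions sendfile. In Java, the standard `FileChannel.transferTo` method lets the operating system move file bytes to another channel without copying them through an application buffer. The sketch below is only an illustration: the class name `ZeroCopySketch` is made up, and the target is a second file rather than a network socket purely to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative sketch of a zero-copy style transfer; not Kafka's own code.
class ZeroCopySketch {
    // Send the whole file to target without reading it into a user-space buffer.
    static long transfer(Path source, WritableByteChannel target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                // transferTo may move fewer bytes than requested, so loop until done.
                position += in.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("segment", ".log");
        Path dst = Files.createTempFile("copy", ".log");
        Files.write(src, "hello kafka".getBytes(StandardCharsets.UTF_8));
        // A broker would pass a socket channel here; a file channel stands in for it.
        try (FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            System.out.println(transfer(src, out) + " bytes transferred");
        }
        Files.delete(src);
        Files.delete(dst);
    }
}
```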

Overall, Kafka's performance is outstanding. It is often used as a replacement for traditional message middleware, and it matters most when it feeds Hadoop or stream-processing systems. It is also an excellent choice for website log processing, user behavior analysis, and offline log processing.

Well, that is all for now. I got up early to write this, and sure enough, time was tight and the task heavy. I hope you will forgive any shortcomings. Some of the pictures are borrowed from the web.

WeChat official account: Technical Geek (techbooster)


This article is from the "Erixhao" blog. Please keep this source attribution: http://erixhao.blog.51cto.com/10238307/1784007
