Kafka Development Practice (I): Introduction

Source: Internet
Author: User
Overview

1. Introduction

The Kafka official website describes it as follows:

Apache Kafka is publish-subscribe messaging rethought as a distributed
commit log.

Apache Kafka is a high-throughput distributed messaging system, open-sourced by LinkedIn.

Publish-subscribe is the core idea of Kafka's design and its most distinctive feature. In the Kafka ecosystem, publishing is the producer's role and subscribing is the consumer's role. As in everyday life, where manufacturers make products but consumers generally cannot buy directly from the factory and need a dealer in between, the Kafka ecosystem also has an intermediary role: the broker. The Kafka ecosystem can therefore be broadly described as follows:
"Producer -> Broker <- Consumer"

The following is an excerpt from the introduction on the Kafka official website:

Kafka is a distributed, partitioned, replicated commit log service. It
provides the functionality of a messaging system, but with a unique
design. What does all that mean?

First let's review some basic messaging terminology: Kafka maintains feeds of messages in categories called topics. We'll call processes that publish messages to a Kafka topic producers. We'll call processes that subscribe to topics and process the feed of published messages consumers. Kafka is run as a cluster comprised of one or more servers, each of which is called a broker.

So, at a high level, producers send messages over the network to the
Kafka cluster, which in turn serves them up to consumers.

Kafka provides JMS-like functionality, but its design and implementation are completely different. Kafka categorizes message flows by topic; the sender of a message is called a producer and the subscriber is called a consumer. A Kafka cluster consists of one or more servers, each of which is called a broker.
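The producer, broker, and consumer roles described above can be illustrated with a small in-memory sketch. This is only a toy model under stated assumptions: the `Broker` and `Consumer` classes, their methods, and the `page-views` topic are hypothetical names invented for illustration and are not the Kafka client API.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory stand-in for a broker (hypothetical, not the Kafka API)."""

    def __init__(self):
        # topic name -> list of messages, kept in arrival order
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        """Called by producers: append a message to the topic's feed."""
        self.topics[topic].append(message)

    def fetch(self, topic, position):
        """Called by consumers: return every message at or after `position`."""
        return self.topics[topic][position:]


class Consumer:
    """Subscribes to one topic and remembers how far it has read."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.position = 0

    def poll(self):
        messages = self.broker.fetch(self.topic, self.position)
        self.position += len(messages)
        return messages


broker = Broker()

# Producer side: publish two messages to a topic.
broker.publish("page-views", "user1 viewed /home")
broker.publish("page-views", "user2 viewed /cart")

# Consumer side: pull whatever has been published so far.
consumer = Consumer(broker, "page-views")
first_batch = consumer.poll()
second_batch = consumer.poll()  # nothing new has been published
```

Note that the producer and the consumer never talk to each other directly; both only interact with the broker, which is the point of the "Producer -> Broker <- Consumer" picture.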

2. Topics and Logs

A topic is a category or feed name to which messages are published. For each topic, the Kafka cluster maintains a partitioned log.

Each partition is an ordered, immutable sequence of messages that is continually appended to: a commit log. The messages in the partitions are each assigned a sequential ID number called the offset that uniquely identifies each message within the partition.
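To make the idea of offsets concrete, here is a minimal sketch of a partitioned log. It assumes a simplified in-memory model; the `Partition` and `Topic` classes and the `orders` topic are hypothetical names, not part of Kafka itself. The point it illustrates is that an offset is just a message's position in its own partition, so it is unique within a partition but not across the whole topic.

```python
class Partition:
    """Append-only sequence; a message's offset is its index in the log."""

    def __init__(self):
        self._log = []

    def append(self, message):
        """Append a message to the end of the log and return its offset."""
        self._log.append(message)
        return len(self._log) - 1

    def read(self, offset):
        """Return the message stored at the given offset."""
        return self._log[offset]


class Topic:
    """A named topic made up of a fixed number of partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]


topic = Topic("orders", num_partitions=2)

# Offsets grow sequentially within one partition...
offset_a = topic.partitions[0].append("order-1")
offset_b = topic.partitions[0].append("order-2")

# ...and start again from zero in every other partition.
offset_c = topic.partitions[1].append("order-3")
```

Here `offset_a` and `offset_c` are both 0, which is why a message is identified by the pair (partition, offset) rather than by the offset alone.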

3. Distribution

The partitions of the log are distributed over the servers in the Kafka cluster, with each server handling data and requests for a share of the partitions. Each partition is replicated across a configurable number of servers for fault tolerance.
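As a rough illustration of how partitions and their replicas might be spread over brokers, the sketch below uses a simple round-robin placement. This is an assumption made for illustration only, not Kafka's actual assignment algorithm; the function `assign_replicas` and the broker names are hypothetical.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Place each partition's replicas on consecutive brokers, round-robin.

    Illustrative placement only; a real cluster uses its own algorithm.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")

    assignment = {}
    for partition in range(num_partitions):
        # Start at a different broker for each partition, then take the
        # next (replication_factor - 1) brokers, wrapping around the list.
        assignment[partition] = [
            brokers[(partition + replica) % len(brokers)]
            for replica in range(replication_factor)
        ]
    return assignment


assignment = assign_replicas(
    num_partitions=4,
    brokers=["broker-0", "broker-1", "broker-2"],
    replication_factor=2,
)
```

With this placement every partition lives on two different brokers, so losing any single broker still leaves one copy of each partition available.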

4. Producers

Producers publish data to the topics of their choice. The producer is responsible for choosing which message to assign to which partition within the topic. This can be done in a round-robin fashion simply to balance load, or it can be done according to some semantic partition function (say, based on some key in the message).
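The two partitioning strategies mentioned above can be sketched as follows. This is a minimal illustration, not the partitioner shipped with any Kafka client: the `Partitioner` class is hypothetical, and `zlib.crc32` is used here only because it gives a stable hash across runs; real clients use their own hash functions.

```python
import itertools
import zlib


class Partitioner:
    """Pick a partition: round-robin without a key, hash-based with one."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key=None):
        if key is None:
            # No key: spread messages evenly to balance load.
            return next(self._round_robin)
        # Keyed: the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = Partitioner(num_partitions=3)

# Messages without a key cycle through partitions 0, 1, 2, 0, ...
unkeyed = [partitioner.partition_for() for _ in range(4)]

# Messages with the same key land in the same partition every time.
keyed_first = partitioner.partition_for("user-42")
keyed_second = partitioner.partition_for("user-42")
```

Key-based partitioning matters because ordering is tracked per partition: routing all messages for one key to the same partition keeps them in order relative to each other.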

5. Consumers

Messaging traditionally has two models: queuing and publish-subscribe. In a queue, a pool of consumers may read from a server and each message goes to one of them; in publish-subscribe, the message is broadcast to all consumers. Kafka offers a single consumer abstraction that generalizes both of these: the consumer group.

Pros and cons

1. Advantages

High performance and high throughput.

Supports virtually unlimited message accumulation: messages are stored on disk, with O(1) access complexity.

Supports multiple partitions, with message order guaranteed within a single partition.

Distributed and easy to scale out.

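2. Disadvantages

Kafka does not guarantee strict message order across a whole topic; order is guaranteed only within a single partition.

Messages may be lost. Kafka versions before 0.8 had no replication mechanism, so a broker's messages became unavailable once that broker went down; later versions replicate partitions across brokers, as described under Distribution.

There is no message acknowledgement mechanism; which messages have been consumed must be tracked by the consumer itself.

Usage scenarios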

As message middleware, Kafka is a good alternative to some conventional messaging systems. In addition, its partitioning, replication, and fault tolerance give Kafka good scalability and performance. Kafka is widely used in log analysis systems, web browsing data tracking, and similar scenarios. So far, however, we should be aware that Kafka does not provide the enterprise-class features found in JMS, such as transactions and message delivery guarantees (a message acknowledgement mechanism); RabbitMQ and RocketMQ are stronger in this regard.

For more information about Kafka, please read the introduction on the Kafka official website.

