Getting started with Kafka

Source: Internet
Author: User
Tags: zookeeper, client

 

What is Kafka?

Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. It is a high-throughput, distributed publish/subscribe messaging system that can handle all the activity stream data of a consumer-scale website.

 

Basic concepts of Kafka
  • Broker: physical concept. Each Kafka node in a Kafka cluster is a broker;

  • Topic: logical concept. A category of Kafka messages, used to separate and isolate data;

  • Partition: the basic unit of data storage in Kafka. The data of one topic is distributed across multiple partitions, and the messages within each partition are ordered;

  • Replication (copy and backup): the same partition may have multiple replicas, and all replicas hold the same data;

  • Replication leader: among the replicas of a partition, one leader is required to interact with producers and consumers for that partition;

  • ReplicaManager: manages all the partitions and replicas on the current broker, processes requests initiated by the KafkaController, switches replica state, appends and reads messages, and takes part in leader election.
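
The topic and partition concepts above can be sketched as a toy in-memory model. This is only an illustration: `ToyTopic` and its `append` method are hypothetical names, and hashing the key with CRC32 is an assumption made for simplicity, not Kafka's actual partitioner. It shows that messages with the same key land in the same partition and receive increasing offsets there, which is what "each partition is ordered" means.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class ToyTopic:
    """Hypothetical toy model: a topic as a fixed set of append-only partitions."""
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        # One independent, ordered log per partition.
        self.partitions = [[] for _ in range(self.num_partitions)]

    def append(self, key: str, value: str):
        # Assumption: choose the partition from a CRC32 hash of the key.
        partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        self.partitions[partition].append(value)
        # The offset is the message's position inside its partition.
        offset = len(self.partitions[partition]) - 1
        return partition, offset


topic = ToyTopic("page-views", num_partitions=3)
first = topic.append("user-42", "view /home")
second = topic.append("user-42", "view /cart")
```

Ordering is guaranteed only inside one partition; the model makes no promise about the relative order of messages stored in different partitions.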

Kafka concept extension

Partition (minimum storage unit)

  • Each topic is divided into multiple partitions (the partition is the basic unit of storage);

  • Within a consumer group, the number of active consumers is smaller than or equal to the number of partitions. Kafka is designed so that a partition is never consumed by two consumers of the same group at once, so any extra consumers stay idle;

  • Each broker in the cluster stores one or more partitions of a topic. (A single partition is never split across brokers; only its replicas are placed on other brokers);

  • Within a consumer group, each partition of a topic is read by exactly one consumer, although one consumer may read several partitions (this prevents the same partition from being consumed more than once by the group).
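
The consumer-group rule above can be sketched with a simple round-robin assignment. The function `assign_partitions` is a hypothetical helper written for this article; Kafka's own assignment strategies are more involved, and this is only a minimal illustration of "one partition, one consumer per group".

```python
def assign_partitions(partitions, consumers):
    """Round-robin sketch: every partition goes to exactly one consumer."""
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Three partitions, four consumers: the fourth consumer receives nothing.
plan = assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"])

# Three partitions, two consumers: one consumer reads two partitions.
small_group = assign_partitions([0, 1, 2], ["c1", "c2"])
```

With more consumers than partitions, the surplus consumer ends up with an empty list, which is why adding consumers beyond the partition count does not increase throughput.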

 

Replication

  • When a broker in the cluster crashes, the system can automatically fail over to the replicas on other brokers and keep providing service;

  • By default, the replication factor of each topic is 1 (that is, there are no extra copies by default, which saves resources). You can set it individually when creating a topic.

Features:

  1. The basic unit of replication is the partition of a topic;

  2. All read and write operations go through the leader; followers are used only as backups;

  3. Followers must be able to copy the leader's data in a timely manner;

  4. Replication increases fault tolerance and scalability.
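
The leader and follower roles described above can be sketched as follows. `ToyPartition` is a hypothetical class, and copying to followers synchronously on every write is a simplifying assumption; real followers fetch from the leader on their own schedule. The sketch only shows that clients talk to the leader and that a surviving replica takes over when the leader's broker fails.

```python
class ToyPartition:
    """Hypothetical model of one partition replicated on several brokers."""

    def __init__(self, replica_brokers):
        # One log per broker that holds a replica of this partition.
        self.logs = {broker: [] for broker in replica_brokers}
        # Assumption: the first listed replica starts as the leader.
        self.leader = replica_brokers[0]

    def write(self, message):
        # Writes are accepted only by the leader.
        self.logs[self.leader].append(message)
        # Simplification: followers copy the leader's log immediately.
        for broker, log in self.logs.items():
            if broker != self.leader:
                log[:] = self.logs[self.leader]

    def read(self):
        # Reads are served by the leader as well.
        return list(self.logs[self.leader])

    def fail(self, broker):
        # Drop the failed broker's replica and, if needed, pick a new leader.
        del self.logs[broker]
        if broker == self.leader:
            if not self.logs:
                raise RuntimeError("no replica left for this partition")
            self.leader = next(iter(self.logs))


partition = ToyPartition([1, 2, 3])
partition.write("a")
partition.write("b")
partition.fail(1)  # the leader's broker crashes
```

With a replication factor of 1 there would be no follower to promote, so the same failure would make the partition unavailable.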

 

Basic Structure of Kafka

 

Kafka message structure

 

Kafka features

  • Distributed (multi-partition, multi-replica, multi-consumer, coordinated through ZooKeeper);
  • High performance (high throughput, low latency, high concurrency, O(1) time complexity);
  • Persistence and scalability (data persistence, fault tolerance, online horizontal scaling, automatic message balancing).

 

Kafka application scenarios

Message queuing, user behavior tracking, operational metrics monitoring, log collection, stream processing, event sourcing, commit logs, and so on.

 

Kafka installation (in Linux)

A JDK and ZooKeeper must be installed first.

 

ZooKeeper installation:

1. Download, decompress, and configure:

wget http://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.4.12/zookeeper-3.4.12.tar.gz
tar -zxvf zookeeper-3.4.12.tar.gz
# Copy zoo_sample.cfg to zoo.cfg in zookeeper-3.4.12/conf
cd zookeeper-3.4.12/conf
cp zoo_sample.cfg zoo.cfg
# Modify the following two lines in zoo.cfg. The directories named by dataDir and dataLogDir must already exist; otherwise an error is reported when the ZooKeeper server starts.
# This is a single-machine configuration. For a cluster, add the server addresses below clientPort, for example:
#   server.1=192.168.180.132:2888:3888
#   server.2=192.168.180.133:2888:3888
#   ... and so on.
dataDir=/tmp/zookeeper
dataLogDir=/tmp/zookeeper/log
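
As a rough illustration of the settings discussed above, the following sketch renders the relevant zoo.cfg lines from Python. `render_zoo_cfg` is a hypothetical helper, the ports 2888 and 3888 are taken from the example addresses above, and the output covers only the lines mentioned here; a real zoo.cfg keeps the other settings from zoo_sample.cfg, such as tickTime.

```python
def render_zoo_cfg(data_dir, data_log_dir, client_port=2181, servers=None):
    """Render only the zoo.cfg lines discussed in this section."""
    lines = [
        f"dataDir={data_dir}",
        f"dataLogDir={data_log_dir}",
        f"clientPort={client_port}",
    ]
    # For a cluster, add one server.N entry per node below clientPort.
    for server_id, host in sorted((servers or {}).items()):
        lines.append(f"server.{server_id}={host}:2888:3888")
    return "\n".join(lines) + "\n"


single_node = render_zoo_cfg("/tmp/zookeeper", "/tmp/zookeeper/log")
cluster = render_zoo_cfg(
    "/tmp/zookeeper",
    "/tmp/zookeeper/log",
    servers={1: "192.168.180.132", 2: "192.168.180.133"},
)
```

The helper does not create the directories; as noted above, both must exist before the server is started.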

2. Configure environment variables (permanent change for all users):

Edit the /etc/profile file and add the following at the end:

ZOOKEEPER_INSTALL=/usr/local/zookeeper-3.4.12
PATH=$PATH:$ZOOKEEPER_INSTALL/bin
export ZOOKEEPER_INSTALL
export PATH

3. Start the test:

# Go to the bin directory of ZooKeeper and start the server
./zkServer.sh start
# View the status
./zkServer.sh status
# Start the ZooKeeper client (the -server parameter is not required locally)
./zkCli.sh -server 192.168.147.128:2181

Note: If the connection is refused, check the firewall configuration.
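
When the client cannot connect, it can help to first check whether the ZooKeeper port is reachable at all. The sketch below is one minimal way to do that with Python's standard library; `can_connect` is a hypothetical helper, and a successful TCP connection only shows that the port is open, not that the server is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For the address used above, calling `can_connect("192.168.147.128", 2181)` from the client machine and getting `False` while the server reports a running status would point to a firewall or network issue rather than to ZooKeeper itself.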

 
