Download
http://kafka.apache.org/downloads.html
http://mirror.bit.edu.cn/apache/kafka/0.11.0.0/kafka_2.11-0.11.0.0.tgz
/usr/local/kafka_2.11-0.11.0.0/config# vim server.properties
# broker.id must be different on each node
broker.id=2
log.retention.hours=168
message.max.bytes=5242880
default.replication.factor=2
replica.fetch.max.bytes=5242880
zookeeper.connect=master:2181,slave1:2181,slave2:2181
Copy the configuration to the other nodes (changing broker.id on each)
Note: create the /kafka node in ZooKeeper beforehand, otherwise this error will be reported: java.lang.IllegalArgumentException: Path length must be > 0
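For reference, a broker only registers under a /kafka chroot if zookeeper.connect is configured with that suffix; a hedged sketch of what server.properties would then contain (hostnames taken from the setup above, the chroot path appended once after the last host:port pair):

```properties
# All brokers register under the /kafka chroot instead of the ZK root.
zookeeper.connect=master:2181,slave1:2181,slave2:2181/kafka
```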
/usr/local/zookeeper-3.4.9# bin/zkCli.sh -server master
[zk: master (CONNECTED) 7] create /kafka ""
Created /kafka
[zk: master (CONNECTED) 8] ls /
[cluster, controller, controller_epoch, brokers, zookeeper, kafka, admin, isr_change_notification, consumers, latest_producer_id_block, config]
[zk: master (CONNECTED) 9] ls /kafka
[]
Start Kafka in daemon mode:
/usr/local/kafka_2.11-0.11.0.0# nohup bin/kafka-server-start.sh config/server.properties &
Create a topic:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-topics.sh --create --zookeeper master:2181 --replication-factor 1 --partitions 1 --topic test
Created topic "test".
List all topics:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-topics.sh --list --zookeeper master:2181
test
Send messages:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-console-producer.sh --broker-list master:9092 --topic test
>this is a message
>this is ant^h message
Consume messages:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-console-consumer.sh --zookeeper master:2181 --topic test --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
this is a message
this is an message
View cluster status information:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-topics.sh --describe --zookeeper slave1:2181 --topic my-replicated-topic
Topic:my-replicated-topic  PartitionCount:1  ReplicationFactor:3  Configs:
    Topic: my-replicated-topic  Partition: 0  Leader: 3  Replicas: 1,3,2  Isr: 3,2
Installing kafka-manager
/usr/local/kafka_2.11-0.11.0.0# git clone https://github.com/yahoo/kafka-manager
/usr/local/kafka_2.11-0.11.0.0# cd kafka-manager/
/usr/local/kafka_2.11-0.11.0.0/kafka-manager# ./sbt clean dist
[success] Total time: 3453 s, completed 7, 8:48:15 PM
The packaged file is in target/universal:
/usr/local/kafka_2.11-0.11.0.0/kafka-manager/target/universal# ls
kafka-manager-1.3.3.12.zip  tmp
Modify the kafka-manager configuration file:
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# vim conf/application.conf
kafka-manager.zkhosts="192.168.117.243:2181,192.168.117.45:2181,192.168.117.242:2181"
Start kafka-manager:
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# bin/kafka-manager -Dconfig.file=conf/application.conf
Recommended way to start it:
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# nohup bin/kafka-manager -Dconfig.file=conf/application.conf -Dhttp.port=7778 &
Log in to Kafka Manager:
http://192.168.117.243:7778/
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# netstat -antlup | grep 7778
tcp6       0      0 :::7778       :::*       LISTEN      100620/java
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# bin/kafka-manager -Dconfig.file=conf/application.conf
This application is already running (or delete the /usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12/RUNNING_PID file).
Stop kafka-manager (kill the process, then remove its PID file):
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# rm RUNNING_PID
/usr/local/kafka_2.11-0.11.0.0# cd kafka-manager-1.0-SNAPSHOT/
Production server Configuration
# Replication Configurations
num.replica.fetchers=4
replica.fetch.max.bytes=1048576
replica.fetch.wait.max.ms=500
replica.high.watermark.checkpoint.interval.ms=5000
replica.socket.timeout.ms=30000
replica.socket.receive.buffer.bytes=65536
replica.lag.time.max.ms=10000
replica.lag.max.messages=4000
controller.socket.timeout.ms=30000
controller.message.queue.size=10
# LOG Configuration
num.partitions=8
message.max.bytes=1000000
auto.create.topics.enable=true
log.index.interval.bytes=4096
log.index.size.max.bytes=10485760
log.retention.hours=168
log.flush.interval.ms=10000
log.flush.interval.messages=20000
log.flush.scheduler.interval.ms=2000
log.roll.hours=168
log.retention.check.interval.ms=300000
log.segment.bytes=1073741824
# ZK Configuration
zookeeper.connection.timeout.ms=6000
zookeeper.sync.time.ms=2000
# Socket Server Configuration
num.io.threads=8
num.network.threads=8
socket.request.max.bytes=104857600
socket.receive.buffer.bytes=1048576
socket.send.buffer.bytes=1048576
queued.max.requests=16
fetch.purgatory.purge.interval.requests=100
producer.purgatory.purge.interval.requests=100
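To compare a tuned server.properties like the one above against another node's copy, the file can be parsed with a few lines of Python. This is a sketch for inspection only (Kafka itself reads these files with Java's Properties loader, which also handles escapes and continuations):

```python
def load_properties(text):
    """Parse simple Java-style .properties content into a dict.

    Handles blank lines and '#' comments; does not handle line
    continuations or escapes, which Kafka's configs rarely use.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

sample = """
# LOG Configuration
num.partitions=8
message.max.bytes=1000000
auto.create.topics.enable=true
"""
conf = load_properties(sample)
print(conf["num.partitions"])  # -> 8 (as a string)
```

Diffing two such dicts quickly shows which settings differ between nodes (broker.id should be the only expected difference).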
Kafka is a high-throughput, distributed, publish-subscribe messaging system. It was originally developed at LinkedIn as the basis for LinkedIn's activity stream and operational data processing pipeline, and has since been adopted by many different kinds of companies for a variety of data-pipeline and messaging uses.
1 Kafka Message Queue Introduction
1.1 Basic Terms
Broker
A Kafka cluster contains one or more servers, which are called brokers [5]
Topic
Every message published to the Kafka cluster has a category, which is called its topic. (Physically, messages of different topics are stored separately; logically, a topic's messages may be stored on one or more brokers, but a user only needs to specify the message's topic to produce or consume data, without worrying about where the data is stored.)
Partition
A partition is a physical concept; each topic contains one or more partitions (typically set to the total number of cores across the Kafka nodes).
Producer
The message producer: the client that publishes messages to the Kafka brokers.
Consumer
The message consumer: the client that reads messages from the Kafka brokers.
Consumer Group
Each consumer belongs to a specific consumer group (a group name can be specified for each consumer; consumers that do not specify one belong to the default group).
1.2 Message Queue
1.2.1 Basic Features
Scalable
Capacity can be expanded without taking the cluster offline
Data streams are partitioned and stored across multiple machines
Performance
A single broker can serve thousands of clients
A single broker can read and write at rates of hundreds of megabytes per second
A cluster of multiple brokers achieves very high throughput
Performance is stable no matter how large the data grows
Kafka bypasses the Java heap for caching and uses the operating system's page cache instead; it turns random writes into sequential writes, and combined with zero-copy this greatly improves IO performance.
Persistent storage
Messages are stored on disk
Redundant copies are kept on other servers to prevent loss
1.2.2 Message Format
A topic corresponds to a message format, so messages are classified by topic
A topic consists of one or more partitions
Each partition is replicated across one or more servers
One server acts as the leader
The other servers act as followers
The leader accepts read and write requests
Followers serve only as redundant backups
If the leader fails, a follower is automatically elected as the new leader, ensuring uninterrupted service
Each server may act as leader for some partitions and follower for others, so the whole cluster is load-balanced
With only one server there is no redundant backup: it is a single machine rather than a cluster; redundancy requires multiple servers
Messages are stored sequentially: they can only be appended, never inserted
Each message has an offset, which serves as its message ID; within a partition, offsets are unique
Offsets are saved and managed by the consumer, so the read order is actually decided entirely by the consumer and is not necessarily linear
Messages have a retention period and are deleted once it expires
1.2.3 Producer
Producers write messages to Kafka
A write specifies the topic and partition
How messages are divided among the partitions is determined by an algorithm specified by the producer
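As a sketch of such a producer-side partitioning algorithm: Kafka's default partitioner hashes the message key (murmur2 in the Java client) modulo the partition count. The version below substitutes a trivial byte-sum hash purely for illustration; it is not the real algorithm, but it shows the key property that equal keys always map to the same partition:

```python
def choose_partition(key, num_partitions):
    """Pick a partition for a keyed message.

    Kafka's Java client hashes the key with murmur2; a byte-sum
    stands in here, which still keeps equal keys on one partition.
    """
    if key is None:
        # Real producers spread keyless messages round-robin or
        # sticky; a fixed partition keeps this sketch deterministic.
        return 0
    return sum(key.encode("utf-8")) % num_partitions

# Equal keys land on the same partition, preserving per-key order.
p1 = choose_partition("user-42", 8)
p2 = choose_partition("user-42", 8)
print(p1 == p2)  # True
```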
1.2.4 Consumer
Consumers read messages and process them
Consumer group
Messages can be processed concurrently, up to the number of partitions
Each partition is read by only one consumer in the group, which guarantees that messages are processed in order within each partition; note that this order is affected by the producer's partitioning algorithm.
The consumer group concept was introduced to support two scenarios: each message delivered to one consumer, or each message broadcast to all consumers
When multiple consumer groups subscribe to a topic, the topic's messages are broadcast to all of those groups
Within a consumer group, a message is received and processed by only one consumer of that group
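The two delivery modes above (broadcast across groups, queue semantics within a group) can be simulated in a few lines. This is a conceptual sketch, not client behavior; the group names, consumer names, and the byte-sum partition hash are all invented for the example:

```python
def deliver(message_key, groups, num_partitions=4):
    """Deliver one keyed message: every group gets a copy, but inside
    each group only the consumer owning the key's partition receives it.
    """
    partition = sum(message_key.encode()) % num_partitions
    receivers = {}
    for group_name, consumers in groups.items():
        # Within a group, each partition is owned by exactly one consumer.
        owner = consumers[partition % len(consumers)]
        receivers[group_name] = owner
    return receivers

groups = {
    "billing":   ["billing-c0", "billing-c1"],
    "analytics": ["analytics-c0"],
}
print(deliver("order-7", groups))
```

Every group appears in the result (broadcast), but each group names exactly one receiving consumer (queue semantics inside the group).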
Having each consumer in a group correspond to a partition provides the following benefits:
A consumer can use multiple threads to consume, but the number of threads should not exceed the topic's partition count, because within a consumer group each partition is assigned to at most one consuming thread (which threads actually consume is still decided by the thread pool's scheduling). If there are more threads than partitions, the extra threads are never assigned a partition and sit idle, wasting resources.
If a consumer reads data from multiple partitions, ordering across them is not guaranteed: Kafka only guarantees that data is ordered within a single partition; across multiple partitions, the observed order depends on the order in which you read them.
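The per-partition guarantee, an append-only log whose read position the consumer chooses, can be sketched as a toy model (an illustration of the concept, not Kafka's actual storage format):

```python
class PartitionLog:
    """A toy in-memory partition: append-only, addressed by offset."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Messages can only be appended; the offset is the message ID."""
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the new message

    def read(self, offset):
        """Any consumer may read from any offset it chooses to track."""
        return self._messages[offset]

log = PartitionLog()
for text in ("this is a message", "this is an message"):
    log.append(text)

# Each consumer keeps its own offset, so read order is consumer-driven.
consumer_offset = 0
print(log.read(consumer_offset))  # "this is a message"
```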
Adding or removing consumers, brokers, or partitions triggers a rebalance; after a rebalance, the partitions assigned to a given consumer may change
This article is from the "Technical Achievement Dream" blog; please keep this source: http://andyliu.blog.51cto.com/518879/1967307
Installing a Kafka Cluster on Ubuntu 16.04