Download
http://kafka.apache.org/downloads.html
http://mirror.bit.edu.cn/apache/kafka/0.11.0.0/kafka_2.11-0.11.0.0.tgz
/usr/local/kafka_2.11-0.11.0.0/config# vim server.properties
# broker.id must be different on each node
broker.id=2
log.retention.hours=168
message.max.bytes=5242880
default.replication.factor=2
replica.fetch.max.bytes=5242880
zookeeper.connect=master:2181,slave1:2181,slave2:2181
Copy the configuration to the other nodes (changing broker.id on each)
Note: create the /kafka node in ZooKeeper beforehand, otherwise this error will be reported: java.lang.IllegalArgumentException: Path length must be > 0
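For reference, a broker only registers under a /kafka chroot if zookeeper.connect is configured with that suffix; a hedged sketch of what server.properties would then contain (hostnames taken from the setup above, the chroot path appended once after the last host:port pair):

```properties
# All brokers register under the /kafka chroot instead of the ZK root.
zookeeper.connect=master:2181,slave1:2181,slave2:2181/kafka
```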
/usr/local/zookeeper-3.4.9# bin/zkCli.sh -server master
[zk: master (CONNECTED) 7] create /kafka ""
Created /kafka
[zk: master (CONNECTED) 8] ls /
[cluster, controller, controller_epoch, brokers, zookeeper, kafka, admin, isr_change_notification, consumers, latest_producer_id_block, config]
[zk: master (CONNECTED) 9] ls /kafka
[]
Start Kafka in daemon mode:
/usr/local/kafka_2.11-0.11.0.0# nohup bin/kafka-server-start.sh config/server.properties &
Create a topic:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-topics.sh --create --zookeeper master:2181 --replication-factor 1 --partitions 1 --topic test
Created topic "test".
List all topics:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-topics.sh --list --zookeeper master:2181
test
Send messages:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-console-producer.sh --broker-list master:9092 --topic test
>this is a message
>this is ant^h message
Consume messages:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-console-consumer.sh --zookeeper master:2181 --topic test --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
this is a message
this is an message
View cluster status information:
/usr/local/kafka_2.11-0.11.0.0# bin/kafka-topics.sh --describe --zookeeper slave1:2181 --topic my-replicated-topic
Topic:my-replicated-topic  PartitionCount:1  ReplicationFactor:3  Configs:
    Topic: my-replicated-topic  Partition: 0  Leader: 3  Replicas: 1,3,2  Isr: 3,2
Installing kafka-manager
/usr/local/kafka_2.11-0.11.0.0# git clone https://github.com/yahoo/kafka-manager
/usr/local/kafka_2.11-0.11.0.0# cd kafka-manager/
/usr/local/kafka_2.11-0.11.0.0/kafka-manager# ./sbt clean dist
[success] Total time: 3453 s, completed 7, 8:48:15 PM
The packaged file is in target/universal:
/usr/local/kafka_2.11-0.11.0.0/kafka-manager/target/universal# ls
kafka-manager-1.3.3.12.zip  tmp
Modify the kafka-manager configuration file:
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# vim conf/application.conf
kafka-manager.zkhosts="192.168.117.243:2181,192.168.117.45:2181,192.168.117.242:2181"
Start kafka-manager:
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# bin/kafka-manager -Dconfig.file=conf/application.conf
Recommended way to start it:
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# nohup bin/kafka-manager -Dconfig.file=conf/application.conf -Dhttp.port=7778 &
Log in to Kafka Manager:
http://192.168.117.243:7778/
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# netstat -antlup | grep 7778
tcp6       0      0 :::7778       :::*       LISTEN      100620/java
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# bin/kafka-manager -Dconfig.file=conf/application.conf
This application is already running (or delete the /usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12/RUNNING_PID file).
Stop kafka-manager (kill the process, then remove its PID file):
/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# rm RUNNING_PID
/usr/local/kafka_2.11-0.11.0.0# cd kafka-manager-1.0-SNAPSHOT/
Production server Configuration
# Replication Configurations
num.replica.fetchers=4
replica.fetch.max.bytes=1048576
replica.fetch.wait.max.ms=500
replica.high.watermark.checkpoint.interval.ms=5000
replica.socket.timeout.ms=30000
replica.socket.receive.buffer.bytes=65536
replica.lag.time.max.ms=10000
replica.lag.max.messages=4000
controller.socket.timeout.ms=30000
controller.message.queue.size=10
# LOG Configuration
num.partitions=8
message.max.bytes=1000000
auto.create.topics.enable=true
log.index.interval.bytes=4096
log.index.size.max.bytes=10485760
log.retention.hours=168
log.flush.interval.ms=10000
log.flush.interval.messages=20000
log.flush.scheduler.interval.ms=2000
log.roll.hours=168
log.retention.check.interval.ms=300000
log.segment.bytes=1073741824
# ZK Configuration
zookeeper.connection.timeout.ms=6000
zookeeper.sync.time.ms=2000
# Socket Server Configuration
num.io.threads=8
num.network.threads=8
socket.request.max.bytes=104857600
socket.receive.buffer.bytes=1048576
socket.send.buffer.bytes=1048576
queued.max.requests=16
fetch.purgatory.purge.interval.requests=100
producer.purgatory.purge.interval.requests=100
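To compare a tuned server.properties like the one above against another node's copy, the file can be parsed with a few lines of Python. This is a sketch for inspection only (Kafka itself reads these files with Java's Properties loader, which also handles escapes and continuations):

```python
def load_properties(text):
    """Parse simple Java-style .properties content into a dict.

    Handles blank lines and '#' comments; does not handle line
    continuations or escapes, which Kafka's configs rarely use.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

sample = """
# LOG Configuration
num.partitions=8
message.max.bytes=1000000
auto.create.topics.enable=true
"""
conf = load_properties(sample)
print(conf["num.partitions"])  # -> 8 (as a string)
```

Diffing two such dicts quickly shows which settings differ between nodes (broker.id should be the only expected difference).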
Kafka is a high-throughput, distributed, publish-subscribe messaging system. It was originally developed at LinkedIn as the basis for LinkedIn's activity stream and operational data processing pipeline, and has since been adopted by many different kinds of companies for a variety of data-pipeline and messaging uses.
1 Kafka Message Queue Introduction
1.1 Basic Terms
Broker
A Kafka cluster contains one or more servers, which are called brokers [5]
Topic
Every message published to the Kafka cluster has a category, which is called its topic. (Physically, messages of different topics are stored separately; logically, a topic's messages may be stored on one or more brokers, but a user only needs to specify the message's topic to produce or consume data, without worrying about where the data is stored.)
Partition
A partition is a physical concept; each topic contains one or more partitions (typically set to the total number of cores across the Kafka nodes).
Producer
The message producer: the client that publishes messages to the Kafka brokers.
Consumer
The message consumer: the client that reads messages from the Kafka brokers.
Consumer Group
Each consumer belongs to a specific consumer group (a group name can be specified for each consumer; consumers that do not specify one belong to the default group).
1.2 Message Queue
1.2.1 Basic Features
Scalable
Capacity can be expanded without taking the cluster offline
Data streams are partitioned and stored across multiple machines
Performance
A single broker can serve thousands of clients
A single broker can read and write at rates of hundreds of megabytes per second
A cluster of multiple brokers achieves very high throughput
Performance is stable no matter how large the data grows
Kafka bypasses the Java heap for caching and uses the operating system's page cache instead; it turns random writes into sequential writes, and combined with zero-copy this greatly improves IO performance.
Persistent storage
Messages are stored on disk
Redundant copies are kept on other servers to prevent loss
1.2.2 Message Format
A topic corresponds to a message format, so messages are classified by topic
A topic consists of one or more partitions
Each partition is replicated across one or more servers
One server acts as the leader
The other servers act as followers
The leader accepts read and write requests
Followers serve only as redundant backups
If the leader fails, a follower is automatically elected as the new leader, ensuring uninterrupted service
Each server may act as leader for some partitions and follower for others, so the whole cluster is load-balanced
With only one server there is no redundant backup: it is a single machine rather than a cluster; redundancy requires multiple servers
Messages are stored sequentially: they can only be appended, never inserted
Each message has an offset, which serves as its message ID; within a partition, offsets are unique
Offsets are saved and managed by the consumer, so the read order is actually decided entirely by the consumer and is not necessarily linear
Messages have a retention period and are deleted once it expires
1.2.3 Producer
Producers write messages to Kafka
A write specifies the topic and partition
How messages are divided among the partitions is determined by an algorithm specified by the producer
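As a sketch of such a producer-side partitioning algorithm: Kafka's default partitioner hashes the message key (murmur2 in the Java client) modulo the partition count. The version below substitutes a trivial byte-sum hash purely for illustration; it is not the real algorithm, but it shows the key property that equal keys always map to the same partition:

```python
def choose_partition(key, num_partitions):
    """Pick a partition for a keyed message.

    Kafka's Java client hashes the key with murmur2; a byte-sum
    stands in here, which still keeps equal keys on one partition.
    """
    if key is None:
        # Real producers spread keyless messages round-robin or
        # sticky; a fixed partition keeps this sketch deterministic.
        return 0
    return sum(key.encode("utf-8")) % num_partitions

# Equal keys land on the same partition, preserving per-key order.
p1 = choose_partition("user-42", 8)
p2 = choose_partition("user-42", 8)
print(p1 == p2)  # True
```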
1.2.4 Consumer
Consumers read messages and process them
Consumer group
Messages can be processed concurrently, up to the number of partitions
Each partition is read by only one consumer in the group, which guarantees that messages are processed in order within each partition; note that this order is affected by the producer's partitioning algorithm.
The consumer group concept was introduced to support two scenarios: each message delivered to one consumer, or each message broadcast to all consumers
When multiple consumer groups subscribe to a topic, the topic's messages are broadcast to all of those groups
Within a consumer group, a message is received and processed by only one consumer of that group
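The two delivery modes above (broadcast across groups, queue semantics within a group) can be simulated in a few lines. This is a conceptual sketch, not client behavior; the group names, consumer names, and the byte-sum partition hash are all invented for the example:

```python
def deliver(message_key, groups, num_partitions=4):
    """Deliver one keyed message: every group gets a copy, but inside
    each group only the consumer owning the key's partition receives it.
    """
    partition = sum(message_key.encode()) % num_partitions
    receivers = {}
    for group_name, consumers in groups.items():
        # Within a group, each partition is owned by exactly one consumer.
        owner = consumers[partition % len(consumers)]
        receivers[group_name] = owner
    return receivers

groups = {
    "billing":   ["billing-c0", "billing-c1"],
    "analytics": ["analytics-c0"],
}
print(deliver("order-7", groups))
```

Every group appears in the result (broadcast), but each group names exactly one receiving consumer (queue semantics inside the group).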
Having each consumer in a group correspond to a partition provides the following benefits:
A consumer can use multiple threads to consume, but the number of threads should not exceed the topic's partition count, because within a consumer group each partition is assigned to at most one consuming thread (which threads actually consume is still decided by the thread pool's scheduling). If there are more threads than partitions, the extra threads are never assigned a partition and sit idle, wasting resources.
If a consumer reads data from multiple partitions, ordering across them is not guaranteed: Kafka only guarantees that data is ordered within a single partition; across multiple partitions, the observed order depends on the order in which you read them.
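The per-partition guarantee, an append-only log whose read position the consumer chooses, can be sketched as a toy model (an illustration of the concept, not Kafka's actual storage format):

```python
class PartitionLog:
    """A toy in-memory partition: append-only, addressed by offset."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Messages can only be appended; the offset is the message ID."""
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the new message

    def read(self, offset):
        """Any consumer may read from any offset it chooses to track."""
        return self._messages[offset]

log = PartitionLog()
for text in ("this is a message", "this is an message"):
    log.append(text)

# Each consumer keeps its own offset, so read order is consumer-driven.
consumer_offset = 0
print(log.read(consumer_offset))  # "this is a message"
```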
Adding or removing consumers, brokers, or partitions triggers a rebalance; after a rebalance, the partitions assigned to a given consumer may change
This article is from the "Technical Achievement Dream" blog; please keep this source: http://andyliu.blog.51cto.com/518879/1967307
Installing a Kafka Cluster on Ubuntu 16.04