Installing a Kafka Cluster on Ubuntu 16.04


Download

http://kafka.apache.org/downloads.html

http://mirror.bit.edu.cn/apache/kafka/0.11.0.0/kafka_2.11-0.11.0.0.tgz

root@master:/usr/local/kafka_2.11-0.11.0.0/config# vim server.properties

broker.id=2 (must be different on each node)

log.retention.hours=168

message.max.bytes=5242880

default.replication.factor=2

replica.fetch.max.bytes=5242880

zookeeper.connect=master:2181,slave1:2181,slave2:2181

Copy to another node
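Copying server.properties to the other nodes while keeping broker.id unique can be scripted. The sketch below is illustrative only: set_broker_id is a helper name invented here, the hostnames follow this guide's cluster (slave1, slave2), and passwordless SSH is assumed for the commented remote part.

```shell
# Hedged sketch: give each node a unique broker.id after copying the config.
# set_broker_id rewrites the broker.id line of a properties file in place.
set_broker_id() {
  local file=$1 id=$2
  sed -i "s/^broker\.id=.*/broker.id=${id}/" "$file"
}

# Local demonstration of the rewrite:
printf 'broker.id=2\nlog.retention.hours=168\n' > /tmp/server.properties
set_broker_id /tmp/server.properties 3
grep '^broker.id=' /tmp/server.properties    # prints: broker.id=3

# On the real cluster (assumes passwordless ssh; the ids just need to differ):
# for h in slave1 slave2; do
#   scp config/server.properties "$h":/usr/local/kafka_2.11-0.11.0.0/config/
# done
# then run set_broker_id with a different id on each node over ssh
```

The ids only have to be unique per broker; the describe output later in this guide suggests this cluster uses 1, 2 and 3.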

Note: create the /kafka node in ZooKeeper beforehand (needed when zookeeper.connect uses a /kafka chroot path); otherwise the error java.lang.IllegalArgumentException: Path length must be > 0 is reported.

root@master:/usr/local/zookeeper-3.4.9# bin/zkCli.sh -server master

[zk: master (CONNECTED) 7] create /kafka ""

Created /kafka

[zk: master (CONNECTED) 8] ls /

[cluster, controller, controller_epoch, brokers, zookeeper, kafka, admin, isr_change_notification, consumers, latest_producer_id_block, config]

[zk: master (CONNECTED) 9] ls /kafka

[]

Start Kafka in daemon mode

root@master:/usr/local/kafka_2.11-0.11.0.0# nohup bin/kafka-server-start.sh config/server.properties &
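Before moving on it can help to confirm the broker is actually listening. The check below is a sketch using only bash's /dev/tcp (no Kafka tooling involved); the host and port match this guide's broker (master:9092), and wait_for_port is a helper name invented here.

```shell
# Sketch: wait until a TCP port accepts connections (bash's /dev/tcp).
wait_for_port() {
  local host=$1 port=$2 tries=${3:-30}
  local i
  for ((i = 0; i < tries; i++)); do
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0    # port is open
    fi
    sleep 1
  done
  return 1        # gave up after $tries attempts
}

# After the nohup start above, e.g.:
# wait_for_port master 9092 && echo "broker is up"
```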

Create a topic:

root@master:/usr/local/kafka_2.11-0.11.0.0# bin/kafka-topics.sh --create --zookeeper master:2181 --replication-factor 1 --partitions 1 --topic test

Created topic "test".

List all topics:

root@master:/usr/local/kafka_2.11-0.11.0.0# bin/kafka-topics.sh --list --zookeeper master:2181

test

Send messages:

root@master:/usr/local/kafka_2.11-0.11.0.0# bin/kafka-console-producer.sh --broker-list master:9092 --topic test

>this is a message

>this is ant^H message

Consume messages:

root@master:/usr/local/kafka_2.11-0.11.0.0# bin/kafka-console-consumer.sh --zookeeper master:2181 --topic test --from-beginning

Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].

this is a message

this is an message

View cluster status information

root@master:/usr/local/kafka_2.11-0.11.0.0# bin/kafka-topics.sh --describe --zookeeper slave1:2181 --topic my-replicated-topic

Topic:my-replicated-topic  PartitionCount:1  ReplicationFactor:3  Configs:

Topic: my-replicated-topic  Partition: 0  Leader: 3  Replicas: 1,3,2  Isr: 3,2
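In this output, Replicas lists the brokers assigned to the partition, Leader is the broker currently serving reads and writes, and Isr is the in-sync subset of the replicas (here broker 1 has fallen out of sync). A small sketch pulling those fields out of a describe line (the sample line is copied from above; nothing Kafka-specific runs here):

```shell
# Extract Leader and Isr from a kafka-topics.sh --describe output line.
line='Topic: my-replicated-topic  Partition: 0  Leader: 3  Replicas: 1,3,2  Isr: 3,2'
leader=$(grep -o 'Leader: [0-9]*' <<<"$line" | awk '{print $2}')
isr=$(grep -o 'Isr: [0-9,]*' <<<"$line" | awk '{print $2}')
echo "leader=$leader isr=$isr"    # prints: leader=3 isr=3,2
```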

Installing Kafka-manager

root@master:/usr/local/kafka_2.11-0.11.0.0# git clone https://github.com/yahoo/kafka-manager

root@master:/usr/local/kafka_2.11-0.11.0.0# cd kafka-manager/

root@master:/usr/local/kafka_2.11-0.11.0.0/kafka-manager# ./sbt clean dist

[success] Total time: 3453 s, completed 8:48:15 PM

The packaged archive is now in target/universal:

root@master:/usr/local/kafka_2.11-0.11.0.0/kafka-manager/target/universal# ls

kafka-manager-1.3.3.12.zip tmp

Modify the kafka-manager configuration file:

root@master:/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# vim conf/application.conf

kafka-manager.zkhosts="192.168.117.243:2181,192.168.117.45:2181,192.168.117.242:2181"

Start kafka-manager:

root@master:/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# bin/kafka-manager -Dconfig.file=conf/application.conf

Recommended way to start:

root@master:/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# nohup bin/kafka-manager -Dconfig.file=conf/application.conf -Dhttp.port=7778 &

Log in to Kafka Manager:

http://192.168.117.243:7778/

root@master:/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# netstat -antlup | grep 7778

tcp6       0      0 :::7778      :::*      LISTEN      100620/java

root@master:/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# bin/kafka-manager -Dconfig.file=conf/application.conf

This application is already running (Or delete /usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12/RUNNING_PID file).

Stop kafka-manager:

root@master:/usr/local/kafka_2.11-0.11.0.0/kafka-manager-1.3.3.12# rm RUNNING_PID
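Deleting the PID file alone leaves the old JVM running; a fuller stop would first kill the recorded process (RUNNING_PID is the Play-framework file the error message above refers to). A hedged sketch, demonstrated on a stand-in process rather than a real kafka-manager instance; stop_kafka_manager is a helper name invented here:

```shell
# Hedged sketch of a clean kafka-manager stop.
stop_kafka_manager() {
  local dir=$1
  if [ -f "$dir/RUNNING_PID" ]; then
    kill "$(cat "$dir/RUNNING_PID")" 2>/dev/null   # stop the recorded process
    rm -f "$dir/RUNNING_PID"                       # then clear the PID file
  fi
}

# Local demonstration with a dummy background process:
mkdir -p /tmp/km-demo
sleep 60 & echo $! > /tmp/km-demo/RUNNING_PID
stop_kafka_manager /tmp/km-demo
[ -f /tmp/km-demo/RUNNING_PID ] || echo "stopped"   # prints: stopped
```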


Production server configuration

# Replication configuration
num.replica.fetchers=4
replica.fetch.max.bytes=1048576
replica.fetch.wait.max.ms=500
replica.high.watermark.checkpoint.interval.ms=5000
replica.socket.timeout.ms=30000
replica.socket.receive.buffer.bytes=65536
replica.lag.time.max.ms=10000
replica.lag.max.messages=4000
controller.socket.timeout.ms=30000
controller.message.queue.size=10

# Log configuration
num.partitions=8
message.max.bytes=1000000
auto.create.topics.enable=true
log.index.interval.bytes=4096
log.index.size.max.bytes=10485760
log.retention.hours=168
log.flush.interval.ms=10000
log.flush.interval.messages=20000
log.flush.scheduler.interval.ms=2000
log.roll.hours=168
log.retention.check.interval.ms=300000
log.segment.bytes=1073741824

# ZooKeeper configuration
zookeeper.connection.timeout.ms=6000
zookeeper.sync.time.ms=2000

# Socket server configuration
num.io.threads=8
num.network.threads=8
socket.request.max.bytes=104857600
socket.receive.buffer.bytes=1048576
socket.send.buffer.bytes=1048576
queued.max.requests=16
fetch.purgatory.purge.interval.requests=100
producer.purgatory.purge.interval.requests=100
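One way to apply these overrides without hand-editing server.properties is to keep them in a separate file and merge them over the stock config, later definitions winning. The sketch below is illustrative, not part of Kafka: merge_props and the file names are invented here, and comment/blank lines are simply dropped by the merge.

```shell
# Merge two .properties files; for duplicate keys the later file wins.
merge_props() {  # usage: merge_props base.properties overrides.properties
  awk -F= '!/^[#!]/ && NF { v[$1] = $0 } END { for (k in v) print v[k] }' "$1" "$2"
}

# Demonstration with a tiny base config and one override:
printf 'num.partitions=1\nlog.roll.hours=24\n' > /tmp/base.properties
printf 'num.partitions=8\n' > /tmp/override.properties
merge_props /tmp/base.properties /tmp/override.properties | sort
# prints: log.roll.hours=24
#         num.partitions=8
```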

Kafka is a high-throughput, distributed, publish-subscribe messaging system. It was originally developed at LinkedIn as the basis for LinkedIn's activity stream and operational data processing pipeline, and has since been adopted by many different kinds of companies for a variety of data pipelines and messaging workloads.

1 Kafka Message Queuing Introduction

1.1 Basic Terms
    • Broker
      A Kafka cluster contains one or more servers, which are called brokers [5].

    • Topic
      Each message published to the Kafka cluster belongs to a category called a topic. (Physically, messages of different topics are stored separately; logically, a topic's messages may be stored on one or more brokers, but users only need to specify the topic of a message to produce or consume data, without worrying about where the data is stored.)

    • Partition
      Partition is a physical concept; each topic contains one or more partitions (the number of partitions is typically set to the total number of cores of the Kafka nodes).

    • Producer
      The message producer; responsible for publishing messages to the Kafka broker.

    • Consumer
      The message consumer; the client that reads messages from the Kafka broker.

    • Consumer Group
      Each consumer belongs to a specific consumer group (a group name can be specified for each consumer; if none is specified, the consumer belongs to the default group).

1.2 Message Queuing

1.2.1 Basic Features
    • Scalability
      Capacity can be expanded without taking the cluster offline; data streams are partitioned and stored across multiple machines.

    • Performance
      A single broker can serve thousands of clients and read/write at rates of hundreds of megabytes per second; a cluster of multiple brokers achieves very high throughput, and performance stays stable no matter how large the data grows. Under the hood, Kafka bypasses the Java heap cache in favor of the operating system's page cache, turns random writes into sequential writes, and, combined with zero-copy, greatly improves I/O performance.

    • Persistent storage
      Messages are stored on disk and redundantly backed up to other servers to prevent loss.

1.2.2 Message Format
    1. A topic corresponds to a message format, so messages are classified by topic

    2. A topic's messages consist of one or more partitions

    3. Each partition is stored on one or more servers:

    • One server acts as the leader

    • The other servers are followers

    • The leader accepts read and write requests

    • Followers serve only as redundant backups

    • If the leader fails, a follower is automatically elected as the new leader, so service is uninterrupted

    • Each server may act as the leader for some partitions and a follower for others, so the cluster as a whole is load-balanced

    • If there is only one server, there is no redundant backup; that is a single machine, not a cluster


Messages are stored sequentially: they can only be appended, never inserted. Each message has an offset, which serves as its message ID and is unique within a partition. Offsets are saved and managed by the consumer, so the reading order is in fact entirely determined by the consumer, and reads need not be linear. Messages have an expiration time and are deleted once they expire.

1.2.3 Producer
    • The producer writes messages to Kafka

    • When writing, the topic and partition are specified

    • How messages are distributed across partitions is determined by an algorithm chosen by the producer
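The stock Kafka producer's default partitioner hashes the message key modulo the partition count (murmur2 in the real Java client). The sketch below only imitates the idea, substituting cksum for the real hash, to show why equal keys always land in the same partition:

```shell
# Toy partitioner: hash(key) mod num_partitions (cksum stands in for murmur2).
partition_for() {
  local key=$1 num_partitions=$2
  echo $(( $(printf '%s' "$key" | cksum | cut -d' ' -f1) % num_partitions ))
}

p1=$(partition_for "user-42" 8)
p2=$(partition_for "user-42" 8)
echo "user-42 -> partition $p1 (stable: $([ "$p1" = "$p2" ] && echo yes))"
```

Because the mapping is deterministic, all messages with the same key go to the same partition, which is what makes per-key ordering possible.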

1.2.4 Consumer
  • The consumer reads messages and processes them

  • Consumer groups


    • Messages can be processed concurrently, up to the number of partitions

    • Within a group, each partition is read by only one consumer, which guarantees that messages are processed in partition order (note that this order is also affected by the producer's partitioning algorithm)

    • The concept was introduced to support two scenarios: each message delivered to exactly one consumer (queueing), or each message broadcast to all consumers (publish-subscribe)

    • When multiple consumer groups subscribe to a topic, the topic's messages are broadcast to every consumer group

    • Once a message is delivered to one consumer group, it can be received and consumed by only one consumer within that group

    • Giving each consumer in a group its own partition(s) is what provides these guarantees


  • A consumer can consume with multiple threads, but the number of threads should not exceed the number of partitions of the topic: within a consumer group, a partition can be assigned to only one consuming thread at a time (how many threads actually end up consuming is still decided by the thread pool's scheduling). If there are more threads than partitions, the extra threads are never assigned a partition and sit idle, wasting resources.

  • If a consumer reads from multiple partitions, the ordering of data across partitions is not guaranteed: Kafka only guarantees order within a single partition; across partitions, the observed order depends on the order in which you read them.

  • Adding or removing consumers, brokers, or partitions triggers a rebalance; after a rebalance, the partitions assigned to a consumer may change.
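The thread/partition rule above can be made concrete: spreading P partitions round-robin over T threads, any thread beyond the P-th receives nothing. A toy illustration in pure bash (not the real assignment algorithm, just the counting argument):

```shell
# 4 partitions shared by 6 consumer threads: two threads must stay idle.
partitions=4
threads=6
idle=0
for ((t = 0; t < threads; t++)); do
  assigned=""
  for ((p = t; p < partitions; p += threads)); do
    assigned+="$p "               # partition p goes to thread t
  done
  if [ -z "$assigned" ]; then
    idle=$((idle + 1))            # this thread got no partition
  fi
  echo "thread $t -> ${assigned:-idle}"
done
echo "$idle threads idle"         # prints: 2 threads idle
```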


This article is from the "Technical Achievement Dream" blog, please be sure to keep this source http://andyliu.blog.51cto.com/518879/1967307
