High-throughput distributed publishing and subscription message system Kafka

Source: Internet
Author: User

I. Overview of Kafka

Kafka is a high-throughput distributed publish/subscribe messaging system that can handle all the action-stream data of a consumer-scale website. Such actions (web browsing, searches, and other user activity) are a key ingredient of many social features on the modern web. Because of the throughput involved, this data has traditionally been handled by log processing and log aggregation. That approach works for log data fed into offline analysis systems such as Hadoop, but it falls short when real-time processing is required. Kafka aims to unify online and offline message processing: it can feed Hadoop through its parallel loading mechanism while also serving real-time consumers across a cluster of machines.


II. Terms related to Kafka

  • Broker: a Kafka cluster contains one or more servers, called brokers.
  • Topic: every message published to a Kafka cluster belongs to a category called a Topic. (Messages of different topics are stored separately on disk; the messages of one topic are logically stored on one or more brokers, but producers and consumers only need to specify the topic of a message — they never need to care where the data is physically stored.)
  • Partition: a physical concept; each Topic contains one or more partitions.
  • Producer: the client that publishes messages to a Kafka broker.
  • Consumer: the client that reads messages from a Kafka broker.
  • Consumer Group: each consumer belongs to a specific consumer group (you can specify a group name for each consumer; if no group name is specified, it belongs to the default group).

III. Download and install Kafka

1. Download

wget http://apache.fayea.com/kafka/0.9.0.1/kafka_2.11-0.9.0.1.tgz

2. Installation

tar zxvf kafka_2.11-0.9.0.1.tgz
cd kafka_2.11-0.9.0.1

3. Cluster configuration

Two servers are used: 192.168.1.237 and 192.168.1.238. Each server runs a ZooKeeper instance on port 2181 (ZooKeeper setup is not covered here), and each server hosts three Kafka brokers.
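Laid out in one place, the target topology looks like this (the log directories are taken from the per-broker configs below; the ones for 192.168.1.238, which the text does not spell out, are assumed to follow the same pattern):

```
192.168.1.237  brokers: broker.id 10/11/12  ports 9090/9091/9092  log.dirs /tmp/kafka-logs/server0..server2
192.168.1.238  brokers: broker.id 20/21/22  ports 9090/9091/9092  log.dirs /tmp/kafka-logs/server0..server2
ZooKeeper:     192.168.1.237:2181, 192.168.1.238:2181
```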

3.1 Configure server.properties

broker.id=10
port=9090
host.name=192.168.1.237
advertised.host.name=192.168.1.237
log.dirs=/tmp/kafka-logs/server0
zookeeper.connect=192.168.1.237:2181,192.168.1.238:2181

Note: the host.name and advertised.host.name parameters must be set to IP addresses; otherwise various connection problems may occur.

3.2 Configure server1.properties

cp config/server.properties config/server1.properties
vim config/server1.properties

broker.id=11
port=9091
host.name=192.168.1.237
advertised.host.name=192.168.1.237
log.dirs=/tmp/kafka-logs/server1
zookeeper.connect=192.168.1.237:2181,192.168.1.238:2181

3.3 Configure server2.properties

cp config/server.properties config/server2.properties
vim config/server2.properties

broker.id=12
port=9092
host.name=192.168.1.237
advertised.host.name=192.168.1.237
log.dirs=/tmp/kafka-logs/server2
zookeeper.connect=192.168.1.237:2181,192.168.1.238:2181

Note: on the same server, each broker must use a different port and a different log.dirs; across the cluster, every broker.id must be unique.

3.4 Similarly, on the other server (192.168.1.238), configure server.properties, server1.properties, and server2.properties with broker.id values 20, 21, and 22 and ports 9090, 9091, and 9092, and set host.name=192.168.1.238 and advertised.host.name=192.168.1.238 in each file.
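For concreteness, the first broker's config on 192.168.1.238 would then look like this (the log.dirs value is an assumption, mirroring the layout used on .237):

```
broker.id=20
port=9090
host.name=192.168.1.238
advertised.host.name=192.168.1.238
log.dirs=/tmp/kafka-logs/server0
zookeeper.connect=192.168.1.237:2181,192.168.1.238:2181
```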

3.5 Start

bin/kafka-server-start.sh config/server.properties &
bin/kafka-server-start.sh config/server1.properties &
bin/kafka-server-start.sh config/server2.properties &
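With three near-identical configs per machine, a small helper loop saves typing. This is a convenience sketch, not part of the Kafka distribution; it assumes the standard bin/ and config/ layout shown above:

```shell
# start_brokers KAFKA_HOME CONFIG...
# Prints each start command for the record, then launches it in the background.
start_brokers() {
  kafka_home=$1
  shift
  for cfg in "$@"; do
    echo "$kafka_home/bin/kafka-server-start.sh $kafka_home/config/$cfg &"
    "$kafka_home/bin/kafka-server-start.sh" "$kafka_home/config/$cfg" >/dev/null 2>&1 &
  done
}

start_brokers . server.properties server1.properties server2.properties
```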

3.6 Check listening ports

netstat -tunpl | grep 2181
netstat -tunpl | grep 9090
netstat -tunpl | grep 9091
netstat -tunpl | grep 9092

Verify that all four ports are listening. If a Java client cannot connect, also check whether iptables is enabled and blocking these ports.
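The four netstat checks above can be rolled into one loop. A sketch (assumes netstat is available; on newer systems ss -tunlp is the equivalent):

```shell
# check_ports PORT... - report whether each port has a listener.
check_ports() {
  # Capture the listener table once; empty if netstat is unavailable.
  listeners=$(netstat -tunpl 2>/dev/null)
  for p in "$@"; do
    if echo "$listeners" | grep -q ":$p "; then
      echo "port $p: listening"
    else
      echo "port $p: NOT listening"
    fi
  done
}

check_ports 2181 9090 9091 9092
```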

IV. Test

4.1 Create a Topic

bin/kafka-topics.sh --create --zookeeper 192.168.1.237:2181 --replication-factor 3 --partitions 1 --topic testTopic

4.2 View creation status

bin/kafka-topics.sh --describe --zookeeper 192.168.1.237:2181 --topic testTopic
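For the cluster above, the describe output should look roughly like the following (the leader, replica, and ISR assignments will vary; the broker IDs shown here are illustrative, not taken from a real run):

```
Topic:testTopic	PartitionCount:1	ReplicationFactor:3	Configs:
	Topic: testTopic	Partition: 0	Leader: 10	Replicas: 10,11,12	Isr: 10,11,12
```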

4.3 Producer sends messages

bin/kafka-console-producer.sh --broker-list 192.168.1.237:9090 --topic testTopic

4.4 Consume messages

bin/kafka-console-consumer.sh --zookeeper 192.168.1.237:2181 --from-beginning --topic testTopic

4.5 Check the consumer offset position

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zkconnect 192.168.1.237:2181 --group testTopic

V. Problems encountered

1. After running for a period of time, the following error is reported:

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 986513408 bytes for committing reserved memory.
# An error report file with more information is saved as:
# //hs_err_pid6500.log
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000bad30000, 986513408, 0) failed; error='Cannot allocate memory' (errno=12)

Solution:

You can adjust the JVM heap size by editing kafka-server-start.sh, zookeeper-server-start.sh, and so on:

export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"

The -Xms parameter specifies the minimum heap size. To get your server at least to start up, try lowering it. Given that you only have 512 MB of memory, you should also lower the maximum heap size (-Xmx):

export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"
