Apache Kafka Cluster Environment Building

Source: Internet
Author: User
Tags: config, garbage collection, zookeeper

http://bigcat2013.iteye.com/blog/2175880


Apache Kafka is a high-throughput distributed messaging system, open-sourced by LinkedIn. Quoting Kafka's introduction on the official website: "Apache Kafka is publish-subscribe messaging rethought as a distributed commit log." Publish-subscribe is the core idea of Kafka's design and also its most distinctive feature. In Kafka, the publisher is the producer role and the subscriber is the consumer. Just as in everyday life manufacturers produce products but consumers generally cannot buy directly from the factory and need a dealer as an intermediary, the Kafka ecosystem likewise has a broker role. So the Kafka ecosystem can be broadly described as follows:

"Producer-->broker<--consumer"

That is the general introduction; for details, you can go to the official website: http://kafka.apache.org/

Then come the perennial questions: why use Kafka, and what scenarios is Kafka suited to? Here is a summary from my own use of it in a project; readers with other ideas are welcome to add to it. Reasons to use Kafka:

1. Distributed, high-throughput, and fast (Kafka stores data directly on disk with linear reads and writes, which is fast: it avoids copying data between JVM memory and system memory and reduces the performance cost of object creation and garbage collection).

2. Supports both real-time and offline solutions (I believe many projects have similar needs; this is LinkedIn's official architecture: we process part of the data in real time with Storm and analyze part of it offline with Hadoop).

3. Open source (and who doesn't like open source?).

4. The source code is written in Scala and runs on the JVM (I am very fond of Scala; functional languages have always been very handsome, and Spark is also written in Scala, so it seems there will be time to brush up on Scala).

Usage scenarios:

I mainly use Kafka for a log analysis system, which is in fact how LinkedIn uses it, perhaps because the reliability requirements for logs are not particularly high. Besides logs, some of a site's browsing data should also be a good fit (as long as the raw data does not need to be stored directly in a DB).

The following is a brief walkthrough of the Kafka cluster setup process:

Prep environment: at least 3 Linux servers (I used 5 Red Hat cloud servers).

Step one: Install the JDK/JRE.

Step two: Install ZooKeeper (Kafka comes with a built-in ZooKeeper service, but it is recommended to build a separate ZooKeeper cluster, which can be shared with other applications and is easier to manage).

For ZooKeeper installation, you can refer to my other blog post: http://bigcat2013.iteye.com/blog/2175538

Step three: Download Kafka from http://kafka.apache.org/downloads.html (it is best to download a Scala pre-compiled package; for example, I downloaded kafka_2.10-0.8.1.1.tgz, which is version 0.8.1.1 pre-compiled with Scala 2.10).

Step four: Upload the installation package to the servers (via WinSCP or a similar tool).

Step five: Unpack the installation package with "tar -xzvf kafka_2.10-0.8.1.1.tgz":

The directory structure after decompression:
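Roughly, the kafka_2.10-0.8.1.1 package unpacks to a layout like this:

    bin/      # startup and admin scripts (kafka-server-start.sh, kafka-topics.sh, ...)
    config/   # configuration files (server.properties, log4j.properties, ...)
    libs/     # Kafka and third-party jars
    LICENSE
    NOTICE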


Step six: Modify the configuration file

For a simple setup, you only need to change config/server.properties.

The properties that need to be configured are: broker.id (the ID of the current server in the cluster, starting at 0), port, host.name (the hostname of the current server), zookeeper.connect (the ZooKeeper cluster to connect to), and log.dirs (the log storage directory; remember to create this directory accordingly). The other settings are explained by the corresponding comments in the file:
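As a minimal sketch, the relevant entries in config/server.properties might look like this (the hostnames and the log directory path are assumptions for illustration; substitute your own):

    # ID of this broker in the cluster; unique per server, starting at 0
    broker.id=0
    # port the broker listens on (Kafka's default)
    port=9092
    # hostname of the current server (assumed name)
    host.name=kafkaserver1
    # data directory; remember to create it beforehand (assumed path)
    log.dirs=/data/kafka-logs
    # the ZooKeeper cluster built in step two (assumed hostnames)
    zookeeper.connect=server1:2181,server2:2181,server3:2181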

Step seven: Copy the configured Kafka directory to the other servers via "scp -r":
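For example, assuming the unpacked directory name from step five, the broker hostnames used later in this post, and /opt/ as the destination path (all of which you should adjust to your environment):

    scp -r kafka_2.10-0.8.1.1 kafkaserver2:/opt/
    scp -r kafka_2.10-0.8.1.1 kafkaserver3:/opt/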

Step eight: Modify the configuration file on each server, mainly the broker.id and host.name properties:

broker.id increments from 0 and must be unique for each server.
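For example, on the second server (assuming the hostname kafkaserver2), the two edited lines in server.properties would become:

    broker.id=1
    host.name=kafkaserver2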

Step nine: Start the ZooKeeper cluster first, then start the Kafka cluster.

Kafka start command: sudo nohup ./bin/kafka-server-start.sh config/server.properties &
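To verify that each broker started cleanly, you can watch its server log; by default the startup scripts write logs under the logs/ directory of the Kafka installation (adjust the path if you changed the logging configuration):

    tail -f logs/server.log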

Step ten: After the cluster starts successfully, try creating a topic, then create a producer on one server and a consumer on another, and send a message from the producer to see whether the consumer receives it, to verify that the cluster works.

Create topic: sudo ./bin/kafka-topics.sh --zookeeper server1:2181,server2:2181,server3:2181 --topic test --replication-factor 2 --partitions 5 --create

List topics: sudo ./bin/kafka-topics.sh --zookeeper server1:2181,server2:2181,server3:2181 --list
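To check how the partitions and replicas of the new topic were distributed across the brokers, the --describe option can be used (same ZooKeeper hosts as above):

    sudo ./bin/kafka-topics.sh --zookeeper server1:2181,server2:2181,server3:2181 --describe --topic test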

Create producer: sudo ./bin/kafka-console-producer.sh --broker-list kafkaserver1:9092,kafkaserver2:9092,kafkaserver3:9092 --topic test

Create consumer: sudo ./bin/kafka-console-consumer.sh --zookeeper server1:2181,server2:2181,server3:2181 --from-beginning --topic test
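A quick smoke test might look like this (the message text is just an example):

    # type into the producer console:
    hello kafka
    # the consumer console should then print:
    hello kafka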

By typing messages into the producer console and checking the output in the consumer console, if the messages come through, the simple Kafka cluster is up, and it can then be configured further according to the actual needs of the project.
