Introduction
Kafka is a distributed, partitioned, replicated messaging system. It provides the functionality of a typical messaging system, but with a design of its own. What does this unique design look like?
Let's first look at a few basic messaging system terms:
• Kafka manages messages in units of topics.
• Programs that publish messages to a topic are called producers.
• Programs that subscribe to topics and consume the published messages are called consumers.
• Kafka runs as a cluster of one or more servers, each of which is called a broker.
This article uses an ongoing example application to demonstrate Kafka's role as a messaging server. The full source code for this example is available on GitHub, and a detailed discussion of it appears in the last section of this document.

Architecture
First, let me introduce the basic concepts of Kafka. Its architecture consists of the following components:
1) Broker
The servers in a Kafka cluster are called brokers. The broker side does not maintain the consumption state of the data, which improves performance. Messages are stored directly on disk, and linear reads and writes make this fast: it avoids duplicating data between the JVM's memory and system memory, and reduces the performance cost of object creation and garbage collection.
2) Producer
Responsible for publishing messages to Kafka brokers.
3) Consumer
The message consumer: the client that reads messages from Kafka brokers.
High throughput: a single broker can handle hundreds of megabytes of data sent by thousands of clients per second.
Scalability: a single cluster can serve as a big data processing hub, centrally handling various kinds of business workloads.
Persistence: messages are persisted on disk (TB-level data can be handled while processing efficiency remains high), with backup and fault-tolerance mechanisms.
Distributed: focused on the big data field with support for distributed processing; a cluster can process millions of messages per second.
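The design points above (the broker keeps an append-only log on disk and does not track consumption state; each consumer advances its own offset) can be sketched in a few lines of Python. All class and method names here are illustrative, not Kafka's actual API:

```python
class Broker:
    """Minimal sketch: a broker stores each topic as an append-only log
    and never tracks who has consumed what."""

    def __init__(self):
        self.logs = {}  # topic name -> list of messages

    def append(self, topic, message):
        self.logs.setdefault(topic, []).append(message)

    def read(self, topic, offset, max_messages=10):
        # Linear read starting at the consumer-supplied offset.
        return self.logs.get(topic, [])[offset:offset + max_messages]


class Consumer:
    """Each consumer remembers its own offset; the broker stays stateless."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self, max_messages=10):
        batch = self.broker.read(self.topic, self.offset, max_messages)
        self.offset += len(batch)  # "commit" the position locally
        return batch


broker = Broker()
for i in range(3):
    broker.append("page-views", f"event-{i}")

c = Consumer(broker, "page-views")
first = c.poll(2)   # ["event-0", "event-1"]
second = c.poll(2)  # ["event-2"]
```

Because the broker never stores per-consumer state, any number of consumers can read the same log independently, each at its own pace.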
I. Install the JDK
Download jdk-8u73-linux-x64.tar.gz and decompress it to /usr/local/jdk.
Then open the /etc/profile file:
[root@localhost ~]# vim /etc/profile
Write the following code into the file.
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_73
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export PATH=$JAVA_HOME/bin:$PATH
Finally, make the changes take effect:
[root@localhost ~]# source /etc/profile
The JDK is now in effect and can be verified with java -version.
II. Install Kafka
1. Download Kafka
"Get a working version first and optimize later", "this requirement is very simple, I'll do it tomorrow"... yet there is never time to sort things out and think: projects are always in a hurry, programmers are always working overtime, and the previous code always depends on the next bug fix... Let's get back to the question.

1. Set up the Kafka environment
There are plenty of tutorial examples on the Internet for building the environment.
Reference site: https://github.com/yahoo/kafka-manager
First, its features:
Managing multiple Kafka clusters
Conveniently check Kafka cluster status (topics, brokers, replica distribution, partition distribution)
Run preferred replica election (select the replica you want as leader)
Generate partition assignments based on the current state of the cluster
Reassign partitions for selected topics
To start the Kafka service:
bin/kafka-server-start.sh config/server.properties
To stop the Kafka service:
bin/kafka-server-stop.sh
Create a topic:
bin/kafka-topics.sh --create --zookeeper hadoop002.local:2181,hadoop001.local:
Building a good, robust real-time data processing system is not something a single article can cover. Before reading this article, I assume you have a basic understanding of the Apache Kafka distributed messaging system and that you can do simple programming with the Spark Streaming API. Next, let's look at how to build a simple real-time data processing system.

About Kafka
Kafka is a distributed, high-throughput, easily scalable publish-subscribe messaging system.
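The core idea of Spark Streaming is the micro-batch model: a continuous stream is cut into small batches, and each batch is processed with ordinary batch logic. As a hedged illustration only (plain Python standing in for the Spark API, with an in-memory list standing in for a Kafka stream), a word count over micro-batches looks like this:

```python
from collections import Counter

def micro_batch_word_count(stream, batch_size=3):
    """Sketch of the micro-batch model: collect records into small
    batches, then process each batch with ordinary batch logic."""
    counts = Counter()
    batch = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            for line in batch:          # process one full micro-batch
                counts.update(line.split())
            batch = []
    for line in batch:                  # flush the final partial batch
        counts.update(line.split())
    return counts

# Each string stands in for one message pulled from a Kafka topic.
stream = ["kafka spark", "kafka streaming", "spark spark"]
totals = micro_batch_word_count(stream, batch_size=2)
```

In a real system, the `stream` would be messages consumed from Kafka and the per-batch loop would be a Spark job over an RDD; the batching structure, however, is the same.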
[TOC]
Kafka Notes (II): Kafka Java API usage
The following test code uses this topic:

$ kafka-topics.sh --describe hadoop --zookeeper uplooking01:2181,uplooking02:2181,uplooking03:2181
Topic:hadoop PartitionCount:3 ReplicationFactor:3 Configs:
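With three partitions, a keyed producer must decide which partition each message goes to. The usual scheme is to hash the key modulo the partition count, so the same key always lands in the same partition. A minimal Python sketch of that idea (the real Java client hashes the serialized key with murmur2; crc32 is used here only to keep the example self-contained):

```python
import zlib

NUM_PARTITIONS = 3  # matches PartitionCount:3 for the topic above

def partition_for(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Keyed partitioning sketch: hash the key, take it modulo the
    partition count. Same key -> same partition, every time."""
    return zlib.crc32(key) % num_partitions

p1 = partition_for(b"user-42")
p2 = partition_for(b"user-42")
# Identical keys map to the same partition, which is what preserves
# per-key message ordering in Kafka.
```

This is why choosing a good key matters: all messages for one key are serialized through one partition, trading parallelism for ordering.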
Kafka principle
Kafka is a messaging system originally developed at LinkedIn as the basis for LinkedIn's activity stream and operational data processing pipeline. It is now used by several companies for many types of data pipelines and messaging systems. Activity stream data is among the most common data that almost all sites use to report on their usage. Activity data includes content such as page views, information about the content being viewed, and search queries.
The full source code for this example is available on GitHub; a detailed discussion of it appears in the last section of this document.

Schema
A topic is a specific type of message stream. A message is a payload of bytes, and the topic is the category or feed name to which messages are published. A producer is any object that can publish messages to a topic. Published messages are saved in a set of servers, which are called brokers.
active data and offline processing systems. Communication between the client and the server is based on a simple, high-performance TCP protocol that is independent of any programming language.

3. Several basic concepts:
Topic: refers to the different types of message sources processed by Kafka.
Partition: a physical grouping of a topic. A topic can be divided into multiple partitions, each of which is an ordered sequence of messages.
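The topic/partition relationship above can be sketched directly: a topic is physically a set of partitions, each an ordered, append-only sequence in which every message is addressed by its offset. The class below is an illustration of the data model, not Kafka's API:

```python
class Topic:
    """Sketch: a topic is a set of partitions; each partition is an
    ordered, append-only log addressed by per-partition offsets."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1  # the message's offset within its partition

t = Topic("hadoop", num_partitions=2)
off_a = t.append(0, "a")   # offset 0 in partition 0
off_b = t.append(0, "b")   # offset 1 in partition 0
off_c = t.append(1, "c")   # offset 0 in partition 1
```

Note that offsets restart from 0 in each partition: ordering is guaranteed within a partition, not across the topic as a whole.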