Original statement: this article is original to the author and was first published on the InfoQ Chinese site with authorization. Any reproduction must state at the top that it comes from "Jason's Blog" and include the original link: http://www.jasongj.com/2015/06/08/KafkaColumn3/
Kafka is an explicitly distributed system: it assumes that data producers, brokers, and consumers are spread across multiple machines. By contrast, traditional message queues do not support this well (for example, a very large backlog of unprocessed data cannot be persisted effectively). Kafka provides two guarantees for data availability: (1) messages sent by the producer to
Now that Kafka is ready and running, you can create a topic to store messages. We can also generate or use data from Java/Scala code or directly from the command line.
Now create a topic named "test". Open a new command line in f:\kafka_2.11-0.9.0.1\bin\windows, enter the following command, and press Enter:
kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
You can confirm the topic exists with kafka-topics.bat --list --zookeeper localhost:2181, which should print "test". In addition to creating topics manually, you can also configure the broker to create topics automatically.

Step 4: Send a message. Kafka ships with a simple command-line producer that reads messages from a file or from standard input and sends them to the server; by default, each line is sent as one message. Run the producer and type some messages into the console to send them.
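The console producer's one-message-per-line behavior can be sketched as a toy model (illustration only: `produce_lines` is a hypothetical helper, and the real tool sends each line over the network to a broker rather than appending to an in-memory list):

```python
# Toy model of the console producer: each non-empty input line becomes
# one message appended to an in-memory "topic log". Illustration only.

def produce_lines(topic_log, lines):
    for line in lines:
        msg = line.rstrip("\n")
        if msg:                      # blank lines are skipped
            topic_log.append(msg)
    return topic_log

log = produce_lines([], ["This is a message\n", "This is another message\n"])
print(log)  # ['This is a message', 'This is another message']
```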
Partition: a partition is a physical concept. To improve system throughput, each topic is physically divided into one or more partitions; each partition corresponds to a folder that stores that partition's message data and index files.
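How a keyed message maps to one of a topic's partitions can be sketched as follows (a simplified illustration: `pick_partition` is a hypothetical name, and real Kafka clients use their own partitioner, for example murmur2-based hashing, with round-robin for unkeyed messages):

```python
import zlib

def pick_partition(key: bytes, num_partitions: int) -> int:
    # Stable hash of the key modulo the partition count: messages with
    # the same key always land in the same partition, which preserves
    # their relative order within that partition.
    return zlib.crc32(key) % num_partitions

p = pick_partition(b"user-42", 3)
print(p)  # always the same value in 0..2 for this key
```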
Producer: the message producer, responsible for producing messages and sending them to the Kafka broker.
Background: today's application systems (business, social networking, search, browsing, and so on) constantly produce information, like information factories. In the big data era, we face the following challenges:
How to collect this huge amount of information
How to analyze it
How to implement the above two points in a timely manner
These challenges form a business demand model: producers produce (produce) information, and consumers consume (consume) it.
Kafka Learning (1): Configuration and Simple Command Usage

1. Introduction to related concepts in Kafka
Kafka is a distributed message middleware implemented in Scala. The related concepts are as follows:
Message: the content transmitted in Kafka.
From: http://doc.okbase.net/QING____/archive/19447.html. Also refer to: http://blog.csdn.net/21aspnet/article/details/19325373 and http://blog.csdn.net/unix21/article/details/18990123. Kafka serves well as a distributed log-collection or system-monitoring service, and it should be used where it fits. Deploying Kafka involves a Zookeeper environment, a Kafka environment, and some configuration o
High throughput: roughly 250,000 messages (50 MB) can be produced per second, and 550,000 messages (110 MB) processed per second.
Supports persistence: messages persisted to disk can be consumed in batches (for example by ETL jobs) as well as by real-time applications. Data is persisted to disk and replicated to prevent loss.
A distributed system that is easy to scale out: producers, brokers, and consumers are all distributed, and machines can be added without downtime.
The state of a message is maintained on the consumer side.
Summary: building on the previous article, this paper explains Kafka's HA mechanism in detail, covering HA-related scenarios such as broker failover, controller failover, topic creation and deletion, broker startup, and the detailed process by which a follower fetches data from the leader. It also introduces the replication-related tools Kafka provides, such as partition reassignment. Broker failover process cont
Download jdk-8u73-linux-x64.tar.gz and decompress it to /usr/local/jdk.
Open the/etc/profile file.
[root@localhost ~]# vim /etc/profile
Write the following code into the file.
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_73
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export PATH=$JAVA_HOME/bin:$PATH
Finally, make the changes take effect:
[root@localhost ~]# source /etc/profile
The JDK now takes effect. You can verify with java -version.
II. Install Kafka
1. Download
Building a good, robust real-time data processing system cannot be fully explained in one article. Before reading on, it is assumed that you have a basic understanding of the Apache Kafka distributed messaging system and can do simple programming with the Spark Streaming API. Next, let's look at how to build a simple real-time data processing system.

About Kafka: Kafka is a distributed, high-throughput, easy-to-scale, topic-based publish/subscribe messaging system.
> bin/kafka-server-start.sh config/server.properties

3. Create Topic
Create a topic named "test" with a single partition and a single replica:
> bin/kafka-create-topic.sh --zookeeper nutch1:2181 --replica 1 --partition 1 --topic test

Run the list-topic command to see the topic listing:
> bin/kafka-list-topic.sh --zookeeper nutch1:2181

4. Send a message
multiple times, and of course many of the details are configurable.

Bulk send: Kafka supports sending messages in batches (message sets) to improve push efficiency.

Relationship between brokers in a cluster: there is no master-slave relationship; every broker has equal status in the cluster, and any broker node can be added or removed at will.

Partition mechanism: Kafka partitions messages on the broker side, and the producer can decide which partition a message is sent to. The broker side does not maintain the consumption state of the data (consumers track their own offsets), which improves performance. Direct disk storage with linear reads and writes is fast: it avoids copying data back and forth between JVM memory and system memory, and it reduces the cost of object creation and garbage collection.
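The point that the broker keeps no per-consumer state can be sketched as a toy model (hypothetical names; real Kafka consumers commit their offsets to Kafka itself or to an external store):

```python
# Toy model: the "broker" is just an append-only list and knows nothing
# about its consumers; each consumer keeps its own read offset.
class ToyConsumer:
    def __init__(self):
        self.offset = 0  # consumption state lives on the consumer side

    def poll(self, log):
        msgs = log[self.offset:]
        self.offset = len(log)
        return msgs

log = ["m1", "m2", "m3"]            # broker-side log, no consumer state
a, b = ToyConsumer(), ToyConsumer()
print(a.poll(log))                  # ['m1', 'm2', 'm3']
log.append("m4")
print(a.poll(log))                  # ['m4'] -- a resumes from its own offset
print(b.poll(log))                  # ['m1', 'm2', 'm3', 'm4'] -- independent
```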
2) Producer
Responsible for publishing messages to the Kafka broker.
3) Consumer
The message consumer, the client that reads messages from the Kafka broker.
Low-overhead storage.
Kafka has a very simple storage layout.
1. Each partition of a topic corresponds to a logical log.
Physically, a log is implemented as a set of segment files of approximately the same size (e.g., 1 GB).
Every time a producer publishes a message to a partition, the broker simply appends the message to the last segment file.
A message is only exposed to the consumers after it is flushed.
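The layout described above (segment files, appends to the last segment, visibility only after flush) can be simulated in a few lines. This is a toy model: `ToyPartitionLog` and `SEGMENT_SIZE` are hypothetical names, segments here are counted in messages rather than bytes, and real segment size is a broker configuration (e.g., log.segment.bytes):

```python
class ToyPartitionLog:
    """Toy model of one partition: a list of fixed-size segment files.
    Messages are appended to the last segment; consumers only see
    messages up to the flushed watermark."""
    SEGMENT_SIZE = 3  # messages per segment (real segments are ~1 GB of bytes)

    def __init__(self):
        self.segments = [[]]
        self.flushed = 0  # count of messages made visible to consumers

    def append(self, msg):
        if len(self.segments[-1]) >= self.SEGMENT_SIZE:
            self.segments.append([])    # roll a new segment file
        self.segments[-1].append(msg)

    def flush(self):
        self.flushed = sum(len(s) for s in self.segments)

    def visible(self):
        all_msgs = [m for s in self.segments for m in s]
        return all_msgs[: self.flushed]

log = ToyPartitionLog()
for i in range(4):
    log.append(f"m{i}")
print(len(log.segments))   # 2 -- the fourth message rolled a new segment
print(log.visible())       # [] -- nothing flushed yet
log.flush()
print(log.visible())       # ['m0', 'm1', 'm2', 'm3']
```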
2
data and converts it into a structured log, which is stored in a data store (such as a database or HDFS).
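Turning raw data into a structured log record can be sketched like this (a hypothetical parser for illustration; a real pipeline would follow whatever schema the downstream database or HDFS dataset expects):

```python
import json

def to_structured(raw_line: str) -> dict:
    # Hypothetical example: turn a raw line such as
    # "2015-06-08 INFO login user=42" into a structured record.
    ts, level, event, kv = raw_line.split(" ", 3)
    key, value = kv.split("=", 1)
    return {"timestamp": ts, "level": level, "event": event, key: value}

rec = to_structured("2015-06-08 INFO login user=42")
print(json.dumps(rec))  # ready to write to a database or HDFS
```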
4. LinkedIn's Kafka
Kafka was open-sourced in December 2010. It is written in Scala, uses a variety of efficiency optimizations, and has a relatively novel overall architecture (push/pull), making it well suited to heterogeneous clusters.
Design objectives:
(1) The access cost of data on disk is O(1).