If you are reading about Kafka for the first time, start with the introductory article on the distributed messaging system Kafka.
Some people have asked about the difference between Kafka and a general-purpose MQ, which is difficult to answer in a sentence. I think it is better to analyze Kafka's implementation principles, based on the design described on the official website.
Stop the Kafka service:
kafka_2.12-0.10.2.1> bin/kafka-server-stop.sh
kafka_2.12-0.10.2.1> bin/zookeeper-server-stop.sh
Step 1: Download Kafka. Download the latest version and unzip it:
> tar -xzf kafka_2.12-0.10.2.1.tgz
> cd kafka_2.12-0.10.2.1
Step 2: Start the services. Kafka uses ZooKeeper, so start the ZooKeeper server first.
each disk's sequential read/write characteristics.
For the concrete configuration, list directories on different disks in the broker's log.dirs, for example:
log.dirs=/disk1/kafka-logs,/disk2/kafka-logs,/disk3/kafka-logs
When a new partition is created, Kafka places it in the directory that currently holds the fewest partitions, so it is generally not
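The placement rule above can be sketched as follows. This is a simplified model for illustration only (Kafka's actual logic lives in the broker, and the directory names here are just examples): a new partition goes to the log.dirs entry that currently holds the fewest partitions.

```python
# Simplified model of Kafka's new-partition placement across log.dirs.
# Directory names and counts are hypothetical illustration data.
log_dirs = {
    "/disk1/kafka-logs": 3,   # directory -> current partition count
    "/disk2/kafka-logs": 1,
    "/disk3/kafka-logs": 2,
}

def pick_dir_for_new_partition(dirs):
    """Choose the log directory that currently holds the fewest partitions."""
    return min(dirs, key=dirs.get)

target = pick_dir_for_new_partition(log_dirs)   # "/disk2/kafka-logs"
log_dirs[target] += 1                           # the new partition lands there
```

Because each directory sits on a different physical disk, this rule spreads partitions (and therefore sequential I/O) across disks.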
through Kafka servers and consumer clusters.
Supports Hadoop parallel data loading.
Key Features
Publish and subscribe to streams of messages, similar to a message queue; this is why Kafka is categorized as a message queuing framework
Record streams of messages in a fault-tolerant manner; Kafka stores message streams as files
Can be processed as the streams occur
queue is full, the data (messages) is discarded and a QueueFullException is thrown. For a producer in blocking mode, if the internal queue is full it will wait, which effectively throttles the speed of the internal consumer. You can enable the producer's trace logging to check the remaining capacity of the internal queue at any time. If the producer's internal queue remains full for a long time, it means that, for mirror-maker, pushing the messages back to the target
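The two queueing behaviors described above can be illustrated with a plain Python queue (this is not the Kafka client, just a stand-in for its internal producer queue): a non-blocking enqueue drops the message and raises when the queue is full, while a blocking enqueue waits for a consumer to free a slot.

```python
# Illustration of async (drop + exception) vs. blocking producer queues.
import queue

buf = queue.Queue(maxsize=2)      # stand-in for the producer's internal queue
buf.put_nowait("m1")
buf.put_nowait("m2")

dropped = False
try:
    buf.put_nowait("m3")          # async mode: queue full -> exception, message lost
except queue.Full:
    dropped = True

# Blocking mode: put() waits until a consumer drains a slot.
buf.get()                         # consumer frees one slot
buf.put("m3", timeout=1.0)        # now succeeds instead of raising
```

A slow consumer therefore back-pressures a blocking producer, which is exactly the throttling effect described above.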
design constraint is throughput, not functionality.
State information about which data has been consumed is kept by the data consumer (consumer) rather than stored on the server.
Kafka is an explicitly distributed system: it assumes that data producers (producers), brokers, and data consumers (consumers) are spread across multiple machines.
Architecture:
Kafka is actua
published, the Kafka client constructs a message and adds it to a message set (Kafka supports bulk publishing: multiple messages can be added to the collection and published in a single request), and the client must specify the topic the message belongs to when it is sent. When subscribing to messages, the Kafka client needs to
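The client-side flow described above can be sketched as follows. The class and method names here are hypothetical (this is not the real Kafka client API): messages accumulate into an in-memory message set and are published as one batch, with the topic specified at send time.

```python
# Sketch of client-side batching: accumulate a message set, publish in one request.
class MessageSetProducer:
    def __init__(self):
        self._batch = []

    def add(self, key, value):
        """Accumulate a message into the in-memory message set."""
        self._batch.append((key, value))

    def send(self, topic):
        """Publish the whole set to `topic` in one request and clear the batch."""
        batch, self._batch = self._batch, []
        return {"topic": topic, "messages": batch}   # stand-in for a network call

p = MessageSetProducer()
p.add("k1", "v1")
p.add("k2", "v2")
request = p.send("page-views")    # topic is specified only when sending
```

Batching like this amortizes per-request overhead, which is one of the reasons Kafka achieves high publish throughput.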
referenced. Prior to this, for a virtualized Kafka you would first need to execute the following command to enter the container:
kubectl exec -it [Kafka's pod name] /bin/bash
After entering the container, the Kafka commands are stored in the /opt/kafka/bin directory; enter it with the cd command:
cd /opt/kafka/bin
The following
Kafka provides two sets of consumer APIs:
The high-level Consumer API
The SimpleConsumer API
The first is a highly abstracted consumer API that is simple and convenient to use, but for some special needs we may want the second, lower-level API. So let's start by describing what the second API can help us do:
Read the same message multiple times
Consume only a subset of the messages in a partition within a process
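Both capabilities follow from the same property: the low-level API lets the consumer choose the fetch offset itself. The following is a minimal model (not the SimpleConsumer API, just an illustration of the idea) showing how explicit offsets enable re-reads and partial consumption of a partition.

```python
# A partition is an append-only log; consumers fetch from explicit offsets.
partition_log = ["m0", "m1", "m2", "m3", "m4"]   # toy partition contents

def fetch(log, offset, max_messages):
    """Fetch up to max_messages starting at an explicit offset."""
    return log[offset:offset + max_messages]

first_read  = fetch(partition_log, 1, 2)   # read m1, m2
second_read = fetch(partition_log, 1, 2)   # same offset again: re-read m1, m2
subset_only = fetch(partition_log, 3, 10)  # consume only the tail of the partition
```

With the high-level API the offset is advanced for you, so neither re-reading nor reading only a sub-range is directly possible.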
mapreduce jobs built into it, which are used to fetch data and convert it into structured logs stored in the data store (a database, HDFS, etc.).
4. LinkedIn's Kafka
Kafka was open-sourced in December 2010 and is written in Scala; it employs a variety of efficiency optimizations, its overall architecture is relatively novel (push/pull), and it is more suitable for heterogeneous clusters.
Desi
Kafka 0.9 made large adjustments to the Java client API. This article mainly summarizes cluster construction, high availability, and the processes and details of the new API in Kafka 0.9, as well as the various pitfalls I stepped into during installation and debugging. Regarding Kafka's structure, functions, characteristics, and application scenarios,
Design goals:
(1) The cost of data access on disk is O(1).
(2) High throughput: hundreds of thousands of messages per sec
1. File System Description
File systems are generally divided into two types: system-level and user-level. System-level file systems include ext3, ext4, DFS, NTFS, etc. I will not introduce the complicated distributed or system-level file systems here,
This article analyzes the architecture design of the Kafka file system in depth, from the perspective of what makes the Kafka architecture high-performance.
2.
1. Start the ZooKeeper server:
./zookeeper-server-start.sh /opt/cx/kafka_2.11-0.9.0.1/config/zookeeper.properties
2. Modify the broker-1 and broker-2 configuration:
broker.id=1
listeners=PLAINTEXT://:9093  # the port the socket server listens on
port=9093
log.dirs=/opt/cx/kafka/kafka-logs-1
broker.id=2
listeners=PLAINTEXT://:9094  # th
Part One: Building the Kafka Environment
Install Kafka
Download: http://kafka.apache.org/downloads.html
tar zxf kafka-
Start Zookeeper
You need to configure config/zookeeper.properties before starting zookeeper:
Next, start zookeeper.
bin/zookeeper-server-start.sh config/zookeeper.properties
Start Kafka Serv
following configuration is added:
tickTime=2000
dataDir=/usr/software/zookeeper/data
clientPort=2181
initLimit=5
syncLimit=2
b) Start the Zookeeper server
JMX enabled by default
Using config: /usr/software/zookeeper/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
3. Start Kafka
You can start the server by giving the following command-
$ ./bin/kafka-server-start.sh config/server.pr
Kafka is a messaging component in a distributed environment; if the Kafka application process is killed or the Kafka machine goes down, the Kafka messaging component cannot be used.
Kafka Cluster (cluster)
If one machine is not enough, add a few more. First of all, start ZooKeeper
data from the pagecache kernel cache to the NIC buffer? The sendfile system call does this. Obviously, this greatly improves the efficiency of data transmission. In Java, the corresponding call is
FileChannel.transferTo
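The same zero-copy idea is available in Python via socket.sendfile(), which uses the os.sendfile system call where the platform supports it (and falls back to ordinary send() otherwise). The sketch below, with an illustrative payload, moves a file's bytes to a socket without copying them through a user-space buffer, analogous to FileChannel.transferTo:

```python
# Zero-copy file-to-socket transfer, analogous to Java's FileChannel.transferTo.
import os
import socket
import tempfile

def zero_copy_send(path, sock):
    """Send the file at `path` over `sock` via sendfile; returns bytes sent."""
    with open(path, "rb") as f:
        return sock.sendfile(f)   # kernel-side copy when the OS supports it

# Demo over a local socket pair with a small illustrative payload.
payload = b"kafka log segment bytes" * 10
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

tx, rx = socket.socketpair()
sent = zero_copy_send(path, tx)
tx.close()                                           # signal EOF to the reader
received = b"".join(iter(lambda: rx.recv(4096), b""))
rx.close()
os.unlink(path)
```

This is exactly the path Kafka exploits when serving consumers: log segments flow from the page cache to the NIC without entering the application's address space.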
In addition, Kafka further improves throughput by compressing sets of messages and transmitting and storing multiple entries together. The consumption state is maintained by the consumer.
The consumption status of
• Components that provide services can be added, deleted, and changed; this should also be supported at runtime.
The client that accesses a service should not care about the service's implementation details. Solution:
Introduce a broker component to decouple the client and the server. The server registers itself with the broker and exposes its interface through it, allowing the client to access the service via the broker. Th
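The decoupling described above can be shown with a toy broker (illustrative only, not Kafka code, and all names here are hypothetical): servers register services with the broker at runtime, and clients call services by name without knowing which server implements them.

```python
# Toy illustration of the broker pattern: registration, lookup, decoupled calls.
class Broker:
    def __init__(self):
        self._services = {}          # service name -> handler callable

    def register(self, name, handler):
        """A server registers a service under a name; can happen at runtime."""
        self._services[name] = handler

    def unregister(self, name):
        """Services can also be removed while the system is running."""
        self._services.pop(name, None)

    def call(self, name, *args):
        """Clients go through the broker; they never see the server object."""
        return self._services[name](*args)

broker = Broker()
broker.register("echo", lambda msg: msg.upper())   # server side
result = broker.call("echo", "hello")              # client side, fully decoupled
```

Because clients only hold a name, servers can be replaced or moved without any client change, which is the property the text is after.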