I. Overview
Spring Integration Kafka builds on Apache Kafka and Spring Integration to integrate Kafka into Spring applications, which simplifies development and configuration.
II. Configuration
1. spring-kafka-consumer.xml
2. spring-kafka-producer.xml
3. Message-sending interface Kafkaserv
I. Kafka Introduction
Kafka is a distributed publish-subscribe messaging system. Originally developed by LinkedIn, it is written in Scala and later became part of the Apache project. Kafka is a distributed, partitioned, multi-subscriber, replicated persistent log service. It is mainly used for processing active streaming data.
information is shown in the following paragraph.
 * Required parameter.
 **/
public final BrokerHosts hosts;
/**
 * The name of the topic to read from Kafka.
 * Required parameter.
 **/
public final String topic;
/**
 * Kafka client ID. This parameter generally does not need to be set;
 * the default value is Kafka.api.O
I. Kafka Introduction
Kafka is a distributed publish-subscribe messaging system. Originally developed by LinkedIn, it is written in Scala and later became part of the Apache project. Kafka is a distributed, partitioned, multi-subscriber, replicated persistent log service. It is mainly used for processing active streaming data (real-time computing).
In big data systems, often e
existing applications or data systems. For example, connect to a relational database.
In Kafka, the communication between the client and the server is simple, high-performance, and based on the TCP protocol.
Topics and Logs
Kafka provides a stream of records: the topic.
A topic is a category, and a record is pub
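As a rough illustration of the topic-as-log idea, the following Python sketch models a topic as a set of append-only partition logs held in memory. The class name `TopicLog` and its methods are hypothetical and are not part of any Kafka client; real Kafka persists its logs to disk and replicates them across brokers.

```python
class TopicLog:
    """Hypothetical in-memory model of one topic; not a Kafka API."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        # One append-only list of records per partition.
        self._partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, record):
        """Append a record and return its offset within the partition."""
        log = self._partitions[partition]
        log.append(record)
        return len(log) - 1

    def read(self, partition, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        log = self._partitions[partition]
        return log[offset:offset + max_records]


topic = TopicLog(num_partitions=2)
first = topic.append(0, "page_view:/home")
second = topic.append(0, "page_view:/cart")
other = topic.append(1, "login:alice")
# Reading does not remove records, so any subscriber can replay from offset 0.
replayed = topic.read(0, 0)
```

Because reads only take an offset and never delete anything, several subscribers can consume the same partition independently, each tracking its own position.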
December 2010, written in Scala, with a push/pull architecture that is well suited to transferring data between heterogeneous clusters.
Kafka Features
Persistent messages: no information is lost, providing stable terabyte-scale message storage
High throughput: Kafka is designed to run on commodity hardware, delivering millions of messages per second
Distributed architecture: capable of partitioning messages
Real
configure the server.properties file, changing zookeeper.connect to the IP and port of the standalone cluster
zookeeper.connect=nutch1:2181
(2) Create a topic
> bin/kafka-create-topic.sh --zookeeper localhost:2181 --replica 1 --partition 1 --topic test
> bin/kafka-list-topic.sh
representation as a Kafka cluster, and the architecture diagram above is relatively detailed.
Kafka version: 0.8.0
Kafka download and documentation: http://kafka.apache.org/
Kafka installation:
> tar xzf kafka-
> cd kafka-
> ./sbt update
> ./sbt package
on, analyzes its reliability step by step, and finally uses a benchmark to deepen the understanding of Kafka's high reliability.
2 Kafka Architecture
As shown in the figure above, a typical Kafka architecture consists of several producers (which can be server logs, business data, page views generated by the front end, and so on), a number of br
server.properties
-rw-r--r--. 1 root 3325 Feb 08:37 test-log4j.properties
-rw-r--r--. 1 root 1032 Feb 08:37 tools-log4j.properties
-rw-r--r--. 1 root 1023 Feb 08:37 zookeeper.properties
To modify a configuration file:
# Unique identifier of this machine within the cluster, similar in nature to ZooKeeper's myid
broker.id=0
# Port on which this Kafka broker serves clients; the default is 9092
port=19092
# host.name is disabled by default; 0.8.1 has a bug involving DNS resolution and failure rate
host.name=192.168.7.100
num.n
What's Kafka?
Kafka, originally developed by LinkedIn, is a distributed, partitioned, replicated, multi-subscriber log system coordinated by ZooKeeper (it can also serve as an MQ system). It is commonly used for web/nginx logs, access logs, messaging services, and so on. LinkedIn contributed it to the Apache Foundation in 2010, and it became a top-level open source project.
1. Foreword
A commercial message queu
caching, that is, a cache between active data and offline processing systems. Communication between clients and servers is based on a simple, high-performance, programming-language-independent TCP protocol. Several basic concepts:
Topic: a category of message feeds handled by Kafka.
Partition: a physical grouping within a topic; a
Part 1: Setting up the Kafka environment
Install Kafka
Download: http://kafka.apache.org/downloads.html
tar zxf kafka-
Start Zookeeper
You need to configure config/zookeeper.properties before starting zookeeper:
Next, start zookeeper.
bin/zookeeper-server-start.sh config/zookeeper.properties
Start Kafka Serv
Kafka Learning Road (II): Improving the message sending process
Because Kafka is inherently distributed, a Kafka cluster typically consists of multiple brokers. To balance load, a topic is divided into multiple partitions, and each broker stores one or more partitions. Multiple producers and consumers can produce and get
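The idea of spreading a topic's partitions across brokers can be sketched as below. This is a minimal illustration under stated assumptions: the function name `assign_partitions` is hypothetical, the round-robin policy is chosen only for simplicity, and replicas are ignored, so it should not be read as Kafka's actual assignment logic.

```python
def assign_partitions(num_partitions, broker_ids):
    """Spread partition indexes across brokers round-robin (illustrative only)."""
    if not broker_ids:
        raise ValueError("at least one broker is required")
    assignment = {broker: [] for broker in broker_ids}
    for partition in range(num_partitions):
        # Partition i goes to broker i modulo the number of brokers.
        broker = broker_ids[partition % len(broker_ids)]
        assignment[broker].append(partition)
    return assignment


# Five partitions over three brokers: every broker stores one or more partitions.
layout = assign_partitions(5, [0, 1, 2])
```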
multiple times, and of course many of the details are configurable.
Batch sending: Kafka supports sending messages in batches (message sets) to improve push efficiency.
Relationship between brokers in a Kafka cluster: brokers are not in a master-slave relationship; every broker has equal standing in the cluster, and broker nodes can be added or removed at will.
Partitioning mechanism: Kafka brokers support message partitioning, and the producer can decide which partition to send the m
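The point that the producer decides which partition receives a message can be sketched as follows. This is a simplified model, not Kafka's own partitioner: the function name `choose_partition` is hypothetical, and CRC32 merely stands in for whatever hash a real client uses. Keyed messages map to a stable partition, while keyless messages are rotated across partitions.

```python
import itertools
import zlib

# Shared counter used to rotate keyless messages across partitions.
_round_robin = itertools.count()


def choose_partition(key, num_partitions):
    """Pick a partition index for a message (illustrative only)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    if key is None:
        # No key: rotate through the partitions to spread load.
        return next(_round_robin) % num_partitions
    # Keyed: a deterministic hash keeps one key on one partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


same_a = choose_partition("user-42", 4)
same_b = choose_partition("user-42", 4)
keyless = [choose_partition(None, 3) for _ in range(6)]
```

Keeping each key on one partition preserves the relative order of that key's messages, since ordering is maintained within a single partition's log.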
differences between Directstream and stream are described in more detail below. We create a Kafkasparkdemomain class; the code follows, with detailed comments, so no further explanation is given here:
package com.winwill.spark

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.dstream.{DStream, InputDStream}
import org.apache.spark.streaming.{Durat