Https://www.ibm.com/developerworks/cn/opensource/os-cn-kafka/index.htmlKafka and Flume Many of the functions are really repetitive. Here are some suggestions for evaluating the two systems:
Kafka is a general-purpose system. You can have many producers and consumers to share multiple themes. Conversely, Flume
1. Create a Agent,sink type to be specified as a custom sinkVi/usr/local/flume/conf/agent3.confAgent3.sources=as1Agent3.channels=c1Agent3.sinks=s1Agent3.sources.as1.type=avroagent3.sources.as1.bind=0.0.0.0agent3.sources.as1.port=41414Agent3.sources.as1.channels=c1Agent3.channels.c1.type=memoryAgent3.sinks.s1.type=storm.test.kafka.testkafkasinkAgent3.sinks.s1.channel=c12. Create custom Kafka Sink (custom
Flume: Used to collect logs and transfer logs to KAKFAKafka: As a cache, store logs from FlumeES: As a storage medium, store logsLogstash: True filtering of logsFlume deploymentGet the installation package, unzip1 wget http://10.80.7.177/install_package/apache-flume-1.7.0-bin.tar.gz tar ZXF apache-flume-1.7.0-bin.tar.gz-c/usr/local/Modify the flumen-env.sh scri
1.Jdkthe installationrefer to the installation of the JDK here. 2.installationZookeeperrefer to my The "Fully distributed" section of the Zookeeper installation tutorial. 3.installationKafkarefer to my The "Fully distributed Build" section of the Kafka installation tutorial. 4.installationFlumerefer to my Flume Installation Tutorial. 5.ConfigurationFlume5.1. ConfigurationKafka-s.cfg$ cd/software/
First of all, Flume and Kafka are message systems , but they also have a lot of different places, flume more toward the message acquisition system, and Kafka more toward the message cache system. The difference in "one" designFlume is a message acquisition system, which mainly solves the problem is the multiple collec
Today's meeting to discuss why log processing uses both Flume and Kafka, is it possible to use only Kafka without Flume? The idea was to use only the Flume interface, whether it is the input interface (socket and file) and the output interface (
Deployment Readiness
Configure the Log collection system (FLUME+KAFKA), version:
apache-flume-1.8.0-bin.tar.gz
kafka_2.11-0.10.2.0.tgz
Suppose the Ubuntu system environment is deployed in three working nodes:
192.168.0.2
192.168.0.3
192.168.0.4Flume Configuration Instructions
Suppose Flume's working directory is in/usr/local/
Pre-Preparation
Elk Official Website: https://www.elastic.co/, package download and perfect documentation.
Zookeeper Official website: https://zookeeper.apache.org/
Kafka official website: http://kafka.apache.org/documentation.html, package download and perfect documentation.
Flume Official website: https://flume.apache.org/
Heka Official website: https://hekad.readthedocs.io/en/v0.10.0/
The system is a ce
scribe, Chukwa, Kafka, flume log System comparison1. Background informationMany of the company's platforms generate a large number of logs per day (typically streaming data, such as search engine PV, queries, etc.), processing these logs requires a specific logging system, in general, these systems need to have the following characteristics: (1) Build the bridge of application system and analysis system, an
the high-level interface, which hides the details of the broker, allowing consumer to push data from the broker without having to care about the network topology.
More importantly, for most log systems, the data information that consumer has acquired is saved by the broker, while in Kafka, the data information is maintained by consumer itself.
Cloudera's Flume
1. Background introduction Many of the company's platforms generate a large number of logs per day (typically streaming data, for example, the search engine PV, query, etc.), the processing of these logs requires a specific log system, in general, these systems need to have the following characteristics: (1) The construction of application systems and analysis systems of the bridge, and the correlation between them decoupling (2) support for near real-time online analysis system and off-line ana
Liaoliang Teacher's course: The 2016 big Data spark "mushroom cloud" action spark streaming consumption flume collected Kafka data DIRECTF way job.First, the basic backgroundSpark-streaming get Kafka data in two ways receiver and direct way, this article describes the way of direct. The specific process is this:1, direct mode is directly connected to the
A scheme of log acquisition architecture based on Flume+log4j+kafkaThis article will show you how to use Flume, log4j, Kafka for the specification of log capture.Flume Basic ConceptsFlume is a perfect, powerful log collection tool, about its configuration, on the internet there are many examples and information available, here only to do a simple explanation is n
the service exception.3. Send data输入4. View the data file to view the/tmp/log/flume directory file:Integration with KafkaFlume can be flexibly integrated with Kafka, Flume focuses on data collection, and Kafka focuses on data distribution. The flume can be configured with a
, Memoryrecoverchannel, FileChannel. Memorychannel can achieve high-speed throughput, but cannot guarantee the integrity of the data. Memoryrecoverchannel has been built to replace the official documentation with FileChannel. FileChannel guarantees the integrity and consistency of the data. When configuring FileChannel specifically, it is recommended that the directory and program log files that you set up FileChannel be saved to a different disk for increased efficiency.Sink when setting up sto
Big Data We all know about Hadoop, but not all of Hadoop. How do we build a large database project. For offline processing, Hadoop is still more appropriate, but for real-time and relatively strong, data volume is relatively large, we can use storm, then storm and what technology collocation, in order to do a suitable for their own projects.1. What are the characteristics of a good project architecture?2. How does the project structure ensure the accuracy of the data?3. What is
-round.
3 Implementing the Architecture
A schema implementation architecture is shown in the following figure:
Analysis of 3.1 producer layer
The service assumptions within the PAAs platform are deployed within the Docker container, so in order to meet the non-functional requirements, another process is responsible for collecting logs and therefore does not invade the service framework and processes. Using flume ng for log collection, this open s
First, the demand
Use flume to capture the file information under Linux and pass it into the Kafka cluster.
Environment ready Zookeeper cluster and Kafka cluster are installed well.
Second, the configuration flume
Download Flume website. The blogge
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.