Overview
Flume: A distributed, reliable, and usable service for efficiently collecting, aggregating, and moving large-scale log data
We build a flume + Spark streaming platform to get data from flume and process it.
There are two ways to do this: Use the push-based method of Flume-style, or use a custo
entire data transfer process, the event is flowing. The transaction guarantee is at the event level. Flume can support multi-level flume agent, support fan-in (fan-in), fan-out (fan-out).Second, the Environment preparation1) Hadoop cluster (landlord version 2.7.3, a total of 6 nodes, can refer to http://www.cnblogs.com/qq503665965/p/6790580.html)2) Flume cluster
http://flume.apache.org/install 1, upload 2, unzip 3, modify the JDK directory in the conf/flume-env.sh file Note: java_opts configuration If we transfer the file too large reported memory overflow need to modify this configuration item 4, Verify that the installation was successful./flume-ng VERSION5, configuring environment variables export Flume_home=/home/apache-flu
The data source used in the previous article is to take data from a socket, a bit belonging to the "Heterodoxy", serious is from the Kafka and other message queue to take the data!The main supported source, learned by the official website are as follows: The form of data acquisition includes push push and pull pullsfirst, spark streaming integration Flume The way of 1.pushMore recommended is the pull method. Introduce dependencies: Dependency
Flume mainly by the following types of monitoring methods:JMX Monitoring
JMX High detonation can modify the JAVA_OPTS environment variables in the flume-env.sh file as follows:
Export java_opts= "-dcom.sun.management.jmxremote-dcom.sun.management.jmxremote.port=5445- Dcom.sun.management.jmxremote.authenticate=false-dcom.sun.management.jmxremote.ssl=false "
Ganglia monitoring
How to do integration, in fact, especially simple, online is actually a tutorial.http://blog.csdn.net/fighting_one_piece/article/details/40667035 look here.I'm using the first integration. When you do, there are a variety of problems. Probably from from 2014.12.17 5 o'clock in the morning to 2014.12.17 night 18 o'clock 30 summed up in fact very simple, but do a long time AH Ah!!! This kind of thing, a fall into your wit. Question 1, need to refer to a variety of packages, these packages to bre
Today's meeting to discuss why log processing uses both Flume and Kafka, is it possible to use only Kafka without Flume? The idea was to use only the Flume interface, whether it is the input interface (socket and file) and the output interface (kafka/hdfs/hbase, etc.).Consider a single scenario, and from a simplified system perspective, it might be better to use
Centos6.5 install flume, centos6.5flume
Flume is installed here because it is used for game Business Log collection and analysis.
1. Install the java environmentRpm-ivh jdk-8u51-linux-x64.rpmPreparing... ######################################## ### [100%]1: jdk1.8.0 _ 51 ##################################### ###### [100%]Unpacking JAR files...Rt. jar...Jsse. jar...Charsets. jar...Tools. jar...Localedata. ja
From: http://flume.apache.org/FlumeUserGuide.html#data-flow-model
Learn flume through translation.Introduction
Apache flume is a distributed, highly reliable, and highly available system. It is mainly used to efficiently collect, aggregate, and move a large amount of log data from various data sources.
The collected data is stored in a centralized manner.
The application scenarios of Apache
Deployment Readiness
Configure the Log collection system (FLUME+KAFKA), version:
apache-flume-1.8.0-bin.tar.gz
kafka_2.11-0.10.2.0.tgz
Suppose the Ubuntu system environment is deployed in three working nodes:
192.168.0.2
192.168.0.3
192.168.0.4Flume Configuration Instructions
Suppose Flume's working directory is in/usr/local/flume,Monitor a log file (such as/tmp
Transferred from: http://www.aboutyun.com/thread-7884-1-1.html
Questions Guide:1. How to implement the Flume end to customize a sink, to follow our rules to save the log.2. To get the value of RootPath from the flume configuration file, how to configure it.Recently you need to use Flume to do the collection of remote logs, so learn some
Chapter 2 wtl wizardry wtl Samples
Wtl wizardry
The original article uses the wtl Appwizard of vc6 and earlier versions, which is a simple introduction and can be easily applied in practice. In addition, the latest wtl8.1 (wtl8.1 released in)) Interface Options are slightly different from this, this part is omitted.
Wtl Samples
To better understand wtl, we provide three examples.ProgramGuidgen, mtp
The function of this class is to split the content in the file by line and insert the content into the column1 and column2 columns respectively. The rowKey is the current time. Flume-
The function of this class is to split the content in the file by line and insert the content into the column1 and column2 columns respectively. The rowKey is the current time. Flume-
This article introduces
Flume, as a real-time log collection system developed by Cloudera, has been recognized and widely used by the industry. The initial release version of Flume is now collectively known as Flume OG (original Generation), which belongs to Cloudera. But with the expansion of the FLume function,
Forwarded from the Mad BlogHttp://www.cnblogs.com/lxf20061900/p/3866252.htmlSpark Streaming is a new real-time computing tool, and it's fast growing. It converts the input stream into a dstream into an rdd, which can be handled using spark. It directly supports a variety of data sources: Kafka, Flume, Twitter, ZeroMQ, TCP sockets, etc., there are functions that can be manipulated:,,, map reduce joinwindow等。This article will connect spark streaming and
Flume Learning application: Write log data to MongoDB and flumemongodb in JavaOverview
Windows: Java writes logs to Flume, and Flume writes the logs to MongoDB. System Environment
Operating System: win7 64
JDK: 1.6.0 _ 43
Download Resources
Maven: 3.3.3Download, install, and get started: 1. Maven-start and 2. Create a simple Maven Project
OverviewApache Flume is a distributed, reliable, and available system. Ability to efficiently collect, summarize and move large amounts of log data from many different sources, one centralized data store.The use of Apache's flume is not limited to log data aggregation. Since the data source is customizable, flume can be used for a large number of events (each row
This article is a self-summary of learning, used for later review. If you have any mistake, don't hesitate to enlighten me.Here are some of the contents of the blog: http://blog.csdn.net/ymh198816/article/details/51998085Flume+kafka+storm+redis Real-time Analysis system basic Architecture1) The architecture of the entire real-time analysis system is2) The Order log is generated by the order server of the e-commerce system first,3) Then use Flume to li
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.