combined freely: how the components are combined is determined entirely by the user's configuration file, which makes the arrangement very flexible. For example, a channel can hold events in memory or persist them to the local disk, and a sink can write logs to HDFS, HBase, or even another agent's source. Flume also lets users build multi-level flows, meaning multiple agents can work together, with support for fan-in, fan-out, contextual routing, and backup routes, which is a powerful feature.
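As a minimal sketch of fan-out (the hostnames, HDFS path, and port are assumptions for illustration), the configuration below uses the default replicating channel selector to copy every event from one source into two channels, one drained by an HDFS sink and the other forwarded to a downstream agent over Avro:

a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44445
# the replicating selector (the default) copies each event into both channels
a1.sources.r1.selector.type = replicating
a1.sources.r1.channels = c1 c2

a1.channels.c1.type = memory
a1.channels.c2.type = memory

# k1 writes to HDFS, k2 forwards to the next agent over Avro
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events
a1.sinks.k1.channel = c1
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = next-agent.example.com
a1.sinks.k2.port = 4545
a1.sinks.k2.channel = c2

The next fragment is a fuller single-agent example, using a netcat source with interceptors and a logger sink: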
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.interceptors = i1 i2 i3
a1.sources.r1.interceptors.i1.type = timestamp
a1.sources.r1.interceptors.i2.type = host
a1.sources.r1.interceptors.i3.type = static
a1.sources.r1.interceptors.i3.key = datacenter
a1.sources.r1.interceptors.i3.value = new_york

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
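To try this configuration, save it to a file (example.conf is an assumed name here), start the agent, and then send a line with telnet; both commands follow standard flume-ng usage:

$ bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
$ telnet localhost 44444

Each line typed into the telnet session should be printed by the logger sink, carrying the timestamp, host, and static datacenter headers added by the interceptors. When the agent starts, flume-ng also reports how it assembled its classpath,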
like this:

Info: Sourcing environment configuration script /etc/flume-ng/conf/flume-env.sh
Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /usr/lib/
For the agent, the data-flow configuration specifies where to collect data from and which collector to send it to.
For the collector, it specifies receiving the data sent by agents and forwarding it to the designated target machine.
Note: The Flume framework's dependency on Hadoop and ZooKeeper exists only at the JAR level; the Hadoop and ZooKeeper services do not need to be running when Flume is started.
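As a minimal sketch of this two-tier layout (the agent names, hostname, and port are assumptions for illustration, and the channel wiring is omitted), the agent forwards events through an Avro sink and the collector receives them on a matching Avro source:

# on the agent machine (agent name: agent)
agent.sinks.k1.type = avro
agent.sinks.k1.hostname = collector.example.com
agent.sinks.k1.port = 4545

# on the collector machine (agent name: collector)
collector.sources.r1.type = avro
collector.sources.r1.bind = 0.0.0.0
collector.sources.r1.port = 4545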
III. Flume distributed environment deployment 1. Experiment
Is Flume a good fit for your problem? If you need to ingest textual log data into Hadoop/HDFS, then Flume is the right fit for your problem, full stop. For other use cases, here are some guidelines: Flume is designed to transport and ingest regularly-generated event data over relatively stable, potentially complex topologies. The notion of "event data" is very broadly defined.
1. The source is HTTP and the sink is logger, so the data is printed to the console. The conf configuration file is as follows:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = http            # receive data sent over HTTP
a1.sources.r1.bind = hadoop-master   # the host or IP address running Flume
a1.sources.r1.port = 9000            # port
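By default, Flume's HTTP source uses a JSON handler that accepts a JSON array of events, each with optional headers and a body, so one way to test this configuration (assuming the agent is reachable at hadoop-master:9000) is:

$ curl -X POST -H 'Content-Type: application/json' \
      -d '[{"headers": {"host": "web01"}, "body": "hello flume"}]' \
      http://hadoop-master:9000

With the logger sink above, the event body should then appear in the agent's console output.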
Introduction to IBM BigInsights Flume
Flume is an open-source, high-volume log collection system that supports real-time log collection. The initial version of Flume was Flume OG (Original Generation), developed by Cloudera and known as Cloudera Flume.
can tolerate the loss of configuration information in the event of machine failure, so the operational stability of OG depended on ZooKeeper. 3. The NG version demands far less of the user: apart from Java, installation requires neither complex Flume-related configuration nor a ZooKeeper cluster, so the installation effort is close to zero. Some people are puzzled about where this ZooKeeper dependency suddenly came from.
Use a browser to access the web page, then view the resulting log after multiple visits:

$ tail -f /var/log/httpd/access_log

(Open 192.168.122.200 in the browser; substitute the IP from your own configuration.)
STEP 6: Copy the Hadoop JARs that Flume depends on into Flume's own lib directory:
cp /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/share/
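Those Hadoop JARs are what allow an HDFS sink to run inside Flume. A minimal sketch of such a sink, assuming a NameNode at hadoop-master:8020 and a made-up target path:

# hypothetical HDFS sink; adjust the NameNode address and path
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://hadoop-master:8020/flume/events/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream       # write plain text instead of SequenceFiles
a1.sinks.k1.hdfs.useLocalTimeStamp = true    # lets %Y-%m-%d resolve without a timestamp header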
). It allows applications to interact directly with existing sources such as AvroSource and SyslogTcpSource. If the built-in sources do not meet your needs, Flume also supports custom sources. Source types:
3.3. Channel
The channel is the component that connects a source to a sink. It can be viewed as a data buffer (a data queue) that stages events in memory or persists them to local disk until they are handed to the sink. Two of the more commonly used channels are the memory channel and the file channel.
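As a minimal sketch of those two channel types (the capacity value and directories are assumptions for illustration):

# memory channel: fast, but events are lost if the agent process dies
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# file channel: slower, but events survive an agent restart
a1.channels.c2.type = file
a1.channels.c2.checkpointDir = /var/flume/checkpoint
a1.channels.c2.dataDirs = /var/flume/data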
guarantees the reliability and security of the data transmission.
III. Installing Hadoop and Flume
My experiment was performed on HDP 2.5.0; Flume is included in the HDP installation as long as the Flume service is configured. For HDP installation steps, see "HAWQ Technical Analysis (II): Installation and Deployment".
Collecting user behavior data is undoubtedly a prerequisite for building a recommendation system, and the Flume project under the Apache Foundation is tailored for distributed log collection. This is the first of these Flume research notes, and it mainly introduces Flume's basic architecture; the next note will illustrate the deployment and use of Flume with an example.