Flume: collecting logs and writing to HDFS



First, install Flume:

It is recommended to install Flume under the same user that runs Hadoop. In this walkthrough, Flume is installed as the hadoop user.

Installation steps: http://douya.blog.51cto.com/6173221/1860390
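For reference, a minimal install sketch run as the hadoop user (the release version and target paths here are my assumptions, chosen to match the layout used later in this post; the link above has the full steps):

su - hadoop
cd /usr/local/elk
# Unpack a Flume binary release downloaded from the Apache archive:
tar -zxvf apache-flume-1.6.0-bin.tar.gz
mv apache-flume-1.6.0-bin apache-flume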


Now the configuration:

1. Write the configuration file:

vim flume_hdfs.conf

# Define a memory channel called ch1 on agent1
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000
agent1.channels.ch1.transactionCapacity = 100
#agent1.channels.ch1.keep-alive = 30

# Define a source that monitors a file
agent1.sources.avro-source1.type = exec
agent1.sources.avro-source1.shell = /bin/bash -c
agent1.sources.avro-source1.command = tail -n +0 -f /root/logs/appcenter.log
agent1.sources.avro-source1.channels = ch1
agent1.sources.avro-source1.threads = 5

agent1.sources.avro-source1.interceptors = i1
agent1.sources.avro-source1.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

# Define an HDFS sink and connect it to the other end of the same channel.
agent1.sinks.log-sink1.channel = ch1
agent1.sinks.log-sink1.type = hdfs
agent1.sinks.log-sink1.hdfs.path = hdfs://ns1/flume/%y%m%d
agent1.sinks.log-sink1.hdfs.filePrefix = events-
agent1.sinks.log-sink1.hdfs.fileType = DataStream
agent1.sinks.log-sink1.hdfs.rollInterval = 60
agent1.sinks.log-sink1.hdfs.rollSize = 134217728
agent1.sinks.log-sink1.hdfs.rollCount = 0
#agent1.sinks.log-sink1.hdfs.batchSize = 100000
#agent1.sinks.log-sink1.hdfs.txnEventMax = 100000
#agent1.sinks.log-sink1.hdfs.callTimeout = 60000
#agent1.sinks.log-sink1.hdfs.appendTimeout = 60000

# Finally, now that we've defined all of the components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1
agent1.sources = avro-source1
agent1.sinks = log-sink1
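Note: the %y%m%d escapes in hdfs.path are resolved from each event's timestamp header, which is why the timestamp interceptor is attached to the source above. As an alternative (my assumption, not part of the original configuration), the HDFS sink can stamp events itself with local time:

# Hypothetical alternative to the timestamp interceptor:
agent1.sinks.log-sink1.hdfs.useLocalTimeStamp = true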



2. The prerequisite is that the Hadoop cluster is started and healthy; a quick check is sketched below.
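A quick sanity check from any cluster node, using the standard HDFS CLI (these commands are my addition, not from the original post):

# Confirm the cluster reports live DataNodes:
hdfs dfsadmin -report
# Optionally pre-create the sink's target directory:
hdfs dfs -mkdir -p /flume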


3. Start Flume and begin collecting logs.

Start:

flume-ng agent -c /usr/local/elk/apache-flume/conf/ -f /usr/local/elk/apache-flume/conf/flume_hdfs.conf -Dflume.root.logger=INFO,console -n agent1

Error one:

2016-12-06 11:24:49,036 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:234)] Creating hdfs://ns1/flume//FlumeData.1480994688831.tmp

2016-12-06 11:24:49,190 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:459)] process failed

java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName
    at org.apache.hadoop.security.UserGroupInformation.getOSLoginModuleName(UserGroupInformation.java:366)
    at org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:411)
    at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2828)
    at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2818)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2684)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:243)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:235)
    at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:679)
    at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
    at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:676)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... more

Solution:

1. cd /usr/local/hadoop/hadoop/share/hadoop/common/ and take hadoop-common-2.7.3.jar

2. cd /usr/local/hadoop/hadoop/share/hadoop/common/lib and take hadoop-auth-2.7.3.jar

Copy these jars into the Flume client's lib directory. Further missing-class errors will keep appearing one jar at a time, so the lazy approach is to copy every jar under that lib directory to the Flume side's lib.

Reason: Flume writes data into HDFS, so it depends on a number of Hadoop/HDFS client jars.
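Both fixes as a shell sketch, assuming the Hadoop and Flume paths used throughout this post:

# Targeted fix: copy just the jars the stack trace complains about.
cp /usr/local/hadoop/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar /usr/local/elk/apache-flume/lib/
cp /usr/local/hadoop/hadoop/share/hadoop/common/lib/hadoop-auth-2.7.3.jar /usr/local/elk/apache-flume/lib/

# Lazy fix: copy every dependency jar under lib in one go.
cp /usr/local/hadoop/hadoop/share/hadoop/common/lib/*.jar /usr/local/elk/apache-flume/lib/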




Start again:

flume-ng agent -c /usr/local/elk/apache-flume/conf/ -f /usr/local/elk/apache-flume/conf/flume_hdfs.conf -Dflume.root.logger=INFO,console -n agent1

Error two:

2016-12-06 11:36:52,791 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:234)] Creating hdfs://ns1/flume//FlumeData.1480995177465.tmp

2016-12-06 11:36:52,793 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:455)] HDFS IO error

java.io.IOException: No FileSystem for scheme: hdfs
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:243)
    at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:235)
    at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:679)
    at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
    at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:676)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

2016-12-06 11:36:57,798 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:234)] Creating hdfs://ns1/flume//FlumeData.1480995177466.tmp

2016-12-06 11:36:57,799 (SinkRunner-PollingRunner-DefaultSinkProcessor) [WARN - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:455)] HDFS IO error

java.io.IOException: No FileSystem for scheme: hdfs


Analysis: at this point the Flume side cannot resolve ns1, the HDFS nameservice defined on the Hadoop cluster.


Workaround:

cd /usr/local/hadoop/hadoop/etc/hadoop

Copy core-site.xml and hdfs-site.xml into the conf directory on the Flume side, and bind the Hadoop machines' hostnames on the Flume machine:

vim /etc/hosts

172.16.9.250 HADOOP1
172.16.9.252 HADOOP2
172.16.9.253 HADOOP3
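The same workaround as a shell sketch (the Flume conf path is inferred from the start command above):

# Give Flume the client-side view of the ns1 nameservice:
cp /usr/local/hadoop/hadoop/etc/hadoop/core-site.xml \
   /usr/local/hadoop/hadoop/etc/hadoop/hdfs-site.xml \
   /usr/local/elk/apache-flume/conf/

# Let the Flume machine resolve the NameNode hostnames behind ns1:
cat >> /etc/hosts <<'EOF'
172.16.9.250 HADOOP1
172.16.9.252 HADOOP2
172.16.9.253 HADOOP3
EOF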


One more jar is needed at this point: hadoop-hdfs-2.7.3.jar

https://www.versioneye.com/java/org.apache.hadoop:hadoop-hdfs/2.7.3

Upload it to the Flume lib directory.
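If the Hadoop installation is reachable locally, the same jar can be copied from the Hadoop tree instead of downloading it (the path below assumes the standard 2.7.3 layout):

cp /usr/local/hadoop/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.3.jar /usr/local/elk/apache-flume/lib/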

Started again, successfully this time:

flume-ng agent -c /usr/local/elk/apache-flume/conf/ -f /usr/local/elk/apache-flume/conf/flume_hdfs.conf -Dflume.root.logger=INFO,console -n agent1

Info: Sourcing environment configuration script /usr/local/elk/apache-flume/conf/flume-env.sh
Info: Including Hive libraries found via () for Hive access
+ exec /soft/jdk1.8.0_101/bin/java -Xmx20m -Dflume.root.logger=INFO,console -cp '/usr/local/elk/apache-flume/conf:/usr/local/elk/apache-flume/lib/*' -Djava.library.path= org.apache.flume.node.Application -f /usr/local/elk/apache-flume/conf/flume_hdfs.conf -n agent1

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/elk/apache-flume/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/elk/apache-flume/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

2016-12-06 13:35:05,379 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start(PollingPropertiesFileConfigurationProvider.java:61)] Configuration provider starting
2016-12-06 13:35:05,384 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:/usr/local/elk/apache-flume/conf/flume_hdfs.conf
2016-12-06 13:35:05,391 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:log-sink1
2016-12-06 13:35:05,391 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:log-sink1
2016-12-06 13:35:05,392 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:log-sink1
2016-12-06 13:35:05,392 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:log-sink1
2016-12-06 13:35:05,392 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:931)] Added sinks: log-sink1 Agent: agent1
2016-12-06 13:35:05,392 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:log-sink1
2016-12-06 13:35:05,392 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:log-sink1
2016-12-06 13:35:05,392 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:log-sink1
2016-12-06 13:35:05,392 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1017)] Processing:log-sink1
2016-12-06 13:35:05,403 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:141)] Post-validation flume configuration contains configuration for agents: [agent1]
2016-12-06 13:35:05,403 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:145)] Creating channels
2016-12-06 13:35:05,410 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:42)] Creating instance of channel ch1 type memory
2016-12-06 13:35:05,417 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:200)] Created channel ch1
2016-12-06 13:35:05,418 (conf-file-poller-0) [INFO - org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:41)] Creating instance of source avro-source1, type exec
2016-12-06 13:35:05,456 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: log-sink1, type: hdfs
2016-12-06 13:35:05,465 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:114)] Channel ch1 connected to [avro-source1, log-sink1]
2016-12-06 13:35:05,472 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{avro-source1=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:avro-source1,state:IDLE} }} sinkRunners:{log-sink1=SinkRunner: { policy:[...] counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} }
2016-12-06 13:35:05,472 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:145)] Starting Channel ch1
2016-12-06 13:35:05,535 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: CHANNEL, name: ch1: Successfully registered new MBean.
2016-12-06 13:35:05,535 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: CHANNEL, name: ch1 started
2016-12-06 13:35:05,536 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:173)] Starting Sink log-sink1
2016-12-06 13:35:05,540 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: SINK, name: log-sink1: Successfully registered new MBean.
2016-12-06 13:35:05,540 (lifecycleSupervisor-1-1) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SINK, name: log-sink1 started
2016-12-06 13:35:05,540 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:184)] Starting Source avro-source1
2016-12-06 13:35:05,541 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.ExecSource.start(ExecSource.java:169)] Exec source starting with command:tail -n +0 -f /opt/logs/appcenter.log
2016-12-06 13:35:05,543 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:120)] Monitored counter group for type: SOURCE, name: avro-source1: Successfully registered new MBean.
2016-12-06 13:35:05,543 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:96)] Component type: SOURCE, name: avro-source1 started
2016-12-06 13:35:05,934 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.HDFSDataStream.configure(HDFSDataStream.java:58)] Serializer = TEXT, UseRawLocalFileSystem = false
2016-12-06 13:35:06,279 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:234)] Creating hdfs://ns1/flume/161206/FlumeData.1481002505934.tmp
2016-12-06 13:35:06,586 (hdfs-log-sink1-call-runner-0) [WARN - org.apache.hadoop.util.NativeCodeLoader.<clinit>(NativeCodeLoader.java:62)] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-12-06 13:36:07,615 (hdfs-log-sink1-roll-timer-0) [INFO - org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:363)] Closing hdfs://ns1/flume/161206/FlumeData.1481002505934.tmp
2016-12-06 13:36:07,710 (hdfs-log-sink1-call-runner-7) [INFO - org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:629)] Renaming hdfs://ns1/flume/161206/FlumeData.1481002505934.tmp to hdfs://ns1/flume/161206/FlumeData.1481002505934
2016-12-06 13:36:07,760 (hdfs-log-sink1-roll-timer-0) [INFO - org.apache.flume.sink.hdfs.HDFSEventSink$1.run(HDFSEventSink.java:394)] Writer callback called.
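To verify the roll-and-rename cycle end to end, list the day's bucket directory with the standard HDFS CLI (my addition, not part of the original log):

hdfs dfs -ls /flume/161206/
# Closed files appear without the .tmp suffix, e.g. FlumeData.1481002505934.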


This article is from the "crazy_sir" blog; please keep the source: http://douya.blog.51cto.com/6173221/1879913
