The project requirement is to import log data generated by the online servers into Kafka in real time, using a layered agent/collector transmission scheme: the app sends data to the agent over Thrift, the agent forwards it to the collector through an Avro sink, and the collector aggregates the data and sends it to Kafka. The topology is as follows:
The problems encountered during debugging, and how they were resolved, are documented below:
1. [ERROR - org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:484)] Unexpected throwable while invoking!
java.lang.OutOfMemoryError: Java heap space
Reason: Flume's default maximum heap size is only 20 MB. With the large data volume of the real environment, OOM errors occur easily. In flume-env.sh under Flume's conf directory, add:
export JAVA_OPTS="-Xms2048m -Xmx2048m -Xss256k -Xmn1g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-UseGCOverheadLimit"
Also, in the Flume startup script flume-ng, change JAVA_OPTS="-Xmx20m" to JAVA_OPTS="-Xmx2048m".
Here the heap limit is raised to 2 GB; in a real production environment it can be adjusted to match the actual hardware.
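To confirm the new heap settings actually took effect, a quick check is to look at the command line of the running Flume JVM (a simple sketch; adjust the grep pattern to match your process):
# show the running Flume process and check that -Xms/-Xmx now read 2048m
ps -ef | grep flume | grep -v grep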
2. [ERROR - org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run(TThreadedSelectorServer.java:544)] run() exiting due to uncaught error
java.lang.OutOfMemoryError: unable to create new native thread
Reason: The app sends data to Flume's Thrift source over short-lived connections, so threads get created without limit. Watching the process with pstree shows that the number of Java threads grows with the volume of data being sent, eventually passing 65,500 and exceeding the Linux limit on threads. The workaround is to cap the number of threads in the Thrift source configuration:
agent.sources.r1.threads = 50
After restarting the agent, the Java thread count climbs to about 70 and stops growing.
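For context, a minimal Thrift source definition with the thread cap in place could look like the following sketch (the bind address and port are placeholders, not taken from the original configuration):
agent.sources = r1
agent.sources.r1.type = thrift
agent.sources.r1.bind = 0.0.0.0
agent.sources.r1.port = 4141
# cap the worker threads so short-lived client connections cannot create threads without bound
agent.sources.r1.threads = 50
agent.sources.r1.channels = c1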
3. Caused by: org.apache.flume.ChannelException: Put queue for MemoryTransaction of capacity full, consider committing more frequently, increasing capacity or increasing thread count
Reason: This error comes from the memory channel. By default a memory channel holds at most 100 events, which is clearly not enough for production, so the capacity parameters need to be increased.
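A memory channel definition with larger limits might look like the sketch below (the numbers are illustrative and should be sized against the heap configured above):
agent.channels = c1
agent.channels.c1.type = memory
# maximum number of events held in the channel
agent.channels.c1.capacity = 100000
# maximum number of events per put/take transaction
agent.channels.c1.transactionCapacity = 1000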
4. WARN: "Thrift source %s could not append events to the channel."
Reason: Checking the Flume configuration documentation shows that the default batch-size of the various sinks (thrift, avro, kafka, etc.) is 100, and the default transactionCapacity of both the file channel and the memory channel is also 100. If you raise a sink's batch-size, it must stay less than or equal to the channel's transactionCapacity; otherwise the warning above appears and data cannot be sent properly.
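In other words, keep sink batch-size <= channel transactionCapacity. A sketch with illustrative numbers, using the Avro sink from this topology:
# channel side: each transaction can hold up to 1000 events
agent.channels.c1.transactionCapacity = 1000
# sink side: batch-size must not exceed the channel's transactionCapacity
agent.sinks.k1.batch-size = 1000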
5. The agent reports:
(SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to send events
    at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:392)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: 10.200.197.82, port: 5150 }: Failed to send batch
    at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:315)
    at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:376)
    ... 3 more
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: 10.200.197.82, port: 5150 }: Exception thrown from remote handler
    at org.apache.flume.api.NettyAvroRpcClient.waitForStatusOK(NettyAvroRpcClient.java:397)
    at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:374)
    at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:303)
    ... 4 more
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Connection reset by peer
    at org.apache.avro.ipc.CallFuture.get(CallFuture.java:128)
    at org.apache.flume.api.NettyAvroRpcClient.waitForStatusOK(NettyAvroRpcClient.java:389)
    ... 6 more
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:59)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:471)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:332)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    ... 1 more
The collector reports:
2017-08-21 16:36:43,010 (New I/O worker #12) [WARN - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.exceptionCaught(NettyServer.java:201)] Unexpected exception from downstream.
org.apache.avro.AvroRuntimeException: Excessively large list allocation request detected: 349070535 items! Connection closed.
    at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decodePackHeader(NettyTransportCodec.java:167)
    at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decode(NettyTransportCodec.java:139)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:422)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.cleanup(FrameDecoder.java:478)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.channelDisconnected(FrameDecoder.java:366)
    at org.jboss.netty.channel.Channels.fireChannelDisconnected(Channels.java:399)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:721)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:111)
    at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:66)
    at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:54)
    at org.jboss.netty.channel.Channels.close(Channels.java:820)
    at org.jboss.netty.channel.AbstractChannel.close(AbstractChannel.java:197)
    at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.exceptionCaught(NettyServer.java:202)
    at org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:173)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:378)
    at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:533)
    at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:84)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:471)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:332)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Reason: If the agent compresses data in its Avro sink when sending to the collector, the collector's Avro source must be configured to decompress it as well; otherwise the data cannot be delivered.
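A sketch of the matching settings on both sides (component names are placeholders; the host and port come from the log above, and deflate is the compression type Flume's Avro RPC supports):
# agent side: Avro sink with compression enabled
agent.sinks.k1.type = avro
agent.sinks.k1.hostname = 10.200.197.82
agent.sinks.k1.port = 5150
agent.sinks.k1.compression-type = deflate
# collector side: the Avro source must declare the same compression type
collector.sources.r1.type = avro
collector.sources.r1.bind = 0.0.0.0
collector.sources.r1.port = 5150
collector.sources.r1.compression-type = deflate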
6. org.apache.kafka.common.errors.RecordTooLargeException: There are some messages at [Partition=Offset]: {ssp_package-0=388595} whose size is larger than the fetch size 1048576 and hence cannot be ever returned. Increase the fetch size, or decrease the maximum message size the broker will allow.
2017-10-11 01:30:10,000 (PollableSourceRunner-KafkaSource-r1) [ERROR - org.apache.flume.source.kafka.KafkaSource.doProcess(KafkaSource.java:314)] KafkaSource EXCEPTION, {}
Reason: With a Kafka source configured, Flume acts as a Kafka consumer. By default the consumer fetches at most 1 MB per partition, so if messages exceed 1 MB the limit has to be raised manually in the configuration.
However, the Kafka Source section of the official Flume configuration documentation lists no fetch-size property; the last row of the property table only says:
Other Kafka Consumer Properties -- These properties are used to configure the Kafka Consumer. Any consumer property supported by Kafka can be used. The only requirement is to prepend the property name with the prefix kafka.consumer. For example: kafka.consumer.auto.offset.reset
So the setting is passed straight through to Kafka, and the Kafka documentation (Consumer Configs - max.partition.fetch.bytes) describes the relevant property:
agent.sources.r1.kafka.consumer.max.partition.fetch.bytes = 10240000
Here the consumer's max.partition.fetch.bytes is raised to roughly 10 MB.
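Putting it together, a Kafka source definition with the pass-through consumer property might look like this sketch (assuming Flume 1.7 property names; the broker list is a placeholder and the topic name is taken from the error above):
agent.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.r1.kafka.bootstrap.servers = broker1:9092,broker2:9092
agent.sources.r1.kafka.topics = ssp_package
agent.sources.r1.kafka.consumer.group.id = flume-consumer
# any kafka.consumer.* property is handed straight to the Kafka consumer
agent.sources.r1.kafka.consumer.max.partition.fetch.bytes = 10240000
agent.sources.r1.channels = c1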
7. 2017-10-13 01:19:47,991 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:240)] Failed to publish events
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.RecordTooLargeException: The message is 2606058 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.
    at org.apache.kafka.clients.producer.KafkaProducer$FutureFailure.<init>(KafkaProducer.java:686)
    at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:449)
    at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:212)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The message is 2606058 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.
Reason: Similar to the previous point, but here it is the Kafka sink, with Flume acting as the producer; the maximum request size also has to be raised, again using Kafka's own configuration names:
agent.sinks.k1.kafka.producer.max.request.size = 10240000
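A corresponding Kafka sink sketch (again assuming Flume 1.7 property names; the broker list is a placeholder and the topic name reuses the one from problem 6):
agent.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.k1.kafka.bootstrap.servers = broker1:9092,broker2:9092
agent.sinks.k1.kafka.topic = ssp_package
# any kafka.producer.* property is handed straight to the Kafka producer
agent.sinks.k1.kafka.producer.max.request.size = 10240000
agent.sinks.k1.channel = c1
For messages this large the broker's own message.max.bytes limit may also need to be raised; check the broker configuration as well.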
8. java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
    at org.mortbay.jetty.nio.SelectChannelConnector$1.acceptChannel(SelectChannelConnector.java:75)
    at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:686)
    at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:192)
    at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
    at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Reason: Too many file handles are in use. First check how many handles the Flume process holds:
lsof -p PID | wc -l
where PID is the Flume process ID. Then raise the limit:
vim /etc/security/limits.conf
and append at the end:
* soft nofile 4096
* hard nofile 4096
The leading * means all users. Restart the Flume service after the change.
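To verify that the restarted Flume process actually picked up the new limit, a quick check (assuming a Linux system with the /proc filesystem) is:
# effective open-file limit of the running Flume process
cat /proc/PID/limits | grep "open files"
# current handle count, for comparison
lsof -p PID | wc -l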
9. (kafka-producer-network-thread | producer-1) [ERROR - org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:130)] Uncaught error in kafka producer I/O thread:
org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'throttle_time_ms': java.nio.BufferUnderflowException
    at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:71)
    at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:439)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
    at java.lang.Thread.run(Thread.java:744)
Reason: The Kafka cluster version is too old for the Flume version. Here the Kafka cluster is 0.8.2 while Flume 1.7 was used, which produces the error above; the only fix was to downgrade Flume to 1.6.
10. Data sunk to Kafka is not evenly distributed across the partitions; everything lands on the same partition.
Reason: This is a bug left over in the old Flume version. The workaround is to put a 'key' entry into each event's headers, which can be done with an interceptor:
a1.sources.flume0.interceptors.i1.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
a1.sources.flume0.interceptors.i1.headerName = key
The real root cause of the non-random distribution was never found; this simply works around the problem in a different way.
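For completeness, the interceptor also has to be declared on the source. A fuller sketch of the interceptor configuration (component names a1, flume0 and i1 are reused from the snippet above; the UUID written into the 'key' header is then used by Flume's Kafka sink as the message key, which in turn determines the partition):
a1.sources.flume0.interceptors = i1
a1.sources.flume0.interceptors.i1.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
a1.sources.flume0.interceptors.i1.headerName = key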