Monitoring System for a Big Data Platform (II): Extending Flume

Source: Internet
Author: User
<span id="Label3"><p style="text-align: left;" align="center">Native Flume does not cover all of our requirements, so we added a number of features on top of the open source Flume.</p>
<strong>Defects of EventDeserializer</strong>
<p class="p" style="margin-left: 30px;">The deserializer used by a Flume source must implement the EventDeserializer interface, which defines the readEvent/readEvents methods for reading events from a log source.</p>
<p class="p" style="margin-left: 30px;">Flume mainly supports two deserializers:</p>
<p class="p" style="margin-left: 30px;">(1) AvroEventDeserializer: parses Avro container files. It generates one Flume event per record in the Avro file and stores the Avro-encoded binary record in the event body.</p>
<p class="p" style="margin-left: 30px;">(2) LineDeserializer: a deserializer for text log files that treats each line, terminated by "\n", as one log record.</p>
<p class="p" style="margin-left: 30px;">LineDeserializer cannot meet the requirement when a single log record spans multiple lines (for example, an exception stack trace).</p>
<p class="p" style="margin-left: 30px;">For this scenario, we re-implemented log parsing for the actual project. For the source, see FileEventReader in https://github.com/bigdatafly/flume.</p>
<p class="p" style="margin-left: 30px;">As an aside, I recently looked into Morphlines; log parsing can also be implemented with Morphlines.</p>
<p class="p" style="margin-left: 30px;">One more point to note: LineDeserializer has a parameter, maxLineLength, that defines the maximum number of characters in a single log line. A line that exceeds this length is truncated, and the remaining characters appear in a subsequent event. When one log record occupies multiple lines, this value needs to be increased, because an exception log with its stack trace is far longer than a normal log record; here it can be set to 8192.</p>
<strong>Defects of ExecSource</strong>
<p class="p">ExecSource with tail -f is suitable for reading a fixed log file. Its biggest problem is that it does not support resuming from the last read position after an interruption. To address this, we implemented flume-filetailsource on top of the source code.</p>
<p class="p">For the source, see FileTailSource.java in https://github.com/bigdatafly/flume.</p>
<strong>Defects of SpoolingDirSource</strong>
<p class="p">This source monitors a directory for changes, but it has two problems. First, files in the directory cannot be written to, only read. Second, latency is relatively high, because it has to wait for logs to be archived periodically. This approach is not used in the project.</p>
<p>A short digression: because we had already customized sources and sinks, I assumed a deserializer could be customized the same way, by specifying the fully qualified name of the custom deserializer in the agent's deserializer configuration. Verification showed that this did not work and raised an error (I also could not find an introduction to custom deserializers on the official Flume website). So the only option was to extend the Flume source code, recompile it, and rebuild the jar.</p>
<p class="p">The source code shows why we could not extend the deserializer from a third-party package. See org.apache.flume.serialization.EventDeserializerType and it is clear at a glance:</p>
<pre>public enum EventDeserializerType {
  LINE(LineDeserializer.Builder.class),
  AVRO(AvroEventDeserializer.Builder.class),
  OTHER(null);

  // Builder class used to construct the deserializer for this type
  private final Class&lt;? extends EventDeserializer.Builder&gt; builderClass;

  EventDeserializerType(Class&lt;? extends EventDeserializer.Builder&gt; builderClass) {
    this.builderClass = builderClass;
  }

  public Class&lt;? extends EventDeserializer.Builder&gt; getBuilderClass() {
    return builderClass;
  }
}</pre>
<p>You must explicitly define the deserializer as an enum constant here and specify its Builder class, then fill in that enum name in the deserializer configuration item of the agent. Note, however, that the Flume User Guide says the class given in the deserializer setting must implement EventDeserializer.Builder, so the error we saw may have come from naming the deserializer class itself rather than its Builder.</p>
<strong>System management issues</strong>
<p>Flume can load its configuration at startup in two ways: from a conf configuration file or from ZooKeeper. Flume monitors the conf file or ZooKeeper, and when the configuration changes, it reinitializes the configuration parameters and restarts. At present, our Flume parameters are stored centrally in ZooKeeper. From reading the source code, solving the management problem requires rewriting a large amount of source code, which is a huge task; I am still thinking about how to solve it neatly for our actual situation.</p>
<p>In the actual project implementation, the overall Flume architecture is divided into two layers: agent and collector.</p>
<p>For the source, see https://github.com/bigdatafly/flume</p></span>
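The multi-line problem described for LineDeserializer (a stack trace split across several lines) can be illustrated with a small, Flume-independent sketch. This is not the FileEventReader from the repository above; the class name MultiLineEventGrouper and the rule that a new record starts with a yyyy-MM-dd date are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: groups physical lines into logical log records.
final class MultiLineEventGrouper {
    // Assumption: every new log record begins with a date such as 2016-01-31.
    private static final Pattern RECORD_START = Pattern.compile("^\\d{4}-\\d{2}-\\d{2}");

    // Lines that do not look like a record start are appended to the previous record.
    public static List<String> group(List<String> lines) {
        List<String> events = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            boolean startsRecord = RECORD_START.matcher(line).find();
            if (startsRecord || current == null) {
                if (current != null) {
                    events.add(current.toString());
                }
                current = new StringBuilder(line);
            } else {
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            events.add(current.toString());
        }
        return events;
    }

    public static void main(String[] args) {
        List<String> events = group(Arrays.asList(
                "2016-01-31 10:00:01 ERROR failed",
                "java.lang.IllegalStateException: boom",
                "\tat com.example.Foo.bar(Foo.java:42)",
                "2016-01-31 10:00:02 INFO recovered"));
        for (String event : events) {
            System.out.println("--- event ---");
            System.out.println(event);
        }
    }
}
```

Because a grouped record can be much longer than a single line, a length limit such as maxLineLength has to be sized for the whole record, which is why the article raises it to 8192.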
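The ExecSource limitation (no resuming from the last read position) comes down to remembering a byte offset between runs. The sketch below shows that idea with plain JDK classes. It is not the FileTailSource from the repository; the class name ResumableTailReader and the position-file layout (a single decimal offset) are assumptions, and for brevity it does not hold back a trailing line that has not yet been terminated.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads only the lines appended since the last saved offset.
final class ResumableTailReader {
    private final Path logFile;
    private final Path positionFile;

    public ResumableTailReader(Path logFile, Path positionFile) {
        this.logFile = logFile;
        this.positionFile = positionFile;
    }

    public List<String> readNewLines() throws IOException {
        long offset = loadOffset();
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(logFile.toFile(), "r")) {
            if (offset > raf.length()) {
                offset = 0; // file was truncated or replaced: start from the beginning
            }
            raf.seek(offset);
            String raw;
            while ((raw = raf.readLine()) != null) {
                // readLine maps each byte to one char; re-decode those bytes as UTF-8.
                lines.add(new String(raw.getBytes(StandardCharsets.ISO_8859_1),
                        StandardCharsets.UTF_8));
            }
            offset = raf.getFilePointer();
        }
        saveOffset(offset);
        return lines;
    }

    private long loadOffset() throws IOException {
        if (!Files.exists(positionFile)) {
            return 0L;
        }
        String text = new String(Files.readAllBytes(positionFile), StandardCharsets.UTF_8);
        return Long.parseLong(text.trim());
    }

    private void saveOffset(long offset) throws IOException {
        Files.write(positionFile, Long.toString(offset).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ResumableTailReader <logFile> <positionFile>
        ResumableTailReader reader =
                new ResumableTailReader(Paths.get(args[0]), Paths.get(args[1]));
        for (String line : reader.readNewLines()) {
            System.out.println(line);
        }
    }
}
```

Running it twice against the same files prints each line once, because the second run starts at the offset stored by the first. A production source would also need to handle log rotation and write the offset only after the events have been delivered to the channel.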
