Flume source analysis: building a remote debugging environment for the Flume source with Eclipse (1)

Source: Internet
Author: User
Tags centos

I. Introduction
I have recently been working on big data analysis, and the data collection part uses Flume, so I spent some time learning how Flume works. My usual way of learning a new system is to get a rough understanding of its principles first, then read the source code to understand the key parts of its implementation, and finally try modifying some of it to deepen that understanding. There is plenty of material online about Flume's principles, so this post describes how I built my source analysis environment.
II. Environment
2. CentOS 7.0
3. Eclipse Java EE Kepler
4. jdk-6u45-linux-x64.rpm
5. Source Insight
6. Oracle VirtualBox
III. Analysis Method
Install the JDK on CentOS, enable remote debugging in Eclipse, and step through the code in the debugger. This is more efficient than reading the source directly in Source Insight, although Source Insight is still useful during the analysis for exploring the reference relationships between classes.
IV. Analysis Steps
1. Install the JDK and Flume on CentOS; the installation process is not described here. In this setup, CentOS and Flume run in a VirtualBox virtual machine, and the machine's IP address is needed later when connecting from Eclipse.
2. Set the Flume startup parameters. Open apache-flume-1.6.0-src\bin\flume-ng in Notepad and edit it; the main change is the following item:

JAVA_OPTS="-Xmx20m"

Change it to

JAVA_OPTS="-Xmx20m -Xdebug -Xrunjdwp:transport=dt_socket,address=8888,server=y,suspend=y"

This sets the remote debugging port to 8888. The server=y option makes the JVM listen for a debugger, and suspend=y makes it wait at startup until a debugger attaches.
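3. Configure remote debugging in Eclipse: create a Remote Java Application debug configuration that points at the virtual machine's IP address and port 8888, as shown in the following figure: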

4. Use Eclipse Java EE Kepler to import the Flume source code as an existing Maven project, as shown in the following figure:

5. In Eclipse, set breakpoints in ...\apache-flume-1.6.0-src\flume-ng-node\src\main\java\org\apache\flume\node\Application.java, then start Flume on CentOS to debug the Flume startup process remotely. With this as the entry point, the whole Flume source can be traced and analyzed.
Note: start Flume first and then start debugging from Eclipse; otherwise Eclipse will not be able to connect.
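
Steps 2 and 5 can be sketched in shell as follows. This is only an illustration: jdwp_opts and listening_on are hypothetical helper names, not part of flume-ng, and the port check assumes the ss utility is available on the CentOS machine. When a JVM is started with suspend=y it prints "Listening for transport dt_socket at address: 8888" and waits, so the check below simply confirms the same thing without opening a connection to the debug port.

```shell
# Build the JDWP part of JAVA_OPTS (hypothetical helper, not part of flume-ng).
# $1 = debug port (default 8888), $2 = suspend flag, y or n (default y)
jdwp_opts() {
  port="${1:-8888}"
  suspend="${2:-y}"
  # transport=dt_socket : the debugger connects over a TCP socket
  # server=y            : the JVM listens and waits for the debugger
  # suspend=y           : the JVM pauses at startup until a debugger attaches
  printf '%s' "-Xdebug -Xrunjdwp:transport=dt_socket,address=${port},server=y,suspend=${suspend}"
}

# Read `ss -ltn` style output on stdin; succeed if a local address ends in :PORT.
listening_on() {
  grep -Eq ":$1[[:space:]]"
}

JAVA_OPTS="-Xmx20m $(jdwp_opts 8888 y)"
echo "JAVA_OPTS=$JAVA_OPTS"

# On the CentOS machine, after starting Flume: look for a listener on the port.
if command -v ss >/dev/null 2>&1 && ss -ltn | listening_on 8888; then
  echo "port 8888 is listening; attach from Eclipse now"
else
  echo "port 8888 is not listening; start Flume first"
fi
```

If the second message appears, Flume has not reached the point where it waits for the debugger, which matches the note above about start order.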
V. Frequently Asked Questions
When the Maven project is imported into Eclipse, a number of errors appear. Some common workarounds are as follows:
1. The most common problem is that, because of the firewall, maven.twttr.com and some libraries hosted by Google cannot be downloaded. After trying several approaches, the best solution is to add the following to the POM file:

 <url>http://maven.oschina.net/service/local/repositories/sonatype-public-grid/content/</url>

2. For version issues, see here.
3. For the tools.jar problem, see here.
4. For "Plugin execution not covered by lifecycle configuration" problems, see here.
5. An error such as "AvroFlumeOGEvent cannot be resolved" occurs, as shown in the following figure; the corresponding class cannot be found.

This is because Avro is used: the generate-sources phase of the POM must be run so that the avro-maven-plugin generates the corresponding Java files. Then add the location of the generated files to the project to resolve the problem.
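
A minimal sketch of that workaround from the command line, run from the source root. It assumes Maven is on the PATH and that the Avro plugin writes under each module's target/generated-sources directory (the plugin's usual default, but check the POM). find_generated is a hypothetical helper used only to locate the generated file so that its folder can be added as a source folder in Eclipse.

```shell
# Run Maven code generation only when Maven and a pom.xml are present.
if command -v mvn >/dev/null 2>&1 && [ -f pom.xml ]; then
  mvn generate-sources
fi

# List generated copies of a class under any target/generated-sources tree.
# $1 = directory to search, $2 = simple class name
find_generated() {
  find "$1" -type f -path '*/target/generated-sources/*' -name "$2.java"
}

find_generated . AvroFlumeOGEvent
```

If the last command prints nothing, the generation step did not run for the module that declares the class.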

VI. Notes
1. Compatibility between JDK versions is poor, so to read the source effectively you must know which JDK version each piece of software requires (for example, the JDK required by Eclipse and the JDK required by Flume); otherwise strange problems appear that are hard to resolve.
2. Flume builds on many mature open-source systems, such as Avro, Netty, and Maven, and it also intersects with systems such as Kafka, so the analysis requires some understanding of those related open-source systems.
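
For note 1, a quick way to see which JDK a shell will actually use is to read the version line printed by java -version. The sketch below is only an illustration; jdk_version_from is a hypothetical helper name. Running mvn -version similarly reports the Java version that Maven itself runs on.

```shell
# Extract the quoted version string from `java -version` style output on stdin.
jdk_version_from() {
  sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# `java -version` writes to stderr, so redirect it into the pipe.
if command -v java >/dev/null 2>&1; then
  echo "JDK on PATH: $(java -version 2>&1 | jdk_version_from)"
fi
```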
