1.2 Prepare the source code reading environment

Source: Internet
Author: User

Before studying an open-source project, you must install and configure a basic development environment and a source code reading environment. This involves installing and configuring the JDK, installing an IDE for development and debugging, and installing and configuring the related auxiliary tools.

1.2.1 Install and configure JDK

Before analyzing the source code of Hadoop, you need to make some preparations. Setting up a Java environment is essential: Hadoop requires Java 1.6 or later to run. Download a JDK installer and run it.

After installation, check whether the JDK is correctly configured.

Some third-party programs add the path of their own bundled JDK to the system PATH environment variable. As a result, even after the latest JDK is installed, the system may still use the JDK bundled with the third-party program. On Windows, the environment variables that must be configured correctly for the Java runtime are JAVA_HOME, CLASSPATH, and PATH.

For convenience, these are usually defined as system-level environment variables. On Windows, for example, the system environment variables can be found on the "Advanced" tab of "System Properties", where the values of JAVA_HOME, PATH, and CLASSPATH can be configured. Figure 1-3 shows an example of adding the JAVA_HOME environment variable to the system in Windows XP.

After the installation and configuration are complete, enter the "java -version" command in a command line window to check which JDK version is in use. If the configuration is correct, the version of the current Java runtime is displayed, as shown in figure 1-4.
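As a complementary check, a small Java program can report which runtime is actually executing it and what JAVA_HOME is set to. The following is a minimal sketch; the class name JdkCheck is made up for this example. If the reported java.home does not belong to the JDK you just installed, another JDK on the PATH is probably taking precedence.

```java
// Reports which Java runtime is executing this program.
// JdkCheck is a hypothetical name chosen for this illustration.
class JdkCheck {

    /** Builds a short report about the running JVM and the JAVA_HOME variable. */
    static String report() {
        String javaHomeEnv = System.getenv("JAVA_HOME"); // null when the variable is not set
        return "java.version = " + System.getProperty("java.version") + "\n"
             + "java.home    = " + System.getProperty("java.home") + "\n"
             + "JAVA_HOME    = " + (javaHomeEnv == null ? "(not set)" : javaHomeEnv);
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Compile it with "javac JdkCheck.java" and run it with "java JdkCheck" from the same command line window used for the version check.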

1.2.2 Install Eclipse

After successfully installing and configuring the JDK, you also need to install an Integrated Development Environment (IDE) for Java development and debugging, because a good development and source code reading environment makes your work more efficient. The most commonly used Java IDEs today are Eclipse and NetBeans, and you can choose whichever you prefer. This book uses Eclipse as the example when describing how to develop and debug the source code. Readers can try the corresponding steps in other IDEs.

Eclipse is a user-friendly open-source IDE that supports thousands of plug-ins, which makes code analysis and source-level debugging much more convenient. Every version of Eclipse is available from the official Eclipse website (http://www.eclipse.org/downloads/); for analyzing the Hadoop source code, the Eclipse IDE for Java Developers package is sufficient. The Eclipse download page is shown in Figure 1-5. Eclipse is Java-based portable ("green") software: after you download the ZIP package and unpack it, it can be used directly without installation. The basic usage of Eclipse is beyond the scope of this book, so only the features most useful for basic source code analysis are briefly described here.

1. Locate a class, method, or attribute

During source code analysis, you often need to jump quickly to the class, method, or attribute at the cursor position. In Eclipse, pressing F3 opens the declaration and defining source code of the class, method, or variable.

Sometimes, when you open a class, method, or variable that is declared or defined in the JDK library, the corresponding CLASS file (bytecode) is opened instead of source code. For this case, Eclipse lets you associate the bytecode with its source code, so that you can view the implementation of a third-party library, provided its source code is available.

When Eclipse opens a bytecode file, you can click the "Attach Source" button to associate the bytecode with the source code, as shown in figure 1-6.

Once the source code is bound to "rt.jar", Eclipse automatically displays the corresponding source code file whenever a bytecode file in that JAR package is opened.

The source code of other third-party Java libraries is attached in a similar way.

2. Find a class by its name

If you know the name of the Java class you want to open in the editor, the easiest way to find and open it is to press Ctrl + Shift + T (or click Navigate → Open Type) to open the Open Type window, and then enter the name. Eclipse displays a list of the matching types it can find. Figure 1-7 shows all classes in Hadoop 1.0 whose names contain "HDFS".

Note that in addition to entering the complete class name, you can also use the wildcards "*" and "?", which match any sequence of characters and a single character, respectively.
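The meaning of the two wildcards can be illustrated with a short Java sketch that translates such a pattern into a regular expression. This is only an illustration of the matching rules and not how Eclipse implements the Open Type dialog; the class name WildcardMatch and the sample class names are chosen for this example, and the sketch matches the whole name, whereas the dialog also accepts prefixes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrates "*" (any sequence of characters) and "?" (exactly one character).
// WildcardMatch is a hypothetical helper, not part of Eclipse.
class WildcardMatch {

    /** Translates a wildcard pattern into an equivalent regular expression. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");                               // any sequence, possibly empty
            } else if (c == '?') {
                regex.append('.');                                // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));   // literal character
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns true if the whole class name matches the wildcard pattern. */
    static boolean matches(String wildcard, String className) {
        return toRegex(wildcard).matcher(className).matches();
    }

    public static void main(String[] args) {
        // Sample class names used only for illustration.
        List<String> names = Arrays.asList(
                "DistributedFileSystem", "HDFSPolicyProvider", "HftpFileSystem", "FileSystem");
        for (String pattern : new String[] {"*HDFS*", "H?tpFileSystem", "*FileSystem"}) {
            List<String> hits = new ArrayList<>();
            for (String name : names) {
                if (matches(pattern, name)) {
                    hits.add(name);
                }
            }
            System.out.println(pattern + " -> " + hits);
        }
    }
}
```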

3. View the class inheritance structure

Java is an object-oriented programming language, and inheritance is one of the three main features of object-oriented programming. Knowing where classes and interfaces sit in the inheritance hierarchy helps you understand how the code works. Select a class and press Ctrl + T (or click Navigate → Quick Type Hierarchy) to display the Type Hierarchy.

The hierarchy displays the subtypes of the selected element. Figure 1-8 shows all known subclasses of org.apache.hadoop.fs.FileSystem.
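Part of this information is also available at run time through reflection. The sketch below walks the superclass chain of a class, that is, the upward direction of the hierarchy; listing subtypes, as the Type Hierarchy does, requires scanning the workspace and is not shown. The class name SuperclassChain is made up for this example, and a JDK class is used so that the code runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Walks from a class up to java.lang.Object using reflection.
// SuperclassChain is a hypothetical name chosen for this illustration.
class SuperclassChain {

    /** Returns the names of the given type and all of its superclasses, most specific first. */
    static List<String> chain(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            names.add(c.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        StringBuilder indent = new StringBuilder();
        for (String name : chain(java.util.LinkedHashMap.class)) {
            System.out.println(indent + name);
            indent.append("  "); // indent each superclass one level further
        }
    }
}
```

With Hadoop on the classpath, the same method could be called with a Hadoop class such as a FileSystem implementation.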

4. Analyze the call relationships of Java methods

Eclipse can also analyze the call relationships of Java methods. Select the method definition in the code area, then right-click and choose Open Call Hierarchy, or press Ctrl + Alt + H. The Call Hierarchy view shows the callers of the method and lets you trace the calls layer by layer, which is very useful for working out how methods call one another, as shown in figure 1-9.
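The Call Hierarchy view works statically on the source code. When reading unfamiliar code, a complementary run-time trick is to capture a stack trace, which shows the one call path that actually led to a given point. The sketch below is an illustration under that assumption and is not an Eclipse feature; all names in it are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Captures the chain of method calls that led to the current point.
// CallChainDemo and its methods are hypothetical names for this illustration.
class CallChainDemo {

    /** Returns the method names on the current call stack, innermost first. */
    static List<String> callers() {
        List<String> names = new ArrayList<>();
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            names.add(frame.getMethodName());
        }
        return names;
    }

    static List<String> inner() {
        return callers();
    }

    static List<String> outer() {
        return inner();
    }

    public static void main(String[] args) {
        // The list starts with callers, inner, outer, followed by main.
        System.out.println(outer());
    }
}
```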


Note that shortcut keys are among the most convenient techniques in daily development and debugging. Eclipse has far more shortcut keys than can be listed here. Readers should memorize the ones they use as they work, because they are essential for everyday development. The Eclipse shortcut keys described above can also serve as a reference for finding the corresponding shortcut settings in other IDEs.

1.2.3 Install the auxiliary tool Ant

After JDK and Eclipse are installed and configured, Ant is also required to compile Hadoop.

Building a complex project such as Hadoop involves more than compiling and packaging Java source files. All the resources used in the project must be arranged properly: some files need to be copied to a specified location, some classes need to be put into one JAR archive file, and other classes into another. If all these tasks were performed manually, building and deploying the project would be very difficult and errors would be inevitable. Ant is a build tool created to address these problems, and it is widely used in Java projects.

Ant is cross-platform, extensible, and efficient. With Ant, developers only need to write an XML-based configuration file (generally named build.xml) that defines the build tasks, such as copying files, compiling Java source files, and packaging JAR archive files, together with the dependencies between them. For example, the build task "package JAR archive files" depends on another build task, "compile Java source files". Ant then builds, packages, or even deploys the project according to the build tasks and dependencies in this file.
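The way dependencies between build tasks determine the order of execution can be sketched in Java as a depth-first traversal of a dependency graph. This is a simplified illustration of the idea, not Ant's actual implementation; the class name TargetOrder and the task names init, compile, and jar are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Orders build tasks so that every task runs after the tasks it depends on.
// TargetOrder is a hypothetical sketch, not Ant's own code.
class TargetOrder {

    /** Returns the tasks needed to build the goal, each listed after its dependencies. */
    static List<String> executionOrder(Map<String, List<String>> dependsOn, String goal) {
        List<String> order = new ArrayList<>();
        visit(goal, dependsOn, new HashSet<String>(), new HashSet<String>(), order);
        return order;
    }

    private static void visit(String task, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress, List<String> order) {
        if (done.contains(task)) {
            return; // already scheduled
        }
        if (!inProgress.add(task)) {
            throw new IllegalStateException("Circular dependency at " + task);
        }
        List<String> deps = dependsOn.get(task);
        if (deps != null) {
            for (String dep : deps) {
                visit(dep, dependsOn, done, inProgress, order); // dependencies first
            }
        }
        inProgress.remove(task);
        done.add(task);
        order.add(task);
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = new LinkedHashMap<>();
        dependsOn.put("init", Collections.<String>emptyList());
        dependsOn.put("compile", Arrays.asList("init"));
        dependsOn.put("jar", Arrays.asList("compile"));
        System.out.println(executionOrder(dependsOn, "jar")); // prints [init, compile, jar]
    }
}
```

Requesting the "jar" task therefore causes "init" and "compile" to run first, which mirrors how the packaging task in the text requires the compilation task.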

Like Hadoop, Ant is a project supported by the Apache Foundation. You can download it from http://ant.apache.org/bindownload.cgi; the download page is shown in Figure 1-10.

Like Eclipse, Ant is portable software and does not need to be installed. After extracting the downloaded file, you need to configure it: add the environment variable ANT_HOME (pointing to the Ant root directory) and modify the environment variable PATH (on Windows, add %ANT_HOME%\bin to PATH). After the configuration is complete, enter the "ant -version" command in a command line window to check whether Ant is set up correctly.

Hadoop's Ant build also uses Apache Ivy, a subproject of Ant, to manage the project's external build dependencies. An external build dependency is source code or a JAR archive file from another project that is needed to build a software project. For example, the Hadoop project relies on log4j as its logging tool. Such external dependencies complicate the build. For a small project, a simple and workable approach is to put all the JAR files the project depends on into one directory (usually lib), but this becomes clumsy as the project grows. Maven, another Apache build tool, introduced the concept of a public repository of JAR files: external dependencies are declared, and the build automatically looks them up in the public repository (over HTTP) and downloads them to satisfy the dependency requirements.

Ivy provides a consistent, repeatable, and easy-to-maintain way to manage all of a project's build dependencies in an Ant environment. As with Ant, developers write an XML configuration file (generally named ivy.xml) that lists all of the project's dependencies, and an ivysettings.xml file (this file can be named freely) that configures the repositories from which dependency JAR files are downloaded. Ant's two Ivy tasks, ivy:settings and ivy:retrieve, then automatically look up the dependencies and download the corresponding JAR files.

1.2.4 Install the UNIX-like shell environment Cygwin

Readers working on Windows also need to set up Cygwin, a UNIX-like shell environment.

Note that readers who analyze and build the Hadoop code on Linux or other UNIX-like systems can skip this section.

Cygwin is a UNIX-like shell environment for Windows consisting of two components: a UNIX API library, which emulates many features provided by the UNIX operating system, and, built on top of it, ported versions of the Bash shell and many UNIX utilities, which together provide a familiar UNIX command line interface.

The Cygwin installer, setup.exe, is a standard Windows program that lets you install or reinstall the software and add, modify, or upgrade Cygwin components. It is available from http://cygwin.com/index.html; the download page is shown in Figure 1-11.


Run the installer setup.exe and, in step 4 (Cygwin Setup - Select Packages), select the UNIX stream editor sed, as shown in Figure 1-12 (sed can be found quickly by typing its name into the Search input box).

When sed is selected, setup.exe automatically installs the packages it depends on. You can add an entire category or individual packages, but to build Hadoop on Windows you only need the text processing tool sed.

After the installation is complete, start Cygwin from the Start menu or by double-clicking the Cygwin icon. Run the command ant -version | sed "s/version/Version/g" in the shell to verify the Cygwin environment, as shown in figure 1-13.

After successfully installing JDK, Eclipse, Ant, and Cygwin, you can start preparing the Eclipse environment for Hadoop source code analysis.
