Introduction to the Hadoop 2.0 source package
1. Unzip the source package:
2. Directory structure:
hadoop-common-project: The directory where Hadoop's base libraries reside, such as RPC, Metrics, and Counter. It contains the underlying libraries that all other modules may use.
hadoop-mapreduce-project: The implementation of the MapReduce framework. In the first generation of MapReduce (MRv1), the framework consisted of the programming model (Map/Reduce), the scheduling system (JobTracker and TaskTracker), and the data processing engine (MapTask and ReduceTask), among other modules. In MRv2, resource scheduling is handled by the new YARN, while the programming model and the data processing engine are unchanged. MRv2 itself contains only a very simple task assignment function.
hadoop-hdfs-project: The Hadoop Distributed File System implementation. Hadoop 1.0 has a single NameNode, while Hadoop 2.0 supports multiple NameNodes and also resolves the NameNode single point of failure.
hadoop-yarn-project: The implementation of YARN, Hadoop's resource management system. It manages the resources in the cluster uniformly and assigns them to each application according to certain policies.
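As a quick sanity check after unpacking, the four top-level directories listed above can be looked up programmatically. The following is a minimal sketch, not part of the Hadoop distribution: the function name `check_source_tree` and the constant `EXPECTED_MODULES` are hypothetical, and the path of the unpacked tree is assumed to be passed in by the reader.

```python
from pathlib import Path

# The four top-level modules named in the list above.
EXPECTED_MODULES = [
    "hadoop-common-project",
    "hadoop-hdfs-project",
    "hadoop-mapreduce-project",
    "hadoop-yarn-project",
]


def check_source_tree(root):
    """Map each expected module name to True if its directory exists under root."""
    root_path = Path(root)
    return {name: (root_path / name).is_dir() for name in EXPECTED_MODULES}


if __name__ == "__main__":
    import sys

    # Assumed usage: pass the unpacked source directory as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, present in check_source_tree(root).items():
        print(f"{name}: {'found' if present else 'missing'}")
```

Running the script against the unpacked directory prints one line per module, which makes it easy to spot an incomplete extraction.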
3. The hadoop-yarn-project directory of YARN, the new branch in Hadoop 2.0:
hadoop-yarn-api: The YARN API. It provides the Java declarations and the Protocol Buffers definitions of the four major RPC protocols in YARN: ApplicationClientProtocol, ApplicationMasterProtocol, ContainerManagementProtocol, and ResourceManagerAdministrationProtocol.
hadoop-yarn-common: YARN Common. It contains the implementations of YARN's underlying libraries, including the event library, the service library, the state machine library, the web interface library, and so on.
hadoop-yarn-applications: YARN Applications. It contains two application programming examples: DistributedShell and the unmanaged AM launcher.
hadoop-yarn-client: The YARN client. It encapsulates several libraries for interacting with the YARN RPC protocols, making it easier for users to develop applications.
hadoop-yarn-server: The YARN server. It provides the core implementation of YARN, including core components such as ResourceManager and NodeManager.
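The event and state libraries listed under hadoop-yarn-common point to an event-driven, state-machine style of design. The sketch below illustrates that general idea only. It is written in Python for brevity and does not reproduce YARN's Java classes: `Dispatcher`, `StateMachine`, and the states and events in `TRANSITIONS` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class Dispatcher:
    """Hypothetical event dispatcher: routes each event to its registered handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def dispatch(self, event_type, payload=None):
        for handler in self._handlers[event_type]:
            handler(event_type, payload)


class StateMachine:
    """Hypothetical table-driven state machine: (state, event) -> next state."""

    def __init__(self, initial, transitions):
        self.state = initial
        self._transitions = transitions

    def handle(self, event_type, payload=None):
        key = (self.state, event_type)
        if key not in self._transitions:
            raise ValueError(f"invalid event {event_type!r} in state {self.state!r}")
        self.state = self._transitions[key]


# Illustrative states and events only; not taken from the YARN source.
TRANSITIONS = {
    ("NEW", "START"): "RUNNING",
    ("RUNNING", "FINISH"): "DONE",
    ("RUNNING", "KILL"): "KILLED",
}


def run_demo():
    """Drive the machine through START and FINISH and return the final state."""
    machine = StateMachine("NEW", TRANSITIONS)
    dispatcher = Dispatcher()
    for event in ("START", "FINISH", "KILL"):
        dispatcher.register(event, machine.handle)
    dispatcher.dispatch("START")
    dispatcher.dispatch("FINISH")
    return machine.state
```

Keeping the transitions in a table separates the allowed state changes from the code that delivers events, so an invalid event for the current state is rejected in one place.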