The Hadoop network topology plays an important role throughout the system: it affects DataNode startup (registration), the assignment of map tasks, and more. Understanding the network topology is a great help in understanding how Hadoop operates as a whole.
The following two diagrams give a first look at the classes related to network topology.
NetworkTopology represents the network topology of a Hadoop cluster. Hadoop organizes the entire topology as a tree (see https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_Proposal.pdf). The Node interface represents a node of the tree, which is either an inner node (such as a data center or rack) or a leaf node (a host). NodeBase implements Node, and NetworkTopology.InnerNode represents an inner node of the tree. When a DataNode starts, it registers its information with the NameNode in the form of a DatanodeRegistration, so that the NameNode can determine the DataNode's location in the network topology.
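The tree structure described above can be sketched in plain Java. This is only a minimal illustration under stated assumptions: the names TopologySketch, SketchNode, SketchInnerNode, add, and distance are hypothetical stand-ins and are not Hadoop's actual classes or methods. The sketch assumes a network location is written as a path such as /dc1/rack1, and it measures the distance between two hosts as the number of tree edges between them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins; these are NOT Hadoop's real classes.
final class TopologySketch {

    /** A tree node with a name and a parent (the parent is null for the root). */
    static class SketchNode {
        final String name;
        final SketchInnerNode parent;

        SketchNode(String name, SketchInnerNode parent) {
            this.name = name;
            this.parent = parent;
        }

        /** Full path from the root, e.g. "/dc1/rack1/host1". */
        String path() {
            return parent == null ? "" : parent.path() + "/" + name;
        }

        /** Number of edges between this node and the root. */
        int level() {
            return parent == null ? 0 : parent.level() + 1;
        }
    }

    /** An inner node (data center or rack) that holds child nodes. */
    static final class SketchInnerNode extends SketchNode {
        final Map<String, SketchNode> children = new LinkedHashMap<>();

        SketchInnerNode(String name, SketchInnerNode parent) {
            super(name, parent);
        }
    }

    final SketchInnerNode root = new SketchInnerNode("", null);

    /** Adds a leaf (host) under the given location, creating inner nodes as needed. */
    SketchNode add(String location, String host) {
        SketchInnerNode current = root;
        for (String part : location.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty segment before the leading "/"
            }
            SketchNode child = current.children.get(part);
            if (child == null) {
                child = new SketchInnerNode(part, current);
                current.children.put(part, child);
            }
            if (!(child instanceof SketchInnerNode)) {
                throw new IllegalArgumentException(part + " is a host, not a rack or data center");
            }
            current = (SketchInnerNode) child;
        }
        SketchNode leaf = new SketchNode(host, current);
        current.children.put(host, leaf);
        return leaf;
    }

    /** Counts the tree edges between two nodes of the same tree. */
    static int distance(SketchNode a, SketchNode b) {
        int hops = 0;
        int levelA = a.level();
        int levelB = b.level();
        // Bring the deeper node up to the level of the shallower one.
        while (levelA > levelB) { a = a.parent; levelA--; hops++; }
        while (levelB > levelA) { b = b.parent; levelB--; hops++; }
        // Climb both sides together until they meet at a common ancestor.
        while (a != b) { a = a.parent; b = b.parent; hops += 2; }
        return hops;
    }

    public static void main(String[] args) {
        TopologySketch topology = new TopologySketch();
        SketchNode host1 = topology.add("/dc1/rack1", "host1");
        SketchNode host3 = topology.add("/dc1/rack2", "host3");
        System.out.println(host1.path() + " <-> " + host3.path()
                + " distance = " + distance(host1, host3));
    }
}
```

In this sketch, two hosts on the same rack are 2 edges apart, hosts on different racks in the same data center are 4 apart, and hosts in different data centers are 6 apart, which is the kind of information a scheduler can use to prefer nearby nodes.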
DNSToSwitchMapping converts the nodes in a cluster into their corresponding network locations.
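The idea of mapping node names to network locations can be sketched as follows. This is a hedged illustration, not Hadoop's implementation: HostToLocationMapping and InMemoryMapping are hypothetical names, the host names and rack paths are made-up sample data, and the fallback value /default-rack for unknown hosts is an assumption of this sketch. The resolve method takes a list of names and returns a list of locations in the same order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; it does not depend on or reproduce Hadoop's own classes.
final class SwitchMappingSketch {

    /** Hypothetical interface: resolves host names or IPs to network locations. */
    interface HostToLocationMapping {
        List<String> resolve(List<String> names);
    }

    /** Assumed fallback location for hosts that are missing from the table. */
    static final String DEFAULT_LOCATION = "/default-rack";

    /** Resolves names from a fixed in-memory table. */
    static final class InMemoryMapping implements HostToLocationMapping {
        private final Map<String, String> table;

        InMemoryMapping(Map<String, String> table) {
            this.table = new HashMap<>(table); // defensive copy
        }

        @Override
        public List<String> resolve(List<String> names) {
            List<String> locations = new ArrayList<>(names.size());
            for (String name : names) {
                // The result keeps the same order as the input list.
                locations.add(table.getOrDefault(name, DEFAULT_LOCATION));
            }
            return locations;
        }
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("host1.example.com", "/dc1/rack1"); // sample data, not real hosts
        table.put("10.0.0.7", "/dc1/rack2");

        HostToLocationMapping mapping = new InMemoryMapping(table);
        System.out.println(mapping.resolve(
                Arrays.asList("host1.example.com", "10.0.0.7", "unknown-host")));
    }
}
```

A lookup table is used here only because it is the simplest way to show the conversion; any source that yields a location for each name would fit the same interface.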
Hadoop Network Topology Analysis: NetworkTopology and DNSToSwitchMapping