2.2 Hadoop Configuration in Detail
Rather than managing configuration files with java.util.Properties or with Apache Jakarta Commons Configuration, Hadoop uses its own configuration file management system and provides its own API: configuration information is handled through org.apache.hadoop.conf.Configuration.
2.2.1 The format of the Hadoop configuration file
The Hadoop configuration file is in XML format, and here is an example of a Hadoop configuration file:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>io.sort.factor</name>
    <value>10</value>
    <description>The number of streams to merge at once while sorting files. This determines the number of open file handles.</description>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>${hadoop.tmp.dir}/dfs/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table (fsimage). ......</description>
  </property>
  <property>
    <name>dfs.web.ugi</name>
    <value>webuser,webgroup</value>
    <final>true</final>
    <description>The user account used by the web interface. Syntax: USERNAME,GROUP1,GROUP2, ......</description>
  </property>
</configuration>
The root element of a Hadoop configuration file is configuration, which normally contains only property child elements. Each property element is one configuration item; the file format does not support hierarchy or nesting. A configuration item usually consists of a name (the name of the configuration property), a value, and a description. The final element is similar to the final keyword in Java and means that the configuration item is "fixed". It usually does not appear, but when it is set to true it prevents the item's value from being overwritten when resources are merged.
In the example file above, the configuration item dfs.web.ugi has the value "webuser,webgroup" and is marked final. According to its description, it configures the user account used by the Hadoop web interface, including the user name and user groups. This information can be read through the methods provided by the Configuration class.
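The structure described above can be illustrated with a minimal sketch that reads property elements using only the JDK's DOM parser. The class ConfigFileSketch, its Item type, and the SAMPLE document are hypothetical names introduced here for illustration; this is not how Hadoop's Configuration class is implemented, only one way to read the same file layout.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop: reads the <property> elements of a
// configuration document into a name -> item map.
final class ConfigFileSketch {

    /** One parsed <property>: its value and whether it was marked final. */
    static final class Item {
        final String value;
        final boolean isFinal;

        Item(String value, boolean isFinal) {
            this.value = value;
            this.isFinal = isFinal;
        }
    }

    // A shortened version of the example file (descriptions omitted).
    static final String SAMPLE =
          "<?xml version=\"1.0\"?>"
        + "<configuration>"
        + "<property><name>io.sort.factor</name><value>10</value></property>"
        + "<property><name>dfs.web.ugi</name><value>webuser,webgroup</value>"
        + "<final>true</final></property>"
        + "</configuration>";

    static Map<String, Item> parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable XML document", e);
        }
        Element root = doc.getDocumentElement();
        if (!"configuration".equals(root.getTagName())) {
            throw new IllegalArgumentException("root element must be <configuration>");
        }
        Map<String, Item> items = new LinkedHashMap<>();
        NodeList props = root.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = childText(p, "name");
            String value = childText(p, "value");
            // <final> is optional; the item is final only if it says "true".
            boolean isFinal = "true".equals(childText(p, "final"));
            if (name != null && value != null) {
                items.put(name, new Item(value, isFinal));
            }
        }
        return items;
    }

    /** Returns the trimmed text of the first child with this tag, or null. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Item> e : parse(SAMPLE).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue().value
                    + (e.getValue().isFinal ? " (final)" : ""));
        }
    }
}
```

Because the format is flat, a single map from name to item is enough to hold a parsed file; no tree structure is needed.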
In a configuration file, every property value is stored as a string, but it can be read as one of several types. These include Java primitive types such as boolean (getBoolean), int (getInt), long (getLong), and float (getFloat), as well as other types such as String (get), java.io.File (getFile), and string array (getStrings). The numeric and boolean getters also take a default value, which is returned when the item is not set. With the configuration file above, calling getInt for "io.sort.factor" returns the integer 10, while getStrings("dfs.web.ugi") returns an array of two strings, "webuser" and "webgroup".
Merging resources means combining several configuration files into a single configuration. If there are two configuration files, that is, two resources, such as core-default.xml and core-site.xml, they are merged into one configuration by the loadResources() method of the Configuration class. The code is as follows:
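To make the idea of typed access over string values concrete, here is a small stand-alone sketch. TypedGetterSketch is a hypothetical class written for this illustration; it borrows the getter names from the text but is not Hadoop's Configuration class, and details such as how unparsable values are handled are assumptions of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only: every value is stored as a
// String and converted to the requested type when it is read.
final class TypedGetterSketch {
    private final Map<String, String> props = new HashMap<>();

    void set(String name, String value) {
        props.put(name, value);
    }

    /** Raw string value, or null if the item is not set. */
    String get(String name) {
        return props.get(name);
    }

    /** Parses the value as an int; returns defaultValue if the item is not set. */
    int getInt(String name, int defaultValue) {
        String raw = props.get(name);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    /** Returns true/false for exactly "true"/"false", otherwise defaultValue. */
    boolean getBoolean(String name, boolean defaultValue) {
        String raw = props.get(name);
        if ("true".equals(raw)) return true;
        if ("false".equals(raw)) return false;
        return defaultValue;
    }

    /** Splits a comma-separated value into its elements; null if not set. */
    String[] getStrings(String name) {
        String raw = props.get(name);
        return raw == null ? null : raw.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        TypedGetterSketch conf = new TypedGetterSketch();
        conf.set("io.sort.factor", "10");
        conf.set("dfs.web.ugi", "webuser,webgroup");

        System.out.println(conf.getInt("io.sort.factor", 0));
        System.out.println(String.join(" | ", conf.getStrings("dfs.web.ugi")));
    }
}
```

Storing only strings keeps the file format simple; the caller decides at read time which type an item should be interpreted as.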
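import org.apache.hadoop.conf.Configuration;

// Resources are merged in the order in which they are added.
Configuration conf = new Configuration();
conf.addResource("core-default.xml");
conf.addResource("core-site.xml");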
If both resources contain the same configuration item and the item is not marked final in the earlier resource, the later value overrides the earlier one. In the example above, settings in core-site.xml override settings with the same name in core-default.xml. If a configuration item is marked final in the first resource (core-default.xml), its value is kept and a warning is issued when the second resource tries to set it during loading.
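The override rule can be sketched in a few lines of plain Java. MergeSketch, its apply method, the warning text, and the override values used below are all hypothetical and chosen only for illustration; the sketch models the rule as described here, not Hadoop's actual loading code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of resource merging: later values win unless an earlier
// resource marked the item as final.
final class MergeSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Set<String> finalNames = new HashSet<>();
    final List<String> warnings = new ArrayList<>();

    /** Applies one property read from a resource, in loading order. */
    void apply(String name, String value, boolean markFinal) {
        if (finalNames.contains(name)) {
            // The earlier value is kept; the attempt is only reported.
            warnings.add("ignored attempt to override final item: " + name);
            return;
        }
        values.put(name, value);
        if (markFinal) {
            finalNames.add(name);
        }
    }

    String get(String name) {
        return values.get(name);
    }

    public static void main(String[] args) {
        MergeSketch conf = new MergeSketch();

        // First resource (standing in for core-default.xml).
        conf.apply("io.sort.factor", "10", false);
        conf.apply("dfs.web.ugi", "webuser,webgroup", true);

        // Second resource (standing in for core-site.xml); values are made up.
        conf.apply("io.sort.factor", "100", false);
        conf.apply("dfs.web.ugi", "otheruser,othergroup", false);

        System.out.println(conf.get("io.sort.factor")); // overridden by the second resource
        System.out.println(conf.get("dfs.web.ugi"));    // kept from the first resource
        System.out.println(conf.warnings);
    }
}
```

In this model the order of the apply calls plays the role of the order in which resources are added, which is why the same item can end up with different values depending on which file is loaded last.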