Preface
This article describes some common knowledge about Hadoop. What sets it apart from other manuals on the net is that it is a systematic set of notes written by the author, not thrown together arbitrarily.
Common HDFS commands

hadoop fs -ls URI
hadoop fs -du -h URI
hadoop fs -cat URI (for large files: hadoop fs -cat xxxx | head)
hadoop fs -put URI
hadoop fs -get URI
hadoop fs -rmr URI
hadoop fs -stat %b,%o,%n,%r,%y URI (%b: file size, %o: block size, %n: file name, %r: number of replicas, %y or %Y: last modified date and time)
hadoop fs -tail [-f] URI
hdfs dfsadmin -report
hadoop fs -appendToFile URI1 [URI2 ...] URI (e.g. hadoop fs -appendToFile helloCopy1.txt helloCopy2.txt /user/tmp/hello.txt)
hadoop fsck / -files -blocks
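As a quick illustration of the -stat format string above, a call might look like this (the path reuses the example above; the output values are made up):

hadoop fs -stat "%b,%o,%n,%r,%y" /user/tmp/hello.txt
# sample output (values illustrative): 1366,134217728,hello.txt,3,2015-11-01 12:00:00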
This is only a brief list; details will be added later if necessary.
Restarting lost nodes

Child node DataNode missing:

sbin/hadoop-daemon.sh start datanode

Child node NodeManager missing:

sbin/yarn-daemon.sh start nodemanager

Primary node missing:

sbin/start-all.sh

or

sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start secondarynamenode
sbin/yarn-daemon.sh start resourcemanager
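When several daemons disappear at once, it can be handy to check and restart them in one pass. The following is a minimal sketch, not from the original article; it assumes $HADOOP_HOME points at the Hadoop installation, jps is on the PATH, and it runs on the node whose daemons are missing:

#!/usr/bin/env bash
# Restart any HDFS daemons that jps does not report as running.
# Adjust the lists to the daemons this node is actually supposed to run.
for daemon in NameNode SecondaryNameNode DataNode; do
  if ! jps | grep -qw "$daemon"; then
    "$HADOOP_HOME/sbin/hadoop-daemon.sh" start "$(echo "$daemon" | tr '[:upper:]' '[:lower:]')"
  fi
done
# Same idea for the YARN daemons.
for daemon in ResourceManager NodeManager; do
  if ! jps | grep -qw "$daemon"; then
    "$HADOOP_HOME/sbin/yarn-daemon.sh" start "$(echo "$daemon" | tr '[:upper:]' '[:lower:]')"
  fi
done

Note that grep -w is used so that checking for NameNode does not accidentally match a running SecondaryNameNode.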
Configuration file errors
When managing a Hadoop cluster you will often run into problems with configuration files. Here is an example: a problem with YARN's NodeManager.
YARN has two related configuration files: yarn-site.xml and yarn-env.sh.
In the yarn-site.xml file:

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>1024</value>
</property>
In the yarn-env.sh file:

JAVA_HEAP_MAX=-Xmx1024m
The memory-mb value in yarn-site.xml must be smaller than the JAVA_HEAP_MAX value in yarn-env.sh. The yarn-site.xml setting is the minimum amount of memory the NodeManager needs in order to start; if the heap falls below this value, the NodeManager will not come up. The heap actually used at startup is the one configured in yarn-env.sh, so modify it, for example: JAVA_HEAP_MAX=-Xmx2048m
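After editing yarn-env.sh, restart the NodeManager so the new heap takes effect. A sketch, assuming it is run on the NodeManager host with jps, awk, and ps available (the verification line is illustrative, not from the original article):

sbin/yarn-daemon.sh stop nodemanager
sbin/yarn-daemon.sh start nodemanager
# verify the -Xmx flag on the running NodeManager process
ps -o command= -p "$(jps | awk '/NodeManager/{print $1}')" | tr ' ' '\n' | grep -- -Xmx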
no xxx to stop
Hadoop often runs into this problem; the cause is that the PID file for the process cannot be found, hence the error.
See this link for details:

Resolving the "no namenode to stop" exception when shutting down Hadoop
Each time Hadoop is started (./start-all.sh), a PID file is generated to store the process number; when Hadoop is shut down, the PID file is deleted.
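With the default PID directory of /tmp, the files look something like this (the user name hadoop is illustrative):

/tmp/hadoop-hadoop-namenode.pid
/tmp/hadoop-hadoop-datanode.pid
/tmp/yarn-hadoop-resourcemanager.pid
/tmp/yarn-hadoop-nodemanager.pid

If the system periodically cleans /tmp, these files disappear, and the stop scripts then report "no xxx to stop".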
In Hadoop 2.7.1, the description of HADOOP_PID_DIR (file path: ../etc/hadoop/hadoop-env.sh) is as follows:
# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
#       the user that will run the hadoop daemons.  Otherwise there is the
#       potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
It is best to place the PID files in a directory that can only be written to by the user running the Hadoop daemons.
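For example, you could create a dedicated directory and point hadoop-env.sh at it (the /var/hadoop/pids path and the hadoop user/group are just an illustration):

# create a directory writable only by the user running the daemons
mkdir -p /var/hadoop/pids
chown hadoop:hadoop /var/hadoop/pids
chmod 755 /var/hadoop/pids

# then in etc/hadoop/hadoop-env.sh
export HADOOP_PID_DIR=/var/hadoop/pids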
About mapred-site.xml configuration
See the blog post: How to configure memory for MapReduce jobs running in YARN.
Reference materials

- How to configure memory for MapReduce jobs running in YARN
- Hadoop HDFS common file operation commands
- HDFS file operation commands