A calculation program written in Java connects to the Hive database through JDBC and executes SQL queries against the data. After the corresponding MapReduce task had progressed to map 18% reduce xx%, all subsequent progress output reset to map 0% reduce 0%.
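For context, a minimal sketch of this kind of JDBC access to Hive is shown below, assuming HiveServer2 is listening at namenode:10000; the host, database, user, and query are placeholders rather than the original program's actual values.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveQueryExample {
        public static void main(String[] args) throws Exception {
            // Standard HiveServer2 JDBC driver (requires hive-jdbc on the classpath).
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // Host, port, database, user, and query are assumed placeholders.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://namenode:10000/default", "hive", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM some_table")) {
                while (rs.next()) {
                    // With Hive on MapReduce, this query runs as a MapReduce job.
                    System.out.println(rs.getLong(1));
                }
            }
        }
    }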
After ruling out problems in the Java code, I killed the job; since the job log was unavailable, I tried rerunning the program directly.
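For reference, killing a stuck job can be done through the YARN CLI; the application ID below is a placeholder:

    # Find the stuck job's application ID, then kill it.
    yarn application -list
    yarn application -kill application_1234567890123_0001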
The MapReduce task created by the new Hive query was then stuck in the ACCEPTED state and never started running.
Careful examination of the Hadoop web UI showed that all 5 nodes of the cluster had entered the unhealthy state, with available memory and VCores both at 0; presumably the tasks could not start because no resources were available.
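The same state can also be confirmed from the command line:

    # List every node, including unhealthy ones (the default output shows only RUNNING nodes).
    yarn node -list -all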
With "Hadoop unhealthy" as the key words, Baidu to the Web page, some of the practice is to restart yarn NodeManager and ResourceManager, but has done several restart operation, and no effect
Another article mentioned that the unhealthy state is caused by bad directories. Checking the cluster's reported reason for the unhealthy nodes confirmed the same cause: bad directories (both local-dirs and log-dirs). The root cause was insufficient remaining storage space, which made YARN mark all five nodes as unavailable, so the entire cluster had no resources on which to run any task.
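The per-node health report that names the bad directories can also be read with the YARN CLI; the node ID below is a placeholder:

    # The Health-Report field of the node report lists the bad local-dirs/log-dirs.
    yarn node -status node1:45454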
Disk usage on all 5 nodes of the cluster exceeded 90%, and in places considerably more. Since the data could not be moved or deleted for the time being, a temporary workaround was found: raise the maximum disk utilization allowed by the node health check. Reference: http://stackoverflow.com/questions/29010039/yarn-unhealthy-nodes
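Disk usage on each node can be checked with the standard command:

    # YARN marks a disk as bad once its utilization crosses the configured
    # threshold (90% by default), which matches what was seen here.
    df -h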
Edit the yarn-site.xml file in the $HADOOP_HOME/etc/hadoop/ directory and add the following properties:
    <property>
        <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
        <value>0.0</value>
    </property>
    <property>
        <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
        <value>99.0</value>
    </property>
The maximum is set to 99% rather than 100% so that a little space remains reserved on each node's disk; setting min-healthy-disks to 0.0 additionally keeps a NodeManager healthy even if some of its disks are still flagged as bad.
After the change, run stop-yarn.sh and then start-yarn.sh to restart the NodeManager and ResourceManager. Refreshing the Hadoop web UI (http://namenode:8088/) shows the nodes back in the healthy state, with memory and VCore resources available again, and jobs execute normally.
However, the above is only a stopgap measure. To keep the cluster running healthily and stably, the cluster disks still need to be cleaned up promptly, with infrequently used data backed up elsewhere or additional storage devices added.
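As a starting point for that cleanup, HDFS usage can be summarized per directory and obsolete data removed; the paths below are placeholders:

    # Summarize the space used under each top-level HDFS directory.
    hdfs dfs -du -h /
    # Permanently delete data that is no longer needed (bypasses the trash).
    hdfs dfs -rm -r -skipTrash /path/to/old_data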
This concludes the record of resolving the problem in which MapReduce tasks submitted by Hive were suspended mid-execution (jumping back from the RUNNING state to the ACCEPTED state).