While recently computing statistics over nearly 100 million rows of data in Hadoop, the job threw the following exception, because the number of counters (one per column of each row, plus the overall statistics counters) exceeded 120:
org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 121 max=120
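To make the failure mode concrete, here is a minimal plain-Java sketch of how a per-job counter cap can produce this message. It does not use Hadoop's classes: CounterRegistry, its methods, and the split of 118 per-column counters plus 3 overall counters are all hypothetical, chosen only so that the totals match the message above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CounterLimitSketch {

    /** Hypothetical stand-in for a job's counter set; not Hadoop's implementation. */
    static class CounterRegistry {
        private final int max;
        private final Map<String, Long> counters = new LinkedHashMap<>();

        CounterRegistry(int max) {
            this.max = max;
        }

        /** Increments a counter, creating it first if it does not exist yet. */
        void increment(String name) {
            // Only the creation of a NEW counter can hit the cap.
            if (!counters.containsKey(name) && counters.size() + 1 > max) {
                throw new IllegalStateException(
                        "Too many counters: " + (counters.size() + 1) + " max=" + max);
            }
            counters.merge(name, 1L, Long::sum);
        }
    }

    /** Registers one counter per column plus some job-wide ones; returns the error message, or null. */
    static String run(int columns, int overall, int max) {
        CounterRegistry registry = new CounterRegistry(max);
        try {
            for (int i = 0; i < columns; i++) {
                registry.increment("column_" + i);
            }
            for (int i = 0; i < overall; i++) {
                registry.increment("overall_" + i);
            }
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // 118 + 3 = 121 counters, one more than the cap of 120.
        System.out.println(run(118, 3, 120)); // prints "Too many counters: 121 max=120"
        // With a cap of 200 the same job fits.
        System.out.println(run(118, 3, 200)); // prints "null"
    }
}
```

The point of the sketch is that the error depends on the number of distinct counters, not on how often each one is incremented, so the row count of the input is irrelevant.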
Because the cluster's Hadoop configuration cannot be modified (many people share it), I tried the following methods to work around the exception:
1. Override the setting in a job-level configuration file, job-local.xml:
<property>
<name>mapreduce.job.counters.limit</name>
<value>200</value>
</property>
At run time I passed the file with this parameter: ***********-conf job-local.xml. The job still threw the LimitExceededException above, even though the value of con.get("mapreduce.job.counters.limit") printed by the program had changed from 120 to 200. So the parameter does reach con, but it has no effect.
Result: failed
2. Set mapreduce.job.counters.limit directly in the program:
con.set("mapreduce.job.counters.limit", "200");
....
....
logger.info(con.get("mapreduce.job.counters.limit"));
The logged value is already 200, but the job still throws the LimitExceededException above.
Result: failed. Methods 1 and 2 set the value in effectively the same way and give the same result: the value is set, but it has no effect.
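One possible explanation for why the program prints 200 yet the job still fails is that the limit is read once from the cluster-wide defaults, independently of the job's own configuration object. This is an assumption, not something verified here against the Hadoop source for this version. The plain-Java sketch below only models that behavior; CLUSTER_DEFAULTS, ENFORCED_MAX, and allows are hypothetical names and are not Hadoop's API.

```java
import java.util.HashMap;
import java.util.Map;

public class StaticLimitSketch {
    static final String KEY = "mapreduce.job.counters.limit";

    /** Hypothetical cluster-wide defaults, loaded before any job settings are applied. */
    static final Map<String, String> CLUSTER_DEFAULTS = new HashMap<>();
    static {
        CLUSTER_DEFAULTS.put(KEY, "120");
    }

    /** Hypothetical enforcement point: captures the limit once, from the cluster defaults only. */
    static final int ENFORCED_MAX = Integer.parseInt(CLUSTER_DEFAULTS.get(KEY));

    /** Returns true if a job with this many counters would be accepted. */
    static boolean allows(int counterCount) {
        return counterCount <= ENFORCED_MAX;
    }

    public static void main(String[] args) {
        // The job-level configuration starts as a copy of the defaults and is then overridden.
        Map<String, String> jobConf = new HashMap<>(CLUSTER_DEFAULTS);
        jobConf.put(KEY, "200");

        System.out.println("job conf says: " + jobConf.get(KEY));   // prints "job conf says: 200"
        System.out.println("enforced max: " + ENFORCED_MAX);        // prints "enforced max: 120"
        System.out.println("121 counters allowed: " + allows(121)); // prints "121 counters allowed: false"
    }
}
```

If something like this is what happens, it would be consistent with the symptom in methods 1 and 2: reading the key back from the job configuration shows the new value, while the check that throws the exception never looks at it.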
3. Change the value in the Hadoop configuration file mapred-default.xml, as shown below (for more, see this blog post: http://blog.csdn.net/xin_jmail/article/details/24086919). As I said, however, many projects use this Hadoop cluster, so it is not possible to modify the configuration of the entire cluster just for my job.
<property>
<name>mapreduce.job.counters.limit</name>
<value>120</value>
<description>Limit on the number of counters allowed per job.</description>
</property>
Result: effectively a failure (not an option on a shared cluster)
4. Modify the program: either reduce the number of counters (a stopgap that ultimately does not meet the requirement), or have the mappers write their results to a file and let the reduce phase read that file and compute the statistics. See my other blog post, "Hadoop Map Reduce the number of counter over the default value of 120 solution".
Result: works
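The second variant of method 4 can be sketched without any Hadoop dependency. The example below simulates the idea in plain Java: the map step emits (column, 1) pairs as ordinary output instead of incrementing a counter, and the reduce step sums them, so the number of distinct statistics is no longer bounded by the counter limit. The class and method names, the comma-separated row format, and the rule "count non-empty fields per column" are hypothetical; they stand in for whatever statistics the real job computes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ColumnStatsSketch {

    /** Map step: emit ("col_i", 1) for every non-empty field instead of bumping a counter. */
    static List<Map.Entry<String, Long>> map(String row) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        String[] fields = row.split(",", -1); // -1 keeps trailing empty fields
        for (int i = 0; i < fields.length; i++) {
            if (!fields[i].isEmpty()) {
                out.add(new SimpleEntry<>("col_" + i, 1L));
            }
        }
        return out;
    }

    /** Reduce step: sum the emitted values per key. */
    static Map<String, Long> reduce(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }

    /** Runs map over every row, then reduce over all emitted pairs. */
    static Map<String, Long> run(List<String> rows) {
        List<Map.Entry<String, Long>> emitted = new ArrayList<>();
        for (String row : rows) {
            emitted.addAll(map(row));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a,,c", "d,e,", ",,f")));
        // prints {col_0=2, col_1=1, col_2=2}
    }
}
```

In a real job the emitted pairs would go through the normal shuffle, and a combiner could pre-sum them on the map side to keep the intermediate data small.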
Key points:
1. mapreduce.job.counters.max has replaced mapreduce.job.counters.limit. For compatibility both names can still be used, and they refer to the same value.
2. The value of mapreduce.job.counters.limit (or mapreduce.job.counters.max) cannot be changed at the job level. This looks like a bug, and it has been raised on a Hadoop mailing list, but the issue was resolved as Won't Fix: "I am marking this JIRA as won't fix. We can consider re-opening it if you propose a compelling use case."
For details, see http://markmail.org/message/gljicmpbklazzsb6
I do not know whether the latest version of Hadoop fixes this bug (we currently use version 2.4.0-mdh2.0.5). If anyone knows, please leave me a message. Thank you.