Beginners running MapReduce jobs often encounter a variety of errors. Lacking experience, they frequently find these errors unintelligible and paste the terminal output straight into a search engine, hoping to learn from others' experience. With Hadoop, however, the first thing to do when an error occurs is to check the logs, which record the detailed cause of the failure. This article summarizes where Hadoop MapReduce logs are stored, to help beginners locate the errors they encounter.
Hadoop MapReduce logs fall into two categories, service logs and job logs, described below:
1. Hadoop 1.x version
The MapReduce service logs in Hadoop 1.x consist of the JobTracker log and the individual TaskTracker logs, located as follows:
JobTracker: on the node where the JobTracker is installed, the default location is ${hadoop.log.dir}/logs/*-jobtracker-*.log. One file is generated per day: rotated logs carry a date suffix, while the current day's log ends in ".log". ${hadoop.log.dir} defaults to the Hadoop installation directory, i.e. ${HADOOP_HOME}.
TaskTracker: on each node where a TaskTracker is installed, the default location is ${HADOOP_HOME}/logs/*-tasktracker-*.log. One file is generated per day: rotated logs carry a date suffix, while the current day's log ends in ".log".
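As a minimal sketch, the service-log locations above can be computed like this; the install path /usr/local/hadoop is a hypothetical example, and HADOOP_LOG_DIR is the environment override commonly used for ${hadoop.log.dir}:

```shell
#!/bin/sh
# Hypothetical install path; adjust for your cluster.
HADOOP_HOME=/usr/local/hadoop
# hadoop.log.dir defaults to the logs/ subdirectory of the install
# directory; an HADOOP_LOG_DIR environment variable overrides it.
HADOOP_LOG_DIR="${HADOOP_LOG_DIR:-$HADOOP_HOME/logs}"

echo "JobTracker logs:  $HADOOP_LOG_DIR/*-jobtracker-*.log"
echo "TaskTracker logs: $HADOOP_LOG_DIR/*-tasktracker-*.log"
# On a live node, follow the current day's file with e.g.:
#   tail -f "$HADOOP_LOG_DIR"/*-jobtracker-*.log
```

Rotated files from previous days sit in the same directory with a date suffix instead of ".log".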
The job logs include the JobHistory log and the task logs. The JobHistory log records a job's run: the job's start and end times, each task's start and end times, various counter values, and so on. From this log users can parse out all the details of a job's execution, which makes it very valuable information. Its default storage location is the ${hadoop.log.dir}/history directory on the JobTracker node, configurable through the parameter hadoop.job.history.location. Each task's logs are stored on the node where the task ran, under the ${hadoop.log.dir}/userlogs/ directory. Every task has three log files: stdout, stderr, and syslog. stdout captures what the task printed to standard output (e.g. System.out.println); note that such output is not shown on the terminal but saved into this file. syslog is written through log4j and usually contains the most useful information, making it the most important reference when debugging errors.
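For a concrete picture, the per-task log path under userlogs can be assembled as below. The job ID, attempt ID, and the userlogs/<job>/<attempt> layout shown here are illustrative assumptions; verify them against your own cluster:

```shell
#!/bin/sh
# Hypothetical values for illustration only.
HADOOP_LOG_DIR=/usr/local/hadoop/logs
JOB_ID=job_201309010001_0001
ATTEMPT_ID=attempt_201309010001_0001_m_000000_0

# A common Hadoop 1.x layout groups task logs per job and per attempt.
TASK_LOG_DIR="$HADOOP_LOG_DIR/userlogs/$JOB_ID/$ATTEMPT_ID"
echo "$TASK_LOG_DIR"

# Each attempt directory holds three files:
#   stdout - what the task wrote via System.out.println
#   stderr - what the task wrote to standard error
#   syslog - log4j output, usually the most useful when debugging
# On a real node you would inspect them with:
#   cat "$TASK_LOG_DIR/syslog"
```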
2. Hadoop 2.x version
The service logs for the YARN framework in Hadoop 2.x include the ResourceManager log and the individual NodeManager logs, located as follows:
The ResourceManager log is stored as yarn-*-resourcemanager-*.log under the logs directory of the Hadoop installation directory.
The NodeManager log is stored as yarn-*-nodemanager-*.log under the logs directory of the Hadoop installation directory on each NodeManager node.
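A short sketch of the YARN service-log locations, again assuming a hypothetical install path:

```shell
#!/bin/sh
# Hypothetical install path; adjust for your cluster.
HADOOP_HOME=/usr/local/hadoop

# Service-log filename patterns (quoted so the globs are not expanded here).
RM_LOG_GLOB="$HADOOP_HOME/logs/yarn-*-resourcemanager-*.log"
NM_LOG_GLOB="$HADOOP_HOME/logs/yarn-*-nodemanager-*.log"

echo "ResourceManager: $RM_LOG_GLOB"
echo "NodeManager:     $NM_LOG_GLOB"
```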
The application logs include the JobHistory log and the container logs. The JobHistory log is the application run log, recording the application's start and end times, each task's start and end times, various counter values, and so on.
The container logs include the ApplicationMaster log and the ordinary task logs, both stored in application_xxx subdirectories of the userlogs directory under the Hadoop installation directory. The ApplicationMaster's log directory is named container_xxx_000001, while the ordinary task log directories are named container_xxx_000002, container_xxx_000003, and so on. As in Hadoop 1.x, each directory contains three log files: stdout, stderr, and syslog, with exactly the same meanings.
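The container-log layout can be sketched as follows. The application and container IDs are hypothetical; `yarn logs -applicationId` is a standard YARN CLI command, but note it only returns output for finished applications when log aggregation is enabled:

```shell
#!/bin/sh
# Hypothetical IDs for illustration.
APP_ID=application_1385051297072_0001
AM_DIR=container_1385051297072_0001_01_000001    # ApplicationMaster
TASK_DIR=container_1385051297072_0001_01_000002  # first ordinary task

# On-disk layout on a NodeManager; each container directory holds
# stdout, stderr, and syslog, as in Hadoop 1.x.
echo "userlogs/$APP_ID/$AM_DIR"
echo "userlogs/$APP_ID/$TASK_DIR"

# With log aggregation enabled, a finished application's logs can be
# fetched in one command (run this on a real cluster, without the echo):
FETCH_CMD="yarn logs -applicationId $APP_ID"
echo "$FETCH_CMD"
```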
3. Summary
Hadoop logs are the most important channel through which users locate problems. Beginners are often unaware of this, or, even when they are, cannot find where the logs are stored. I hope this article is helpful to them.
Original link: http://dongxicheng.org/mapreduce-nextgen/hadoop-logs-placement/