Hadoop Log Description

When beginners run a MapReduce job, they often encounter a variety of errors. Lacking experience, they tend to find these errors unintelligible, and usually paste the error printed on the terminal into a search engine to learn from others' experience. With Hadoop, however, the first thing to do when an error occurs is to check the logs: they generally contain a detailed description of the cause. This article summarizes where Hadoop MapReduce logs are stored, to help beginners locate the errors they encounter.

Hadoop MapReduce logs fall into two parts, service logs and job logs, which are described below.

1. Hadoop 1.x version

The MapReduce service logs in Hadoop 1.x include the JobTracker log and the individual TaskTracker logs, located as follows:

JobTracker: on the node where the JobTracker runs, the default location is ${hadoop.log.dir}/logs/*-jobtracker-*.log. A new file is generated each day: older files are suffixed with their date, while the current day's file ends in ".log". ${hadoop.log.dir} defaults to the Hadoop installation directory, ${HADOOP_HOME}.

TaskTracker: on each node where a TaskTracker runs, the default location is ${HADOOP_HOME}/logs/*-tasktracker-*.log. As with the JobTracker log, a new file is generated each day: older files are suffixed with their date, and the current day's file ends in ".log".
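
As a small illustration, the sketch below resolves the log directory consistently with the descriptions above. It is an assumption about the environment, not an official Hadoop API: the daemon start-up scripts pass -Dhadoop.log.dir=... to daemon JVMs, and the fallback mirrors the default noted above.

    public class ResolveHadoopLogDir {
        public static void main(String[] args) {
            // Set by Hadoop's start-up scripts for daemon JVMs via -Dhadoop.log.dir=...;
            // in an ordinary client JVM this property is usually absent.
            String logDir = System.getProperty("hadoop.log.dir");
            if (logDir == null) {
                // Fall back to the default described above: the Hadoop installation directory.
                logDir = System.getenv("HADOOP_HOME");
            }
            System.out.println("hadoop.log.dir resolves to: " + logDir);
        }
    }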

The job logs include the JobHistory log and the task logs. The JobHistory log records the job run: the job's start and end times, each task's start and end times, the values of the various counters, and so on. Users can parse all kinds of information about the run out of this log, which makes it very valuable. By default it is stored under the ${hadoop.log.dir}/history directory on the node where the JobTracker runs, and the location can be changed with the hadoop.job.history.location parameter.

Each task's logs are stored on the node where the task ran, in the directory ${hadoop.log.dir}/userlogs/<jobid>/<attempt-id>. Every task produces three log files: stdout, stderr, and syslog. The stdout file captures what the program prints to standard output (for example, System.out.println); note that such output does not appear on your terminal but in this file. The syslog file is written through log4j and usually contains the most useful information, making it the most important log to consult when debugging errors. The sketch below shows where each kind of output ends up.
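
The following is a minimal mapper sketch (the class and messages are hypothetical, not from the original article). Output written through System.out lands in the stdout file, System.err in the stderr file, and the log4j-backed logger in the syslog file:

    import java.io.IOException;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class LoggingMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final Log LOG = LogFactory.getLog(LoggingMapper.class);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Goes to the task's stdout file, not to your terminal.
            System.out.println("processing record at offset " + key.get());
            // Goes to the task's stderr file.
            System.err.println("stderr example");
            // Goes to the task's syslog file via log4j; usually the most useful log.
            LOG.info("map input: " + value);
            context.write(value, new LongWritable(1L));
        }
    }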

2. Hadoop 2.x version

The service logs of the YARN framework in Hadoop 2.x include the ResourceManager log and the individual NodeManager logs, located as follows:

ResourceManager: the log is stored as yarn-*-resourcemanager-*.log in the logs directory under the Hadoop installation directory.

NodeManager: on each NodeManager node, the log is stored as yarn-*-nodemanager-*.log in the logs directory under the Hadoop installation directory.
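
A quick way to locate the current service log is a filename glob over that logs directory. The following is a minimal sketch (not an official Hadoop tool) that assumes HADOOP_HOME is set and the default layout described above:

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class FindResourceManagerLog {
        public static void main(String[] args) throws IOException {
            // Assumes the HADOOP_HOME environment variable points at the installation directory.
            Path logsDir = Paths.get(System.getenv("HADOOP_HOME"), "logs");
            try (DirectoryStream<Path> matches =
                     Files.newDirectoryStream(logsDir, "yarn-*-resourcemanager-*.log")) {
                for (Path log : matches) {
                    System.out.println(log.toAbsolutePath());
                }
            }
        }
    }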

The application logs include the JobHistory log and the container logs. As in 1.x, the JobHistory log records the application run: the application's start and end times, each task's start and end times, the various counter values, and so on.
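
As a sketch of how that run information can be read back programmatically (assuming the Hadoop 2.x MapReduce client libraries are on the classpath; the history-file path is supplied by the user, not taken from this article):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser;
    import org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.JobInfo;

    public class PrintJobHistory {
        public static void main(String[] args) throws Exception {
            // args[0]: path to a job history (.jhist) file from the history directory.
            Path historyFile = new Path(args[0]);
            FileSystem fs = historyFile.getFileSystem(new Configuration());
            JobHistoryParser parser = new JobHistoryParser(fs, historyFile);
            JobInfo info = parser.parse();
            System.out.println("job id:       " + info.getJobId());
            System.out.println("submit time:  " + info.getSubmitTime());
            System.out.println("finish time:  " + info.getFinishTime());
            System.out.println("maps/reduces: " + info.getTotalMaps() + "/" + info.getTotalReduces());
        }
    }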

The container logs include the ApplicationMaster log and the logs of the ordinary tasks, both stored in an application_xxx directory under the userlogs directory in the Hadoop installation directory. The ApplicationMaster's log directory is named container_xxx_000001, and the ordinary tasks' log directories are named container_xxx_000002, container_xxx_000003, and so on. As in Hadoop 1.x, each directory contains three log files: stdout, stderr, and syslog, with exactly the same meaning as before.
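
That layout can be walked with plain Java. The following sketch lists one application's container log directories; the default path and application id are hypothetical placeholders, so pass the real values as arguments:

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class ListContainerLogs {
        public static void main(String[] args) throws IOException {
            // Hypothetical defaults; substitute your userlogs directory and application id.
            Path logRoot = Paths.get(args.length > 0 ? args[0] : "/path/to/hadoop/logs/userlogs");
            String appId = args.length > 1 ? args[1] : "application_1400000000000_0001";
            try (DirectoryStream<Path> containers =
                     Files.newDirectoryStream(logRoot.resolve(appId), "container_*")) {
                for (Path container : containers) {
                    // container_..._000001 is usually the ApplicationMaster.
                    System.out.println(container.getFileName());
                    try (DirectoryStream<Path> logs = Files.newDirectoryStream(container)) {
                        for (Path log : logs) {
                            System.out.println("  " + log.getFileName()); // stdout, stderr, syslog
                        }
                    }
                }
            }
        }
    }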

3. Summary

Hadoop logs are the most important channel for locating problems. Beginners often do not realize this, or realize it but still cannot find where the logs are stored. I hope this article helps them.
