When running MapReduce jobs, beginners often encounter errors, and they usually paste the messages printed on the terminal directly into a search engine for help.
With Hadoop, when an error occurs you should first check the logs; they generally contain a detailed indication of the cause. Hadoop MapReduce logs fall into two categories: service logs and job logs. The details are as follows:
1. Hadoop 1.x
In Hadoop 1.x, the MapReduce service logs include the JobTracker log and the TaskTracker logs. Their locations are as follows:
JobTracker: on the node where the JobTracker is installed, the default location is
${HADOOP_HOME}/logs/*-jobtracker-*.log. A new log file is generated each day; older logs carry a date suffix, while the current day's file ends in ".log".
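The daily-rotation naming scheme above can be sketched as follows. This is a minimal simulation in a temporary directory; the user and hostname embedded in the file names (here "hadoop" and "master") and the date are illustrative assumptions, not values from the article.

```shell
# Simulate the JobTracker log layout in ${HADOOP_HOME}/logs (Hadoop 1.x).
# The names "hadoop", "master", and the date suffix are made-up examples.
HADOOP_HOME=$(mktemp -d)
mkdir -p "$HADOOP_HOME/logs"
# Older, rotated logs carry a date suffix after ".log":
touch "$HADOOP_HOME/logs/hadoop-hadoop-jobtracker-master.log.2013-05-20"
# The current day's log ends in ".log" with no suffix:
touch "$HADOOP_HOME/logs/hadoop-hadoop-jobtracker-master.log"
# Globbing for *-jobtracker-*.log picks out only the current day's file:
current=$(ls "$HADOOP_HOME"/logs/*-jobtracker-*.log)
echo "current log: $current"
```

The same pattern applies to the TaskTracker logs, with "tasktracker" in place of "jobtracker".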
TaskTracker: on each node where a TaskTracker is installed, the default location is
${HADOOP_HOME}/logs/*-tasktracker-*.log. A new log file is generated each day; older logs carry a date suffix, while the current day's file ends in ".log".
Job logs include the JobHistory log and the task logs.
The JobHistory log records a job's execution: the job's start and end times, each task's start and end times, and various counters. Users can parse a great deal of information about a job's run from this log, which makes it very valuable. By default it is stored in the ${HADOOP_HOME}/logs/history directory on the JobTracker node; the location can be changed with the hadoop.job.history.location parameter.
Each task's logs are stored on the node where the task ran, under ${HADOOP_HOME}/userlogs/<job-id>/<attempt-id>. Each task produces three log files: stdout, stderr, and syslog. stdout holds output printed to standard output, for example by System.out.println; note that such output is not displayed on the terminal but saved in this file. syslog holds output printed through log4j; it usually contains the most useful information and is the most important reference log when debugging errors.
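The per-task layout above can be sketched as follows. The job and attempt IDs are made-up examples for illustration; real IDs are assigned by the JobTracker.

```shell
# Sketch of the per-task log layout in Hadoop 1.x (illustrative IDs).
HADOOP_HOME=$(mktemp -d)
attempt_dir="$HADOOP_HOME/userlogs/job_201305200001_0001/attempt_201305200001_0001_m_000000_0"
mkdir -p "$attempt_dir"
# Each attempt directory holds three files:
#   stdout - standard output (e.g. System.out.println)
#   stderr - standard error
#   syslog - log4j output, usually the most useful for debugging
printf 'record emitted by System.out.println\n' > "$attempt_dir/stdout"
: > "$attempt_dir/stderr"
printf 'INFO mapred.MapTask: starting map task\n' > "$attempt_dir/syslog"
# A quick error scan across the syslog (prints the match count):
grep -ciE 'error|exception' "$attempt_dir/syslog" || true
```

Scanning syslog with grep as in the last line is often the fastest way to find the actual cause of a failed task, rather than searching the terminal message.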
2. Hadoop 2.x
In Hadoop 2.x, the YARN service logs include the ResourceManager log and the NodeManager logs. Their locations are as follows:
ResourceManager: the log is stored as yarn-*-resourcemanager-*.log in the logs directory under the Hadoop installation directory.
NodeManager: on each NodeManager node, the log is stored as yarn-*-nodemanager-*.log in the logs directory under the Hadoop installation directory.
Application logs include the JobHistory log and the container logs. The JobHistory log records an application's run: the application's start and end times, each task's start and end times, and various counters.
Container logs include the ApplicationMaster log and ordinary task logs, both stored under an application_xxx directory inside the userlogs directory under the Hadoop installation directory. The ApplicationMaster log directory is named container_xxx_000001; ordinary task log directories are named container_xxx_000002, container_xxx_000003, and so on. As in Hadoop 1.x, each directory contains three log files: stdout, stderr, and syslog, with the same meanings as before.
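The container layout above can be sketched as follows. The application and container IDs are illustrative assumptions, and the sketch places userlogs under ${HADOOP_HOME}/logs, which matches the common default but may differ depending on the yarn.nodemanager.log-dirs setting.

```shell
# Sketch of the container log layout in Hadoop 2.x (illustrative IDs).
# container_..._000001 belongs to the ApplicationMaster; higher-numbered
# containers belong to ordinary tasks.
HADOOP_HOME=$(mktemp -d)
app_dir="$HADOOP_HOME/logs/userlogs/application_1398888888888_0001"
for c in 000001 000002 000003; do
  d="$app_dir/container_1398888888888_0001_01_$c"
  mkdir -p "$d"
  # Each container directory holds the same three files as in 1.x:
  touch "$d/stdout" "$d/stderr" "$d/syslog"
done
# The ApplicationMaster's syslog, usually the first place to look:
am_syslog="$app_dir/container_1398888888888_0001_01_000001/syslog"
ls "$app_dir"
```

When YARN log aggregation is enabled, the command `yarn logs -applicationId <app-id>` can also retrieve all container logs of a finished application in one place, instead of visiting each NodeManager node.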
3. Summary
Hadoop logs are the most important channel for locating problems. Beginners often do not realize this, or, even when they do, cannot find where the logs are stored. I hope this article helps them.