Spark keeps hitting the 0.0.0.0:8030 error when executing jobs on YARN

Source: Internet
Author: User

Recently, when a new Spark job was executed on YARN, the YARN slave node kept logging a connection failure to 0.0.0.0:8030.

The logs are as below:

2014-08-11 20:10:59,795 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2014-08-11 20:11:01,838 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

 

This is very strange: when executing a task, the slave should connect to port 8030 on the master node, i.e. masterip:8030, not 0.0.0.0:8030.

Following the usual approach, first check the configuration file yarn-site.xml: verify that yarn.resourcemanager.scheduler.address is configured to point at the master.

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master1</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master1:8030</value>
</property>

After this check, restart the cluster; the fault persists.

Continue troubleshooting: check the environment variables to see whether yarn-site.xml is simply not loaded when the slave starts. Running env | grep -i yarn lists all YARN-related environment variables; YARN_CONF_DIR = HADOOP_CONF_DIR is shown and points to the correct directory path.
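The same check can be done from inside the JVM. This is a minimal sketch; CheckYarnEnv and its report helper are hypothetical names for illustration, not part of Hadoop:

```java
import java.util.Map;

public class CheckYarnEnv {
    // Report whether the YARN/Hadoop config dir variables are visible to the
    // process, mirroring what `env | grep -i yarn` checks from the shell.
    static String report(Map<String, String> env) {
        String yarnConf = env.get("YARN_CONF_DIR");
        String hadoopConf = env.get("HADOOP_CONF_DIR");
        if (yarnConf == null && hadoopConf == null) {
            return "missing: yarn-site.xml may never be loaded";
        }
        return "ok: config dir = " + (yarnConf != null ? yarnConf : hadoopConf);
    }

    public static void main(String[] args) {
        System.out.println(report(System.getenv()));
    }
}
```

If this reports a missing directory on the slave but not on the master, the slave's startup scripts are the place to look.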

Strange. Continue troubleshooting: if the environment is fine, try hard-coding. Write the values directly in the code:

Configuration conf = new Configuration();
conf.set("fs.default.name", hdfsUri);
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
conf.set("mapreduce.framework.name", "yarn");
conf.set("fs.AbstractFileSystem.hdfs.impl", "org.apache.hadoop.fs.Hdfs");
conf.set("yarn.resourcemanager.address", yarnIp + ":" + 8030); // set the RM access location

Re-execute the job; it still fails. This is a bit dizzying. Calm down and check the following items:

1. Configuration files: the yarn-site.xml on both master and slave is normal.

2. Environment variables: the environment variables on master and slave are normal.

3. Hard-coding is ineffective.

Could it be a problem with the framework itself?

Search for 0.0.0.0 in the Spark root directory; there is a match inside one of Spark's dependency packages:

Spark-core-assembly-0.4-SNAPSHOT.jar

Open this jar package: it contains a yarn-default.xml, in which 0.0.0.0 is configured. In principle, the configuration file's priority should be higher than the jar's bundled defaults.
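A jar is just a zip archive, so confirming which package bundles a yarn-default.xml can be done by scanning its entries. A minimal sketch (the demo builds a tiny stand-in jar rather than reading the real Spark assembly; FindDefaultXml is a hypothetical name):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class FindDefaultXml {
    // Return true if the given jar contains a yarn-default.xml entry.
    static boolean containsYarnDefault(File jar) throws Exception {
        try (ZipFile zf = new ZipFile(jar)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                if (entries.nextElement().getName().endsWith("yarn-default.xml")) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny jar with a yarn-default.xml entry, standing in for
        // the real Spark assembly jar found by the search above.
        File jar = File.createTempFile("demo", ".jar");
        jar.deleteOnExit();
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(jar))) {
            zos.putNextEntry(new ZipEntry("yarn-default.xml"));
            zos.write("<configuration/>".getBytes());
            zos.closeEntry();
        }
        System.out.println(containsYarnDefault(jar)); // prints: true
    }
}
```

Pointing containsYarnDefault at each jar on the classpath narrows down which dependency ships the 0.0.0.0 default.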

Let's try it!

Change 0.0.0.0 to the master IP address, re-package and upload the jar, and execute the job.

Oh my God!

Successful!

A glance at the clock shows this problem has eaten a long time. Forget it, go to bed first; the specifics can be investigated on Monday.

However, the initial theory is that the YARN client obtains a master IP value when executing the job; if it cannot obtain one, it falls back by default to the value in yarn-default.xml. So the key is to find where that value comes from; tracing it through the source code should not be a big problem.
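The suspected fallback behaviour can be sketched with a toy resolver. This is a hypothetical stand-in, not Hadoop's actual Configuration class: site values should override the defaults bundled in the jar, and when the site file is never loaded the client silently falls back to 0.0.0.0.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of layered configuration resolution:
// *-default.xml inside the jar provides fallbacks,
// *-site.xml on the classpath is supposed to override them.
public class ConfResolution {
    private final Map<String, String> defaults = new HashMap<>(); // from yarn-default.xml in the jar
    private final Map<String, String> site = new HashMap<>();     // from yarn-site.xml

    public void loadDefault(String key, String value) { defaults.put(key, value); }
    public void loadSite(String key, String value) { site.put(key, value); }

    // Site configuration wins; otherwise fall back to the bundled default.
    public String get(String key) {
        return site.getOrDefault(key, defaults.get(key));
    }

    public static void main(String[] args) {
        ConfResolution conf = new ConfResolution();
        conf.loadDefault("yarn.resourcemanager.scheduler.address", "0.0.0.0:8030");
        // If yarn-site.xml is actually loaded, its value wins:
        conf.loadSite("yarn.resourcemanager.scheduler.address", "master1:8030");
        System.out.println(conf.get("yarn.resourcemanager.scheduler.address")); // prints: master1:8030
        // If the site file is never read on the slave, the lookup silently
        // returns the 0.0.0.0 default bundled in the jar instead.
    }
}
```

Under this model, the observed symptom means the slave-side client resolved the address before (or without) the site file being applied, which is exactly what repacking the jar worked around.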

OK, go to bed!

