Debug Hadoop remotely in IntelliJ IDEA


Please indicate the source when reprinting. Original address: http://blog.csdn.net/lastsweetop/article/details/8964520

1. Preface: Android Studio made quite a splash at the Google I/O 2013 developer conference, and I had not expected IntelliJ IDEA to have become so powerful. I was always a loyal Eclipse user, but I have now become an IntelliJ IDEA fan: I downloaded and installed it without hesitation, and debugging in it is excellent. The only letdown is that there is no Hadoop plug-in. Since I have been studying Hadoop recently, I decided to implement remote debugging myself. All the code is hosted on GitHub.
2. Step 1: Set up passwordless SSH. There are already plenty of SSH configuration guides on the Internet, so here it is only in brief. On the development machine, generate a key pair:

ssh-keygen -t rsa

The public key will be written to ~/.ssh/id_rsa.pub.

Copy this file to the namenode with scp:
scp ~/.ssh/id_rsa.pub hadoop@namenode:~/.ssh/

Log on to the namenode:

ssh hadoop@namenode
Append the development machine's id_rsa.pub to authorized_keys:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Passwordless SSH login is now set up.
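As a sanity check, logging in should no longer prompt for a password. A minimal sketch of the verification, plus an optional shortcut: on systems that ship ssh-copy-id, it performs the scp-and-append steps above in one go (the hadoop user and namenode host are the ones used in this article).

# Optional shortcut instead of the manual scp + cat steps (if ssh-copy-id is installed)
ssh-copy-id hadoop@namenode

# Verify: this should print the namenode's hostname without asking for a password
ssh hadoop@namenode hostname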

3. Step 2: Write the deploy.sh script.
#!/bin/sh
echo "deploy jar"
scp ../target/styhadoop-ch2-1.0.0-SNAPSHOT.jar hadoop@namenode:~/test/
echo "deploy run.sh"
scp run.sh hadoop@namenode:~/test/
echo "change authority"
ssh hadoop@namenode "chmod 755 ~/test/run.sh"
echo "start run.sh"
ssh hadoop@namenode "~/test/run.sh"

The run.sh script:

#!/bin/sh
echo "add jar to classpath"
export HADOOP_CLASSPATH=~/test/styhadoop-ch2-1.0.0-SNAPSHOT.jar
echo "run hadoop task"
~/hadoop/bin/hadoop com.sweetop.styhadoop.MaxTemperature input/ output/
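run.sh adds the job jar to HADOOP_CLASSPATH and launches the driver class directly. For reference, the same job could also be started with the hadoop jar form of the command, which picks up the jar without exporting HADOOP_CLASSPATH (a sketch, assuming MaxTemperature is not set as the jar's main class and so must be named explicitly):

# Equivalent invocation using "hadoop jar"; the driver class is passed explicitly
~/hadoop/bin/hadoop jar ~/test/styhadoop-ch2-1.0.0-SNAPSHOT.jar com.sweetop.styhadoop.MaxTemperature input/ output/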

4. Step 3: Configure pom.xml to run the script with the maven-antrun-plugin, binding it to the verify phase of the Maven lifecycle so the deployment runs as part of the build.

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-antrun-plugin</artifactId>
            <version>1.7</version>
            <executions>
                <execution>
                    <id>hadoop remote run</id>
                    <phase>verify</phase>
                    <goals>
                        <goal>run</goal>
                    </goals>
                    <configuration>
                        <target name="test">
                            <exec dir="${basedir}/shell" executable="bash">
                                <arg value="deploy.sh"></arg>
                            </exec>
                        </target>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
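With this binding in place, the whole package-deploy-run cycle can be triggered from the development machine with a single Maven command (a minimal sketch, assuming deploy.sh lives in the shell/ directory under the project root, as configured above):

# "package" builds the jar under target/, then the verify phase runs shell/deploy.sh
mvn clean verify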

5. HDFS file preparation

[hadoop@namenode test]$ hadoop fs -mkdir /user
[hadoop@namenode test]$ hadoop fs -mkdir /user/hadoop/
[hadoop@namenode test]$ hadoop fs -put input /user/hadoop/
[hadoop@namenode test]$ hadoop fs -lsr /user/hadoop
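Because run.sh passes the relative paths input/ and output/, the job resolves them against the hadoop user's HDFS home directory, /user/hadoop. A quick sanity check (not part of the original steps) that the input landed where the job expects it:

[hadoop@namenode test]$ hadoop fs -ls /user/hadoop/input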

6. Execution result

test:
     [exec] deploy jar
     [exec] deploy run.sh
     [exec] change authority
     [exec] start run.sh
     [exec] add jar to classpath
     [exec] run hadoop task
     [exec] 13/05/23 11:36:28 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
     [exec] 13/05/23 11:36:28 INFO input.FileInputFormat: Total input paths to process : 2
     [exec] 13/05/23 11:36:28 INFO util.NativeCodeLoader: Loaded the native-hadoop library
     [exec] 13/05/23 11:36:28 WARN snappy.LoadSnappy: Snappy native library not loaded
     [exec] 13/05/23 11:36:29 INFO mapred.JobClient: Running job: job_201305032210_0003
     [exec] 13/05/23 11:36:30 INFO mapred.JobClient:  map 0% reduce 0%
     [exec] 13/05/23 11:36:46 INFO mapred.JobClient:  map 100% reduce 0%
     [exec] 13/05/23 11:37:04 INFO mapred.JobClient:  map 100% reduce 100%
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient: Job complete: job_201305032210_0003
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient: Counters: 29
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   Job Counters
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Launched reduce tasks=1
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=19771
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Launched map tasks=2
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Data-local map tasks=2
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=13494
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   File Output Format Counters
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Bytes Written=8
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   FileSystemCounters
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     FILE_BYTES_READ=131296
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     HDFS_BYTES_READ=1777394
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=327106
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=8
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   File Input Format Counters
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Bytes Read=1777168
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   Map-Reduce Framework
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Map output materialized bytes=131302
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Map input records=13130
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Reduce shuffle bytes=65656
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Spilled Records=26258
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Map output bytes=105032
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     CPU time spent (ms)=6030
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Total committed heap usage (bytes)=379518976
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Combine input records=0
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     SPLIT_RAW_BYTES=226
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Reduce input records=13129
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Reduce input groups=1
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Combine output records=0
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Physical memory (bytes) snapshot=469196800
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Reduce output records=1
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1723944960
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Map output records=13129
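The counters show Reduce output records=1 and HDFS_BYTES_WRITTEN=8, i.e. a single short result line (the maximum temperature found). To read it back, a minimal check, assuming the new-API default output file name part-r-00000:

[hadoop@namenode test]$ hadoop fs -cat /user/hadoop/output/part-r-00000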
