Apache Spark Source Code Walkthrough 18: Use IntelliJ IDEA to Debug Spark Source Code

Last Update:2014-07-18 Source: Internet

Author: User

Tags arch linux

You are welcome to reprint it. Please indicate the source, huichiro.

Summary

The previous post showed how to modify the source code to view the call stack. That approach is practical, but every modification requires a recompile, which is slow and inefficient, and it is an invasive change that is hardly elegant. This article describes how to use IntelliJ IDEA to trace and debug the Spark source code.

Prerequisites

This article assumes a Linux development environment with the following software already installed. I personally use Arch Linux.

JDK
Scala
SBT
IntelliJ IDEA Community Edition
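
Before going further, it can help to confirm that the command-line tools are actually on the PATH. The following is a minimal sketch: the function name missing_tools is made up for this example, and the command names passed to it (java, scala, sbt, git) are the usual ones, which may differ on your system.

```shell
# Print the name of every given tool that cannot be found on PATH.
# Prints nothing when all tools are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Calling `missing_tools java scala sbt git` lists whatever still needs to be installed. IDEA itself is a desktop application and is not covered by this check.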

Install Scala plug-in

To install the Scala plug-in for IDEA, follow these steps:

1. Select File -> Settings.

2. Select Install JetBrains plugin, enter Scala on the left side of the pop-up window, and click Install.

3. Once the Scala plug-in is installed, restart IDEA for it to take effect.

Because IDEA 13 already supports SBT, you do not need to install a separate SBT plug-in.

Download and import source code

Download the source code. The following assumes that you use git to fetch the latest source.

git clone https://github.com/apache/spark.git

Generate an IDEA project (run this from the root of the cloned spark directory)

sbt/sbt gen-idea
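
The two commands above can be combined into one small helper. This is only a sketch under stated assumptions: the function name prepare_spark_project and the default target directory $HOME/spark are hypothetical, and the clone is skipped when the directory already contains a git checkout.

```shell
# Clone Spark if it is not already present, then generate the IDEA project files.
# $1: target directory (hypothetical default: $HOME/spark)
prepare_spark_project() {
    spark_dir="${1:-$HOME/spark}"
    if [ ! -d "$spark_dir/.git" ]; then
        git clone https://github.com/apache/spark.git "$spark_dir" || return 1
    fi
    # gen-idea has to run from the repository root, where the sbt/sbt script lives.
    (cd "$spark_dir" && sbt/sbt gen-idea)
}
```

Running `prepare_spark_project` with no argument uses the default directory. The subshell around cd keeps the caller's working directory unchanged.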

Import Spark Source Code

1. Select File -> Import Project and specify the Spark source directory in the pop-up window.

2. Select SBT project as the project type and click Next

3. Click Finish in the new pop-up window.

After the import settings are complete, IDEA takes a long time to compile the imported source code and build its file index.

If a message such as "is waiting for .sbt.ivy.lock" appears in the status bar, the lock cannot be acquired and the lock file needs to be deleted manually.

cd $HOME/.ivy2
rm *.lock

After deleting the lock manually, restart IDEA. Once it has restarted, the SBT process that did not finish last time will continue.
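
One caveat, assuming the lock is a hidden file such as .sbt.ivy.lock (which is what the message above suggests): a plain shell glob like *.lock does not match names that begin with a dot, so the rm command may leave the lock in place. The sketch below uses find instead, whose -name pattern does match dotfiles. The function name remove_ivy_locks is hypothetical, and -maxdepth assumes GNU or BSD find.

```shell
# Delete lock files directly under the Ivy directory and print each one removed.
# $1: Ivy directory (defaults to $HOME/.ivy2)
remove_ivy_locks() {
    ivy_dir="${1:-$HOME/.ivy2}"
    [ -d "$ivy_dir" ] || return 0
    # Unlike a shell glob, find's -name '*.lock' also matches hidden files
    # such as .sbt.ivy.lock.
    find "$ivy_dir" -maxdepth 1 -type f -name '*.lock' -print -exec rm -f {} +
}
```

Close IDEA and make sure no SBT process is still running before calling `remove_ivy_locks`, otherwise a lock that is legitimately held would be removed.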

Source code compilation

When compiling the Spark source code in IDEA, several errors appear along the way. The root cause is that the dependencies are not fully resolved when sbt/sbt gen-idea generates the project.

The solution is as follows:

1. Select File -> Project Structure.

2. In the Dependencies tab on the right, add a new entry

and select spark-core.

Other modules such as streaming-twitter, streaming-kafka, streaming-flume, and streaming-mqtt can be fixed in a similar way.

Note that compilation errors in the examples module are handled slightly differently: when adding the dependency, choose Module Dependency instead of Library, and select sql in the pop-up window.

For more on resolving these compilation errors, see this thread: http://apache-spark-user-list.1001560.n3.nabble.com/Errors-occurred-while-compiling-module-spark-streaming-zeromq-IntelliJ-IDEA-13-0-2-td1282.html

Debug LogQuery

1. Select Run -> Edit Configurations.

2. Add an Application configuration. Pay attention to the items in the window on the right: Main class, VM options, Working directory, and Use classpath of module.

-Dspark.master=local specifies the Spark running mode and can be changed as needed.

3. At this point, a "Run LogQuery" item appears in the Run menu. Run it once to make sure the code compiles successfully.

4. Set a breakpoint by clicking in the gutter on the left side of the source file, then select Run -> "Debug LogQuery". You can then inspect the variables and the call stack.
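
The Main class field in step 2 expects a fully qualified class name. As a small aid, the sketch below derives it from the source tree. It assumes that the examples live under examples/src/main/scala and that the directory layout mirrors the package names, which Scala does not enforce, so treat the result as a hint. The function name example_main_class is hypothetical.

```shell
# Print the fully qualified name of an example class, derived from its file path.
# $1: root of the Spark checkout
# $2: simple class name, e.g. LogQuery
example_main_class() {
    spark_dir="$1"
    name="$2"
    find "$spark_dir/examples/src/main/scala" -type f -name "$name.scala" 2>/dev/null |
        head -n 1 |
        # Drop everything up to the source root, drop the extension,
        # then turn path separators into package dots.
        sed -e 's|^.*/src/main/scala/||' -e 's|\.scala$||' -e 's|/|.|g'
}
```

For a checkout in ~/spark, `example_main_class ~/spark LogQuery` should print a name ending in .LogQuery that can be pasted into the Main class field. The function prints nothing if no matching file is found.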

Reference

http://8liang.cn/intellij-idea-spark-development
