Step 5: Test the Spark IDE development environment
The following error message is displayed when we select SparkPi directly and run it:
The message indicates that the Spark master cannot be found.
In this case, you need to configure the SparkPi execution environment:
Select "Edit Configurations" to open the configuration page:
In "Program arguments", enter "local":
This configuration tells SparkPi to run in local mode, so no cluster master is required.
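Why does a single "local" argument fix the error? Older Spark examples read the master URL from the first program argument. The following is a minimal sketch of a program with that argument convention (an illustration of the idea under that assumption, not the exact SparkPi source):

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical SparkPi-style program: the master URL comes from args(0),
// so passing "local" as the program argument runs it inside the IDE
// without any cluster.
object SparkPiDemo {
  def main(args: Array[String]): Unit = {
    val master = if (args.length > 0) args(0) else "local"
    val conf = new SparkConf().setAppName("SparkPiDemo").setMaster(master)
    val sc = new SparkContext(conf)
    val n = 100000
    // Monte Carlo estimate of Pi, the same idea SparkPi demonstrates.
    val count = sc.parallelize(1 to n).map { _ =>
      val x = math.random * 2 - 1
      val y = math.random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)
    println(s"Pi is roughly ${4.0 * count / n}")
    sc.stop()
  }
}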
Next, package the application using Project Structure's Artifacts:
Use "From modules with dependencies":
Select the Main Class:
Click "OK":
Change the name to SparkDemoJar:
Because Scala and Spark are installed on each machine, you can delete the Scala- and Spark-related jar files from the artifact:
Next, build: select "Build Artifacts":
The rest of the operation is to upload the jar package to the server and then run it there (a sketch of a suitable main class follows).
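Deleting the Scala and Spark jars from the artifact works because those libraries already exist on every machine in the cluster. A minimal sketch (my illustration, not the original listing) of a main class packaged this way: it does not hard-code a master URL, so the launch command on the server can supply one:

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical entry point for the packaged jar. Note there is no
// setMaster() call: the master URL is supplied when the jar is launched
// on the server, instead of being baked into the jar.
object SparkDemoJar {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SparkDemoJar"))
    println(sc.parallelize(1 to 10).reduce(_ + _))  // trivial sanity-check job
    sc.stop()
  }
}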
Reason: the Spark code was run as the root user.
Workaround: run Spark with a non-administrator account.
[[email protected] bin]$ ./add-user.sh
What type of user do you wish to add?
a) Management User (mgmt-users.properties)
b) Application User (application-users.properties)
(a): b
Enter the details of the new user to add.
Realm (ApplicationRealm) : ApplicationRealm   ---->> careful here: you need to type this or leave it blank
Select "yes" to enable automatic installation of scala plug-in idea.
In this case, it takes about 2 minutes to download and install the SDK. Of course, the download time varies depending on your network speed.
; "src =" http://s3.51cto.com/wyfs02/M02/4A/13/wKioL1QiJJPzxOm0AAFxk_FS8AU762.jpg "style =" float: none; "Title =" 51.png" alt = "wkiol1qijjpzxom0aafxk_fs8au762.jpg"/>
We can see that the program now runs correctly on the already-initialized environment, and this run is much faster than the first one.
This article is from the Spark Asia-Pacific Research Institute blog; please be sure to keep the source: http://rockyspark.blog.51cto.com/2229525/1557591
Let me share what Spark is and how to analyze data with it; readers interested in big data may find it useful. What is Apache Spark? Apache Spark is a cluster computing platform designed for speed and general-purpose computation. From a speed point of view, Spark extends the MapReduce model and can run computations in memory, which makes it much faster for many workloads.
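A short illustration of where that speed comes from (a sketch assuming the sc variable provided by spark-shell; the file name is a placeholder):

// Keep the dataset in memory: the first action pays the disk-read cost,
// and later actions over the same data are served from memory.
val data = sc.textFile("input.txt").cache()   // "input.txt" is a placeholder path
println(data.count())  // first pass: reads from disk and fills the cache
println(data.count())  // second pass: served from memory, typically much faster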
Save and run the source command to make the configuration file take effect.
Step 3: Run IDEA, then install and configure the IDEA Scala development plug-in:
The official document states:
Go to the IDEA bin directory:
Run "idea.sh" and the following page appears:
Select "Configure" to go to the IDEA configuration page:
Select "Plugins" to go to the plug-in installation page:
Click the "Install JetBrains plugin" option in the lower left corner to go to the following page:
Enter "Scala" in the search box to find the Scala plug-in:
Modify the source code of our "FirstScalaApp" to the following:
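The listing itself was a screenshot in the original post, so the exact code is not recoverable; a typical first program at this step looks like the following sketch (an assumption, not the original code):

// Hypothetical reconstruction of the FirstScalaApp source; the original
// listing was an image and is not recoverable.
object FirstScalaApp {
  def main(args: Array[String]): Unit = {
    println("Hello, Scala!")
  }
}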
Right-click "firstscalaapp" and choose "Run Scala console". The following message is displayed:
This is because we have not set the JDK path for Java. Click "OK" to go to the following view:
In this case, select the "project" option on the left:
In this case, we select "new" of "No SDK" to select the following primary View:
Click the JDK option:
Select the JDK directory we installed earlier:
Click "OK"
Click OK:
Click the f
The above is the minimal configuration of mapred-site.xml; the full list of mapred-site.xml configuration options can be found at:
http://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
Step 7: Modify the configuration file yarn-site.xml, as shown below:
Modify the content of yarn-site.xml:
The above is the minimal configuration of yarn-site.xml; the full list of yarn-site.xml configuration options can be found at:
http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
7. Perform the same Hadoop 2.2.0 operations on SparkWorker1 and SparkWorker2 as on SparkMaster. We recommend using the scp command to copy the Hadoop installation and configuration from SparkMaster to SparkWorker1 and SparkWorker2;
8. Start and verify the Hadoop distributed cluster
Step 1: Format the HDFS file system:
Step 2: Start HDFS by executing the following command in the sbin directory:
The startup process is as follows:
At this point, we
Copy the downloaded hadoop-2.2.0.tar.gz to the "/usr/local/hadoop/" directory and decompress it:
Modify the system configuration file ~/.bashrc: configure "HADOOP_HOME" and add the bin folder under "HADOOP_HOME" to the PATH. After modification, run the source command to make the configuration take effect.
Next, create a folder in the hadoop directory using the following command:
Next, modify the Hadoop configuration files. First, go to the Hadoop 2.2.0 configuration file directory:
Liaoliang Spark Open Class Grand Forum, Phase I: Spark has increased the speed of cloud computing big data by more than 100 times: http://edu.51cto.com/lesson/id-30816.html
Spark Combat Master Road series books: http://down.51cto.com/tag-Spark%E6%95%99%E7%A8%8B.html
http://www.cnblogs.com/shishanyuan/archive/2015/08/19/4721326.html
1. Spark runtime structure
1.1 Term definitions
Application: the concept of a Spark Application is similar to that of Hadoop MapReduce; it refers to a user-written Spark program, containing the driver functional code and the executor code that runs on multiple nodes in the cluster;
Driver: the driver part of the Application, which runs the Application's main() function and creates the SparkContext;
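To make the two terms concrete, here is a minimal sketch of my own (not from the cited article): everything in main() below is driver code, while the closure passed to map is shipped to executors on the cluster nodes:

import org.apache.spark.{SparkConf, SparkContext}

object DriverAndExecutors {
  def main(args: Array[String]): Unit = {
    // Driver code: main() runs here and creates the SparkContext.
    val sc = new SparkContext(new SparkConf().setAppName("DriverAndExecutors"))
    // Executor code: the function x => x * x runs on the cluster's worker nodes.
    val squares = sc.parallelize(1 to 4).map(x => x * x).collect()
    println(squares.mkString(","))  // results are collected back to the driver
    sc.stop()
  }
}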
file. This dataset has 4 columns per row, corresponding to the user ID, item ID, score, and timestamp. Because my machine is broken, in the following example I only used the first 100 rows; if you use all the data, your predictions will differ from mine. First you need to make sure that you have Hadoop and Spark installed (Spark no lower than 1.6) and that the environment variables are set up. Generally we work in an IPython notebook (
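The cited article works in Python in a notebook; purely to illustrate the 4-column layout, here is an equivalent Scala sketch (the file name u.data and the tab separator are assumptions, based on the MovieLens format these columns suggest; sc is the spark-shell SparkContext):

import org.apache.spark.mllib.recommendation.Rating

// Parse "userID <tab> itemID <tab> rating <tab> timestamp" lines into
// MLlib Rating objects; the timestamp column f(3) is simply unused.
val ratings = sc.textFile("u.data")   // hypothetical file name
  .map(_.split("\t"))
  .map(f => Rating(f(0).toInt, f(1).toInt, f(2).toDouble))
val first100 = sc.parallelize(ratings.take(100))  // mimic the article's first-100-rows trick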
Preface:
Spark has been very popular recently. This article does not discuss Spark internals; instead it examines Spark's cluster-construction and service scripts, in the hope of understanding Spark clusters from the perspective of the scripts that run them.
the PrefixSpan algorithm to sieve out overly long frequent sequences. In a distributed big-data environment, you also need to consider the number of data blocks (numPartitions) for the FPGrowth algorithm, and the number of items in the largest single projected database (maxLocalProjDBSize) for the PrefixSpan algorithm.
3. Example of using the Spark FP-tree and PrefixSpan algorithms
Here we use a concrete example to demonstrate how to use the
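As a sketch of where those two parameters go (assuming the sc variable from spark-shell; the data is made up and the thresholds are arbitrary):

import org.apache.spark.mllib.fpm.{FPGrowth, PrefixSpan}

// FPGrowth: setNumPartitions controls how many blocks the data is split into.
val transactions = sc.parallelize(Seq(
  Array("a", "b", "c"),
  Array("a", "b"),
  Array("b", "c")))
val fpModel = new FPGrowth()
  .setMinSupport(0.5)
  .setNumPartitions(4)               // the "data block number" discussed above
  .run(transactions)
fpModel.freqItemsets.collect().foreach(fi =>
  println(fi.items.mkString("[", ",", "]") + ": " + fi.freq))

// PrefixSpan: setMaxLocalProjDBSize bounds the largest single projected database.
val sequences = sc.parallelize(Seq(
  Array(Array("a"), Array("a", "b"), Array("c")),
  Array(Array("a"), Array("c"), Array("b"))))
val psModel = new PrefixSpan()
  .setMinSupport(0.5)
  .setMaxPatternLength(5)
  .setMaxLocalProjDBSize(32000000L)  // items allowed in the largest projected database
  .run(sequences)
psModel.freqSequences.collect().foreach(fs =>
  println(fs.sequence.map(_.mkString("(", ",", ")")).mkString("<", ",", ">") + ": " + fs.freq))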
1. Preparation. 1.1 Install Spark and configure spark-env.sh. You need to install Spark before using spark-shell; please refer to http://www.cnblogs.com/swordfall/p/7903678.html. If you use only one node, you do not need to configure the slaves file; the