Spark Tutorial - Building a Spark Cluster - Configuring Hadoop Pseudo-Distributed Mode and Running a Wordcount Example (1)

Hadoop's pseudo-distributed mode involves the following configuration changes:

Modify Hadoop's core configuration file, core-site.xml, mainly to set the HDFS address and port number;

Modify the HDFS configuration file, hdfs-site.xml, mainly to set the replication factor;

Modify Hadoop's MapReduce configuration file, mapred-site.xml, mainly to set the JobTracker address and port.

Before making these changes, we first create a few folders under the Hadoop installation directory:
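A minimal sketch of this step; the directory names are assumptions and only need to match the paths used in the XML files below:

```bash
# Run from the Hadoop installation directory (path assumed).
mkdir tmp            # working area for hadoop.tmp.dir
mkdir -p hdfs/name   # NameNode metadata (dfs.name.dir)
mkdir -p hdfs/data   # DataNode block storage (dfs.data.dir)
```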

Now let's build and test the pseudo-distributed setup step by step.

First, configure the core-site.xml file. Open it in an editor:
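In a Hadoop 1.x layout the configuration files live under conf/ in the installation directory, so for example:

```bash
vim conf/core-site.xml
```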

After configuration, the contents of the file are as follows:
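A minimal pseudo-distributed core-site.xml for Hadoop 1.x; localhost:9000 is the conventional HDFS address, and the hadoop.tmp.dir path is an assumption that should point at the tmp folder created earlier:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- HDFS address and port number -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <!-- Base for Hadoop's working files; path assumed, match the tmp folder created above -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/rocky/hadoop/tmp</value>
  </property>
</configuration>
```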

Use the ": wq" command to save and exit.

Next, configure hdfs-site.xml. Open the file:
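As before:

```bash
vim conf/hdfs-site.xml
```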

After configuration, the contents of the file are as follows:
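A sketch assuming a single-node setup and the hdfs/name and hdfs/data folders created earlier (paths assumed):

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- A single node can hold only one replica of each block -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- NameNode metadata and DataNode block storage; paths assumed -->
  <property>
    <name>dfs.name.dir</name>
    <value>/home/rocky/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/rocky/hadoop/hdfs/data</value>
  </property>
</configuration>
```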

Enter ": wq" to save the changes and exit.

Next, modify the mapred-site.xml configuration file. Open it:
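Again:

```bash
vim conf/mapred-site.xml
```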

The content of the modified mapred-site.xml configuration file is:
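For Hadoop 1.x, the key setting is the JobTracker address; port 9001 is the conventional companion to HDFS's 9000:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- JobTracker address and port -->
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
```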

Use the ": wq" command to save and exit.

With the above changes, the simplest possible pseudo-distributed configuration is complete.

Next, format the Hadoop NameNode:
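From the Hadoop installation directory:

```bash
bin/hadoop namenode -format
```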

Enter "Y" to complete the formatting process:

Next, start Hadoop:
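The start-all.sh script brings up HDFS and MapReduce together:

```bash
bin/start-all.sh
```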

Use the jps command that ships with the JDK to list the running Java daemons:
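In a healthy pseudo-distributed setup, jps should report five Hadoop daemons in addition to itself (process IDs will differ):

```bash
jps
# Expected daemons:
#   NameNode
#   DataNode
#   SecondaryNameNode
#   JobTracker
#   TaskTracker
```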

Hadoop is up and running!

Then use Hadoop's built-in web pages to monitor the status of the cluster; the pages are as follows:

http://localhost:50030/jobtracker.jsp
http://localhost:50060/tasktracker.jsp
http://localhost:50070/dfshealth.jsp

The monitoring pages above show that our pseudo-distributed development environment is fully set up!

Next, let's run the wordcount program on the newly built pseudo-distributed platform.

First, create the input directory in HDFS:
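Using Hadoop 1.x's dfs shell (a relative path is resolved against the current user's HDFS home directory):

```bash
bin/hadoop dfs -mkdir input
```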

Because no absolute HDFS path was specified, the "input" directory is created under the HDFS home directory of the current user, "rocky" (i.e. /user/rocky/input). This can be verified in the web console.

Then perform a file copy operation to put data into the input directory:
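A sketch of the copy step; the choice of local file is an assumption, as any text file works as wordcount input:

```bash
# Copy a local file into the HDFS input directory (file name assumed)
bin/hadoop dfs -copyFromLocal README.txt input
```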

Continued in: Spark Tutorial - Building a Spark Cluster - Configuring Hadoop Pseudo-Distributed Mode and Running Wordcount (2)
