Mahout 0.10.1 Installation (Hadoop 2.6.0) and K-means Test

Source: Internet
Author: User
Tags hadoop fs

1. Versions and Installation Paths

Ubuntu 14.04

MAHOUT_HOME=/opt/mahout-0.10.1

HADOOP_HOME=/usr/local/hadoop

MAVEN_HOME=/opt/apache-maven-3.3.3

Hadoop version=2.6.0

Mahout version=0.10.1

Maven version=3.3.3

2. Recompiling Mahout

Mahout Download: http://archive.apache.org/dist/mahout/

Mahout needs to be recompiled before it is used with Hadoop versions above 2.0.
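As a rough illustration of the rule above, the following sketch decides from a version string whether a rebuild is called for. The function name `needs_rebuild` is hypothetical, and the cutoff (major version 2 or higher) is simply this article's rule of thumb expressed in shell, not something Mahout itself checks.

```shell
# Hypothetical helper: succeeds (exit 0) when the given Hadoop version
# string has a major version of 2 or higher, i.e. when this article
# says Mahout should be rebuilt. Returns 2 if the version is unparsable.
needs_rebuild() {
  local version="$1"
  local major="${version%%.*}"   # text before the first dot, e.g. "2"
  case "$major" in
    ''|*[!0-9]*) return 2 ;;     # not a plain number: cannot decide
  esac
  [ "$major" -ge 2 ]
}

# Usage sketch: the first line of `hadoop version` normally looks like
# "Hadoop 2.6.0", so the second field is the version string.
# installed="$(hadoop version | awk 'NR==1 {print $2}')"
# if needs_rebuild "$installed"; then echo "rebuild Mahout"; fi
```

With the versions used in this article, `needs_rebuild "2.6.0"` succeeds, so the rebuild steps that follow apply.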

$ git clone https://github.com/apache/mahout.git

$ mvn clean package -Dhadoop2 -Dhadoop2.version=2.6.0 -DskipTests=true

After the compilation is complete, the build produces these two files:

\mahout\examples\target\mahout-examples-snapshot-0.10.1.jar

\mahout\examples\target\mahout-examples-snapshot-0.10.1-job.jar

Use them to replace the two files mahout-examples-0.10.1.jar and mahout-examples-0.10.1-job.jar in the Mahout directory.

3. Environment Variables

sudo gedit ~/.bashrc

  

#Mahout
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export MAHOUT_HOME=/opt/mahout-0.10.1
export MAHOUT_CONF_DIR=$MAHOUT_HOME/conf
export PATH=$PATH:$HADOOP_HOME/bin:$MAHOUT_HOME/bin
#Maven
MAVEN_HOME=/opt/apache-maven-3.3.3
export MAVEN_HOME
export PATH=${PATH}:${MAVEN_HOME}/bin

Adjust the installation paths to match your own system.

To make the environment variable changes take effect immediately:

source ~/.bashrc

Run the mahout command under the Mahout installation path. If the command is found and runs, the installation is successful.
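A quick sanity check of the variables can catch path typos before the first job is run. The sketch below is only one possible check: `check_install` is a hypothetical helper, and it assumes the layout implied by this section (a Hadoop configuration directory at etc/hadoop, and conf and bin/mahout under the Mahout directory).

```shell
# Hypothetical helper: report whether the directories implied by the
# variables from this section exist. Paths are passed as arguments so
# the function can also be tried against scratch directories.
check_install() {
  local hadoop_home="$1" mahout_home="$2" status=0
  if [ ! -d "$hadoop_home/etc/hadoop" ]; then
    echo "missing Hadoop config dir: $hadoop_home/etc/hadoop"
    status=1
  fi
  if [ ! -d "$mahout_home/conf" ]; then
    echo "missing Mahout config dir: $mahout_home/conf"
    status=1
  fi
  if [ ! -x "$mahout_home/bin/mahout" ]; then
    echo "missing Mahout launcher: $mahout_home/bin/mahout"
    status=1
  fi
  return "$status"
}

# Usage sketch (after `source ~/.bashrc`):
# check_install "$HADOOP_HOME" "$MAHOUT_HOME" && echo "environment looks consistent"
```

The function prints one line per missing item and returns a non-zero status if anything is missing, so it can be chained with `&&` as in the usage comment.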

4. Simple K-means Run

Download the test data set synthetic_control.data:

http://archive.ics.uci.edu/ml/databases/synthetic_control/
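Before uploading, it may be worth confirming that the download is complete. The sketch below assumes the shape given in the UCI description of this data set (600 rows of 60 whitespace-separated values each); that figure does not come from this article, and `check_dataset` is a hypothetical helper name.

```shell
# Hypothetical helper: check that a local data file has the expected
# number of non-blank rows and fields per row. The defaults (600 x 60)
# are an assumption taken from the UCI data set description.
check_dataset() {
  local file="$1" rows="${2:-600}" cols="${3:-60}"
  [ -r "$file" ] || { echo "cannot read $file"; return 1; }
  awk -v rows="$rows" -v cols="$cols" '
    NF == 0 { next }                      # ignore blank lines
    { n++ }
    NF != cols {
      printf "line %d has %d fields, expected %d\n", NR, NF, cols
      bad = 1
    }
    END {
      if (n != rows) {
        printf "found %d rows, expected %d\n", n, rows
        bad = 1
      }
      exit (bad ? 1 : 0)
    }
  ' "$file"
}

# Usage sketch:
# check_dataset /home/hadoop/Desktop/synthetic_control.data && echo "data file looks complete"
```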

Create the testdata directory in HDFS. The name must be exactly testdata, because the example job reads its input from that directory. Also, before every run, delete the output directory left by the previous run.
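The cleanup step can be scripted so that it is safe to call even on a first run. This is a minimal sketch: `clean_hdfs_dir` is a hypothetical helper, and the default directory name output is an assumption about where the example job writes its results. It relies only on the standard `hadoop fs -test -d` and `hadoop fs -rm -r` commands.

```shell
# Hypothetical helper: remove an HDFS directory only if it exists.
# The default name "output" is an assumption about the job's result
# directory; pass another name if your job writes elsewhere.
clean_hdfs_dir() {
  local dir="${1:-output}"
  if hadoop fs -test -d "$dir"; then
    hadoop fs -rm -r "$dir"
  else
    echo "nothing to delete: $dir"
  fi
}

# Usage sketch, run before each k-means job:
# clean_hdfs_dir output
```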

bin/hadoop fs -mkdir -p testdata

Upload the data file to the testdata directory in HDFS:

hadoop fs -copyFromLocal /home/hadoop/Desktop/synthetic_control.data testdata

Start K-means from the Mahout installation directory:

mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
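The run and a first look at the results can be combined in a small wrapper. The sketch below is illustrative only: `run_kmeans_example` is a hypothetical name, and output as the result directory is an assumption. It uses the job class from this article and the standard `hadoop fs -ls` command.

```shell
# Hypothetical wrapper: run the example job, then list what it wrote.
# "output" as the result directory is an assumption.
run_kmeans_example() {
  local out_dir="${1:-output}"
  mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job || {
    echo "k-means job failed" >&2
    return 1
  }
  hadoop fs -ls "$out_dir"
}

# Usage sketch:
# run_kmeans_example output
```

Stopping when the job fails avoids listing a stale output directory from an earlier run.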

Results: [screenshot not preserved]

To view the output directory: [screenshot not preserved]

Under Eclipse: [screenshot not preserved]
