Installation of Mahout
Mahout is an advanced application built on Hadoop, so running Mahout requires a working Hadoop installation. Mahout only needs to be installed on one node of the Hadoop cluster, the NameNode; it does not need to be installed on the data nodes.
1. Download
2. Configure environment variables
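A minimal sketch of this step, assuming Mahout 0.7 was unpacked to /home/hadoop/mahout-distribution-0.7 (the path used in the commands later in this guide). The Hadoop path below is a hypothetical placeholder, not something this guide specifies; replace both paths with your own. Lines like these typically go in the shell startup file of the user that runs Hadoop.

```shell
# Assumed install locations; adjust both paths to your own system.
export HADOOP_HOME=/home/hadoop/hadoop                    # hypothetical Hadoop path
export MAHOUT_HOME=/home/hadoop/mahout-distribution-0.7   # path used later in this guide

# Make the hadoop and mahout launcher scripts available on the command line.
export PATH="$PATH:$HADOOP_HOME/bin:$MAHOUT_HOME/bin"
```

After editing the startup file, open a new shell (or source the file) so the variables take effect.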
3. Run mahout --help
Check whether Mahout is installed properly by confirming that a list of algorithms is printed.
This check alone is not conclusive; the installation can be verified more thoroughly with the following steps.
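Before running mahout --help, it can be useful to confirm that the shell can actually find both launchers. The sketch below is only a convenience check; the function names have_command and check_tools are made up for this example and are not part of Mahout or Hadoop.

```shell
# Return 0 if the named command can be found on PATH, non-zero otherwise.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the hadoop and mahout launchers are visible to the shell.
# Returns 1 if at least one of them is missing.
check_tools() {
    status=0
    for tool in hadoop mahout; do
        if have_command "$tool"; then
            echo "$tool: found at $(command -v "$tool")"
        else
            echo "$tool: not found on PATH"
            status=1
        fi
    done
    return "$status"
}
```

Calling check_tools prints one line per tool. A "not found" line usually points to a problem with the environment variables from step 2.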
4. Prepare to use Mahout
A. Download the file synthetic_control.data from http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data and put it in the $MAHOUT_HOME directory.
B. Check the Hadoop status, and start Hadoop if it is not already running.
C. Create the test directory testdata in HDFS and import the data into it (the directory must be named testdata, since the example job reads its input from there)
$ hadoop fs -mkdir testdata
$ hadoop fs -put /home/hadoop/mahout-distribution-0.7/synthetic_control.data testdata
D. Run the k-means algorithm (this takes a few minutes)
$ hadoop jar /home/hadoop/mahout-distribution-0.7/mahout-examples-0.7-job.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
E. View the results
$ hadoop fs -lsr output
If the listing contains the following entries, the algorithm ran successfully and your installation works.
clusteredPoints clusters-0 clusters-1 clusters-10 clusters-2 clusters-3 clusters-4 clusters-5 clusters-6 clusters-7 clusters-8 clusters-9 data
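Steps C to E can be collected into one small script. This is a sketch, not an official Mahout tool: the function names run and run_kmeans_example and the DRY_RUN switch are invented for this example, and the default MAHOUT_HOME is simply the path used in the commands of this guide.

```shell
# Paths taken from the commands in this guide; adjust them to your installation.
MAHOUT_HOME=${MAHOUT_HOME:-/home/hadoop/mahout-distribution-0.7}
EXAMPLES_JAR=$MAHOUT_HOME/mahout-examples-0.7-job.jar
JOB_CLASS=org.apache.mahout.clustering.syntheticcontrol.kmeans.Job

# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps C to E: upload the data, run the k-means example, list the output.
# The chain stops at the first command that fails.
run_kmeans_example() {
    run hadoop fs -mkdir testdata &&
    run hadoop fs -put "$MAHOUT_HOME/synthetic_control.data" testdata &&
    run hadoop jar "$EXAMPLES_JAR" "$JOB_CLASS" &&
    run hadoop fs -lsr output
}
```

Setting DRY_RUN=1 before calling run_kmeans_example prints the four commands without touching the cluster, which is a cheap way to check the paths first. On newer Hadoop releases, hadoop fs -lsr is deprecated in favor of hadoop fs -ls -R.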