① Download the latest Mahout release from the official website, place it in the /usr/local/ directory of the local Linux system, and unpack the archive:
tar -zxvf mahout-distribution-0.9.tar.gz
② Rename the unpacked folder to mahout:
mv mahout-distribution-0.9 mahout
③ Run vi /etc/profile and configure the Mahout environment as follows:
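The exact entries depend on your installation. As a minimal sketch, assuming Mahout lives in /usr/local/mahout as set up in the previous steps, the lines appended to /etc/profile might look like the following. The variable values are assumptions to adjust for your own machine, not settings taken from this walkthrough.

```shell
# Assumed values: MAHOUT_HOME matches the /usr/local/mahout folder created in step ②.
export MAHOUT_HOME=/usr/local/mahout
# Assumed: the configuration directory shipped inside the distribution.
export MAHOUT_CONF_DIR=$MAHOUT_HOME/conf
# Put the mahout launcher script on the PATH.
export PATH=$PATH:$MAHOUT_HOME/bin
```

With these entries in place, the mahout script under $MAHOUT_HOME/bin can be called from any directory once the profile has been reloaded.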
④ Run source /etc/profile so that the configuration takes effect immediately.
⑤ Download the test data:
Download the file synthetic_control.data from http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
Upload the downloaded file to the HDFS directory /user/root/testdata/ (note: I am logged in as the root user).
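The download and upload can be sketched as a small script. The URL and HDFS path are the ones given in this step; the DRY_RUN switch and the run helper are illustrative additions, not part of Hadoop or Mahout. By default the script only prints the commands, so it is safe to try before touching the cluster.

```shell
# Sketch only: DATA_URL and HDFS_DIR come from step ⑤; DRY_RUN and run()
# are hypothetical helpers added for illustration.
DATA_URL="http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data"
HDFS_DIR="/user/root/testdata"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually execute the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Fetch the data set into the current directory.
run wget "$DATA_URL"
# Create the target directory on HDFS (on Hadoop 1.x, drop the -p flag;
# mkdir creates parent directories there by default).
run hadoop fs -mkdir -p "$HDFS_DIR"
# Copy the local file into HDFS.
run hadoop fs -put synthetic_control.data "$HDFS_DIR/"
```

Running it with DRY_RUN=0 performs the real download and upload, assuming wget and the hadoop command are available on the PATH.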
⑥ Run the k-means algorithm as a test:
hadoop jar /usr/local/mahout/mahout-examples-0.9-job.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
After a short time, the clustered data is generated under the /user/root/output directory on HDFS.