Introduction

Previously I successfully patched Mahout 0.9 on the server so that it supports Hadoop 2.2.0. Today's task: develop a Mahout program in a Win7 + Eclipse + Maven environment, package it into a jar, and copy it to the cluster so that it runs normally under Hadoop 2.2.0.
Process
Step One: Create a Maven project under Eclipse

pom.xml:

1. Introduce the Mahout dependency:

<dependencies>
  <dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-core</artifactId>
    <version>0.9</version>
  </dependency>
</dependencies>
2. Configure the assembly plugin so the dependencies are bundled into the jar:

<build>
  <plugins>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <mainClass>cn.fulong.bigdata.ItemCFHadoop</mainClass>
          </manifest>
        </archive>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
    </plugin>
  </plugins>
</build>
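Put together, a minimal pom.xml would look like the sketch below. Only the Mahout dependency and the assembly-plugin configuration come from the steps above; the project's own groupId, artifactId, and version are placeholders I have assumed for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Placeholder coordinates for the example project -->
  <groupId>cn.fulong.bigdata</groupId>
  <artifactId>itemcf-hadoop</artifactId>
  <version>1.0-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>org.apache.mahout</groupId>
      <artifactId>mahout-core</artifactId>
      <version>0.9</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass>cn.fulong.bigdata.ItemCFHadoop</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
```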
Step Two (the key step): Take the patched mahout-core-0.9.jar and mahout-math-0.9.jar compiled on the cluster and overwrite the corresponding files in the Windows Maven local repository. I tried copying the patched pom files to Windows and compiling the Mahout 0.9 source there, but it failed with all kinds of errors. Since the program relies on only two Mahout jars, mahout-core-0.9.jar and mahout-math-0.9.jar, it is enough to overwrite the local copies of those two jars with the Hadoop-2.2.0-compatible versions built on the cluster. If you skip this step, the job will throw a Hadoop compatibility exception once the jar is run on the cluster.
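The overwrite can be sketched as the script below. It runs in a sandbox so it is safe to try anywhere; on the real machine, M2_REPO would be the local Maven repository (~/.m2/repository, i.e. %USERPROFILE%\.m2\repository on Win7) and PATCHED_DIR the folder holding the two jars copied down from the cluster — both directory names here are stand-ins.

```shell
# Sandbox demonstration: both paths below are stand-ins. On a real Win7
# machine, M2_REPO is the local Maven repository and PATCHED_DIR holds the
# two jars copied down from the cluster.
M2_REPO=$(mktemp -d)/repository
PATCHED_DIR=$(mktemp -d)

# Create the standard Maven repository layout and stand-in jar files.
mkdir -p "$M2_REPO/org/apache/mahout/mahout-core/0.9" \
         "$M2_REPO/org/apache/mahout/mahout-math/0.9"
touch "$PATCHED_DIR/mahout-core-0.9.jar" "$PATCHED_DIR/mahout-math-0.9.jar"

# The actual step: overwrite the stock Mahout 0.9 artifacts with the
# Hadoop-2.2.0-compatible builds from the cluster.
cp "$PATCHED_DIR/mahout-core-0.9.jar" "$M2_REPO/org/apache/mahout/mahout-core/0.9/"
cp "$PATCHED_DIR/mahout-math-0.9.jar" "$M2_REPO/org/apache/mahout/mahout-math/0.9/"

echo "replaced mahout-core and mahout-math in $M2_REPO"
```

On Windows you can just as well copy the files by hand in Explorer; the targets follow the standard Maven repository layout, e.g. .m2\repository\org\apache\mahout\mahout-core\0.9\mahout-core-0.9.jar.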
Step Three: Package. In the Windows environment, run this command in the project root directory: mvn assembly:assembly. The generated jar is placed under target/ in the project root, with a name like Xxxxx-jar-with-dependencies.jar.
Step Four: Copy the jar to the cluster and execute it.

Note: run it with hadoop jar, not java -jar! Only hadoop jar puts the cluster's Hadoop configuration and libraries on the classpath, so the program can find the Hadoop resources it needs.
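The launch can be sketched as below. The jar name is the placeholder from step three, and the guard exists only so the sketch can run on a machine without Hadoop installed.

```shell
# Placeholder name from step three
JAR=target/xxxxx-jar-with-dependencies.jar

if command -v hadoop >/dev/null 2>&1 && [ -f "$JAR" ]; then
  # 'hadoop jar' prepends the cluster's configuration and Hadoop jars to the
  # classpath before invoking the manifest's Main-Class.
  hadoop jar "$JAR"
else
  # 'java -jar' would start the same Main-Class, but without the cluster's
  # Hadoop classpath and configuration -- which is why it fails there.
  echo "skipping: hadoop or $JAR not found on this machine"
fi
```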
"Gandalf": Win7 + Eclipse + Maven Mahout programming, made compatible with Hadoop 2.2.0 environments