MapReduce 2.x Programming Series 1: Building a basic Maven project
This is a Maven project. With Maven installed (3.2.3 in this setup), check the version:
mvn --version
Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 2014-08-12T04:58:10+08:00)
Maven home: /opt/apache-maven-3.2.3
Java version: 1.7.0_09, vendor: Oracle Corporation
Java home: /data/hadoop/data1/usr/local/jdk1.7.0_09/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.18-348.6.1.el5", arch: "amd64", family: "unix"
Run the following command to create a project:
mvn archetype:generate -DgroupId=org.freebird -DartifactId=mr1_example1 -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
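The command should produce the standard quickstart layout, roughly:

mr1_example1/pom.xml
mr1_example1/src/main/java/org/freebird/App.java
mr1_example1/src/test/java/org/freebird/AppTest.java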
Go to the project directory, open pom.xml, and modify it as follows:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.freebird</groupId>
  <artifactId>mr1_example1</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>mr1_example1</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>1.2.1</version>
    </dependency>
  </dependencies>
</project>
Delete the test directory.
rm -rf src/test/
Run mvn clean compile to compile the project.
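To check that the project can actually build MapReduce code, you can drop a class like the following into src/main/java/org/freebird/. This is a minimal word-count sketch against the org.apache.hadoop.mapreduce API available in hadoop-core 1.2.1; the class name WordCount and the input/output paths are placeholders, not part of the original project.

package org.freebird;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emit (word, 1) for every token in the input line.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sum the counts for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count");  // constructor form used by Hadoop 1.x
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

After packaging with mvn clean package, such a job would typically be submitted with hadoop jar target/mr1_example1-1.0-SNAPSHOT.jar org.freebird.WordCount <input> <output>, where <input> and <output> are HDFS paths of your choosing.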
Why do I need Hadoop to run Mahout? I built a Mahout example with Eclipse and Maven, and it never touched the pseudo-distributed Hadoop cluster I had set up earlier.
Mahout does not strictly depend on Hadoop. To understand the relationship, you first have to look at what each project actually does:
Mahout is a machine learning library that implements a number of classic machine learning algorithms;
Hadoop is an open-source distributed data processing engine (MapReduce, in Hadoop v1) commonly used for large-scale data processing;
Some Mahout algorithms are also implemented with the MapReduce programming model, so they can run on the Hadoop platform;
So the two sit at different levels and play different roles, and the example you ran indeed did not use Hadoop, as the sketch below illustrates.
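To make that concrete, here is a minimal sketch of a Mahout (Taste) recommender running entirely in a single local JVM, with no Hadoop involved. The file name ratings.csv, the neighborhood size, and the user ID are illustration values, not anything from the original post.

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class LocalRecommenderDemo {
  public static void main(String[] args) throws Exception {
    // ratings.csv holds lines of "userID,itemID,preference" -- a plain local file, no HDFS.
    DataModel model = new FileDataModel(new File("ratings.csv"));
    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(10, similarity, model);
    Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);

    // Recommend 3 items for user 1; everything runs in this one process.
    List<RecommendedItem> recommendations = recommender.recommend(1L, 3);
    for (RecommendedItem item : recommendations) {
      System.out.println(item);
    }
  }
}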
I don't know where to start: I have some basic Java knowledge, my company wants us to take over development of an information management system, and I want to improve my Java skills.
Start by mastering the basics of servlets, then JSP syntax and JavaScript, including how to work with JSON objects; these are the core pieces of Java web development.
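As a starting point, a servlet in its most basic form looks something like the sketch below (javax.servlet API; the class name HelloServlet and the /hello URL pattern are placeholders for illustration).

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Maps the servlet to /hello (Servlet 3.0+ annotation; older containers declare this in web.xml).
@WebServlet("/hello")
public class HelloServlet extends HttpServlet {
  @Override
  protected void doGet(HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException {
    response.setContentType("application/json;charset=UTF-8");
    PrintWriter out = response.getWriter();
    // Hand-written JSON for illustration; real code would build it with a JSON library.
    out.print("{\"message\": \"hello\"}");
  }
}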