At first I ran the WordCount program from a Win7 64-bit machine connected remotely to Hadoop, but that always required a working network connection, so I decided to move the whole environment onto Ubuntu.
Things to prepare:
A Hadoop package, the Eclipse plugin that connects to Hadoop (it is inside the unpacked Hadoop archive), and a modified hadoop-core-*.jar (to get around a connection-permission check).
An Eclipse .tar.gz package (other package types work too; Eclipse itself needs no installation, so I won't dwell on it).
Because I had already built this environment on Win7, everything went smoothly, but I still want to record the steps here.
1. Copy the plugin into Eclipse's plugins directory (the icon shown below will appear), and copy the modified hadoop-core-*.jar into Hadoop's installation directory.
A note on hadoop-core-*.jar: the FileUtil class inside this jar enforces a permission check. With a decompilation tool you can decompile the class, modify it, recompile, and repackage the jar. I didn't try that here; I simply downloaded an already-modified jar from the Internet.
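The copy step can be sketched as shell commands. The paths below are assumptions, not the article's actual locations; substitute wherever you unpacked Eclipse and Hadoop, and keep a backup of the original jar:

```shell
# Assumed locations -- adjust to your own Eclipse and Hadoop directories.
ECLIPSE_HOME="${ECLIPSE_HOME:-$HOME/eclipse}"
HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop}"

# 1) Drop the Eclipse plugin jar into Eclipse's plugins directory.
if [ -d "$ECLIPSE_HOME/plugins" ]; then
  cp hadoop-eclipse-plugin*.jar "$ECLIPSE_HOME/plugins/" 2>/dev/null || true
fi

# 2) Replace hadoop-core-*.jar in the Hadoop install dir, keeping a backup.
if [ -d "$HADOOP_HOME" ]; then
  for jar in "$HADOOP_HOME"/hadoop-core-*.jar; do
    [ -f "$jar" ] && mv "$jar" "$jar.bak"
  done
  cp hadoop-core-*.jar "$HADOOP_HOME/" 2>/dev/null || true
fi
```

Restart Eclipse after copying the plugin so it is picked up.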
2. Configure Hadoop's installation location in Eclipse.
3. Configure the MapReduce location in Eclipse.
I found that even when port 9001 does not match, the DFS connection still succeeds, but it is better to configure it correctly anyway.
UBUNTU1 is the hostname of the machine running Hadoop; an IP address works just as well.
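For reference, the Eclipse location settings should mirror the cluster's own configuration. The snippet below is a sketch of the relevant entries for Hadoop 1.x (property names are the classic ones; the hostname and ports follow the values used in this article and may differ on your setup):

```xml
<!-- core-site.xml: must match the "DFS Master" host/port in Eclipse -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://ubuntu1:9000</value>
</property>

<!-- mapred-site.xml: must match the "Map/Reduce Master" host/port in Eclipse -->
<property>
  <name>mapred.job.tracker</name>
  <value>ubuntu1:9001</value>
</property>
```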
Once Hadoop is started, you can refresh the DFS view in Eclipse.
4. Then you can run the WordCount program. There are many examples online, so here I only want to make two points. First, pay attention to the parameters, for example:
hdfs://192.168.1.200:9000/feng/hello.txt hdfs://192.168.1.200:9000/feng_out
Second, the output directory must not already exist; if it does, the job fails with an error.
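Putting those two points together, a run from the command line looks roughly like this. The jar name and driver class are placeholders for illustration (the article runs the job from Eclipse); the HDFS paths are the ones used above, and `fs -rmr` is the old Hadoop 1.x syntax:

```shell
# Assumed Hadoop location -- adjust to your install directory.
HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop}"

if [ -x "$HADOOP_HOME/bin/hadoop" ]; then
  # Remove a stale output directory first; the job refuses to start if it exists.
  "$HADOOP_HOME/bin/hadoop" fs -rmr hdfs://192.168.1.200:9000/feng_out

  # Argument order: input file first, then the not-yet-existing output directory.
  "$HADOOP_HOME/bin/hadoop" jar wordcount.jar WordCount \
    hdfs://192.168.1.200:9000/feng/hello.txt \
    hdfs://192.168.1.200:9000/feng_out
fi
```

When running inside Eclipse instead, these same two URIs go into the Run Configuration's program arguments.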
At first I left out the IP address and kept getting errors. After searching online I thought an ordinary user's permissions were insufficient and raised them, but later found that was not it; the problem was the address.
If there is any mistake, please correct me.