Maven Package Hadoop Project (with third-party jars)
Issue background:
1. How to package a MapReduce program that uses a third-party jar and submit the project to the server for execution.
2. In Mahout's item-based algorithm, the UID has to be mapped from a string to a long.
The specific features implemented here are as follows. The data format for the Mahout item-based algorithm is:
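For point 2 (mapping string UIDs to longs), a small bidirectional table is enough. The sketch below is illustrative only, not Mahout's own implementation (Mahout ships an IDMigrator interface for the same purpose); the class and method names are made up for this example:

import java.util.HashMap;
import java.util.Map;

// Assigns each string UID a stable long id and remembers the reverse mapping,
// so recommendation results can be translated back to the original string UIDs.
public class UidMapper {
    private final Map<String, Long> forward = new HashMap<String, Long>();
    private final Map<Long, String> reverse = new HashMap<Long, String>();
    private long next = 0L;

    public synchronized long toLongId(String uid) {
        Long id = forward.get(uid);
        if (id == null) {
            id = next++;
            forward.put(uid, id);
            reverse.put(id, uid);
        }
        return id;
    }

    public synchronized String toStringId(long id) {
        return reverse.get(id); // null if the id was never issued
    }
}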
If you execute the following command:
cd /home/hadoop/
hadoop jar ./test/wordcount/wordcount.jar org.codetree.hadoop.v1.WordCount /test/chqz/input /test/chqz/output
So what exactly does this command do internally?
1. First, in the ${HADOOP_HOME}/bin/hadoop script we can see the following code:
Since this is $starting_s
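The script excerpt is cut off above, but the upshot (a fact about stock Hadoop, not taken from this excerpt) is that for the jar subcommand the script dispatches to org.apache.hadoop.util.RunJar, which loads the user jar and reflectively invokes its main class (the real RunJar also unpacks the jar and can read the main class from the manifest). A simplified, illustrative re-implementation:

import java.io.File;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

public class MiniRunJar {
    public static void main(String[] args) throws Exception {
        String jarPath = args[0];   // e.g. ./test/wordcount/wordcount.jar
        String mainClass = args[1]; // e.g. org.codetree.hadoop.v1.WordCount
        String[] rest = new String[args.length - 2];
        System.arraycopy(args, 2, rest, 0, rest.length);

        // Load the user jar, look up its main class, and invoke main() reflectively
        URL[] urls = { new File(jarPath).toURI().toURL() };
        URLClassLoader loader = new URLClassLoader(urls, MiniRunJar.class.getClassLoader());
        Class<?> clazz = Class.forName(mainClass, true, loader);
        Method main = clazz.getMethod("main", String[].class);
        main.invoke(null, (Object) rest); // hand the remaining args to the user program
    }
}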
Error 1: [[email protected] hadoop]# ant -Dversion=1.2.1 examples
Error: The main class org.apache.tools.ant.launch.Launcher could not be found or loaded.
Solution: export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib:/usr/share/ant/lib/ant-launcher.jar
Error 2: BUILD FAILED /usr/local/hadoop/build.xml:634: execute failed: java.io.IOException: Cannot run program 'autoreconf' (in directory "/usr/local/
A brief introduction: debugging Hadoop 2 code in Eclipse on Windows. We configured the hadoop-eclipse-plugin-2.6.0.jar plugin under Windows Eclipse, ran into a series of problems running Hadoop code, and it took days to finally get the code running. Below we look at each problem and how to solve it, offered as a reference for anyone who hits the same issues I encountered.
First, the environment: there are two clusters, a new one and an old one; the plan is to turn the old one off once the new one is debugged.
New: Cloudera Express 5.6.0, CDH 5.6.0
Old: Cloudera Express 5.0.5, CDH 5.0.5
A problem was found during the new cluster setup: when the following command was used to create an index for an LZO file, the job could not be submitted to the specified queue on the new cluster, while the same command worked normally on the old cluster: Hadoop j
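For reference, here is a minimal sketch of how a target queue is normally specified from client code. The property name is the standard MapReduce key; the queue name root.etl is purely illustrative and not taken from this excerpt:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class QueueExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.queuename", "root.etl"); // illustrative queue name
        // older MRv1 releases use the key "mapred.job.queue.name" instead
        Job job = Job.getInstance(conf, "lzo-indexer");
        System.out.println("Queue: " + job.getConfiguration().get("mapreduce.job.queuename"));
    }
}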
When running Hadoop and HBase, some jar packages were not added to the project, causing classes not to be found. Below is a statistical analysis of the problem, as shown in the following table:

Error: java.lang.ClassNotFoundException: com.google.common.base.Preconditions
Missing jar: guava-11.0.2.jar
Java PDF file rendering method [with PDFRenderer.jar download]
This example describes how to render PDF files in Java and is shared for your reference. The details are as follows:
On a recent website we needed to upload a PDF file, display a PDF page, and let the user click the page to read it online. PDFRenderer is used here.
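A minimal sketch of rendering the first page of a PDF to a PNG with the com.sun.pdfview classes shipped in PDFRenderer.jar; the file paths are illustrative, and the exact API signatures are from memory of the library, so treat them as assumptions:

import com.sun.pdfview.PDFFile;
import com.sun.pdfview.PDFPage;

import javax.imageio.ImageIO;
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class PdfFirstPage {
    public static void main(String[] args) throws Exception {
        // Memory-map the PDF and hand it to PDFRenderer
        RandomAccessFile raf = new RandomAccessFile(new File("/tmp/sample.pdf"), "r");
        FileChannel channel = raf.getChannel();
        ByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        PDFFile pdf = new PDFFile(buf);

        PDFPage page = pdf.getPage(1); // PDFRenderer numbers pages from 1
        Rectangle rect = new Rectangle(0, 0,
                (int) page.getBBox().getWidth(), (int) page.getBBox().getHeight());
        // Last argument "true" asks getImage to block until rendering finishes
        Image img = page.getImage(rect.width, rect.height, rect, null, true, true);

        // Copy onto a BufferedImage so it can be written out with ImageIO
        BufferedImage out = new BufferedImage(rect.width, rect.height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        g.drawImage(img, 0, 0, null);
        g.dispose();
        ImageIO.write(out, "png", new File("/tmp/page1.png"));
        raf.close();
    }
}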
The first attempt was to clean and package the project with a Maven build, generate the jar, and then run it on Hadoop, which failed with a ClassNotFound error: the hint said Redis.jedis.redis could not be found.
Cause of the error: the generated jar did not include the Maven dependencies.
Workaround: rebuild the Maven project, package the MapReduce program separately, and package it with the runnab
Tags: hadoop, job, dependent jar
Before running a task that needs a third-party jar, the jar can either be embedded into the package at build time, or you can rely on the classpath-loading statement already present in the hadoop-env.sh script: for f in $HADOOP_HOME/contrib/capacity-scheduler/*.
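A third option, sketched below on the assumption that the dependency has already been uploaded to HDFS (the path and jar name are illustrative), is to ship it through the distributed cache from the driver code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class ThirdPartyJarJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "uses-third-party-jar");
        // Adds a jar already on HDFS to the task classpath via the distributed cache
        job.addFileToClassPath(new Path("/libs/jedis-2.1.0.jar")); // illustrative HDFS path
        // ... set mapper/reducer, input/output paths, then submit as usual
    }
}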
Modify parameters in hadoop-env.sh:
export HADOOP_HEAPSIZE="4096"
This sets the maximum JVM memory allocation to 4096 MB; it is typically needed when the jar executes additional code beyond the map and reduce phases and that code requires more than 1 GB of memory. When the map or reduce process itself reports insufficient memory, you can instead modify the parameters in mapred-site.xml.
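For the map/reduce side, a minimal sketch of the equivalent per-job settings; the property names are the standard YARN-era keys, and the sizes are illustrative rather than values from this excerpt:

import org.apache.hadoop.conf.Configuration;

public class MemorySettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("mapreduce.map.memory.mb", "2048");      // container size for map tasks
        conf.set("mapreduce.map.java.opts", "-Xmx1638m"); // JVM heap inside the container
        conf.set("mapreduce.reduce.memory.mb", "4096");
        conf.set("mapreduce.reduce.java.opts", "-Xmx3276m");
        System.out.println(conf.get("mapreduce.map.memory.mb"));
    }
}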
FileOutputFormat.setOutputPath(job, new Path(outPath));
// Submit the job to the JobTracker and wait for it to finish
job.waitForCompletion(true);
    }
}
1. In the Eclipse project, select the program entry point to be packaged, right-click, and choose Export
2. Click the JAR file option in the Java folder
3. Select the Java files to be packed into the jar and the output directory of the jar
Requirements for JDK versions
Hadoop 2.7 and later versions require JDK 7;
Hadoop 2.6 and previous versions support JDK 6;
For the hadoop1.x.x versions, you only need to introduce 1 jar: hadoop-core
For the hadoop2.x.x versions, you need to introduce 4 jars: hadoop-common
Hadoop-hdfs
Hadoop-mapreduce-client-core
Step one
If you have not yet set up the HBase development environment, see my other blog: HBase Development Environment Building (Eclipse\MyEclipse + Maven). Step one: you need to add it as follows. Right-click on the project name, then write the pom.xml; there is no need to repeat the details here, see HBase Development Environment Building (Eclipse\MyEclipse + Maven). When you are done, write the code. Step two covers some steps after the HBase development environment is built (export exported
The errors are as follows:
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.zhen.mr.RunJob$HotMapper not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2154)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:742)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.
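The excerpt does not state the fix, but a common cause of exactly this exception is a driver that never tells Hadoop which jar carries the job classes. A hedged sketch, with the class names taken from the stack trace above and a stub standing in for the real mapper:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class RunJob {
    // Stub standing in for the real HotMapper from the stack trace
    public static class HotMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "hot");
        // Without this line, task JVMs cannot locate com.zhen.mr.RunJob$HotMapper
        job.setJarByClass(RunJob.class);
        job.setMapperClass(HotMapper.class);
    }
}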
Problem: Netty jar package conflict between HBase and Elasticsearch in one project
Event:
Using HBase and ES in the same Maven project produced this error after the program ran:
java.lang.NoSuchMethodError: io.netty.util.AttributeKey.newInstance(Ljava/lang/String;)Lio/netty/util/AttributeKey;
Searching the internet turned up a number of explanations, all pointing to mismatched Netty versions; in my own compiled directory I also saw the differen
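A small diagnostic that is often useful here (a generic technique, not something from the excerpt) is to ask the JVM which jar actually supplied the conflicting class:

public class WhichJar {
    public static void main(String[] args) {
        // Prints the jar that io.netty.util.AttributeKey was loaded from,
        // revealing whether the HBase or the Elasticsearch version of Netty won.
        Class<?> clazz = io.netty.util.AttributeKey.class;
        System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());
    }
}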
Uploading and downloading files on HDFS are basic cluster operations. The Hadoop guide has example code for uploading and downloading files, but no clear explanation of how to configure the Hadoop client. After a lengthy search and much debugging, I worked out how to configure the client for the cluster and tested working programs that you can use to manipulate files on it. First, you
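A minimal sketch of such a client, assuming the NameNode address hdfs://namenode:8020 and the user name hadoop (both illustrative):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCopy {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Connect to the cluster as a specific user
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf, "hadoop");
        fs.copyFromLocalFile(new Path("/tmp/local.txt"), new Path("/user/hadoop/local.txt")); // upload
        fs.copyToLocalFile(new Path("/user/hadoop/local.txt"), new Path("/tmp/copy.txt"));    // download
        fs.close();
    }
}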
In day-to-day Java learning and development you constantly run into all kinds of jar package downloads, but CSDN, that pit of a site, lets fellow coders charge C-coins for what was open-source and free, so the flavor has changed. I have gathered here some useful tools that daily development needs; please help yourself. After all, I walked through all these pits myself and hope to help those who come after.
No need to sa
1. hosted: a hosted repository, where you deploy your own jars; it includes releases and snapshots, with releases serving as the company's internal release repository and snapshots as the repository for internal test versions.
2. proxy: a proxy repository, used to proxy remote public repositories such as the Maven central repository; users connect to it to download jars from the central repository.