How to run Mahout on Hadoop: problem resolution

Question 1:
java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.mahout.common.HadoopUtil.getCustomJobName(HadoopUtil.java:174)
    at org.apache.mahout.common.AbstractJob.prepareJob(AbstractJob.java:614)
    at org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java: )
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
Cause: this Mahout build was compiled against Hadoop 1, where org.apache.hadoop.mapreduce.JobContext is a class; in Hadoop 2 it became an interface, so code compiled against Hadoop 1 fails to link. In short, a Mahout built for Hadoop 1 does not run on Hadoop 2.
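To see why the JVM complains, here is a minimal sketch with hypothetical types (Thing, ThingImpl, and Caller are made up for illustration; they are not Hadoop classes): code compiled while a type was a class cannot link once that type has been recompiled as an interface.

// Caller.java — version 1: Thing is a CLASS, like JobContext in Hadoop 1.
class Thing {
    public String name() { return "v1"; }
}

class ThingImpl extends Thing { }

public class Caller {
    public static void main(String[] args) {
        Thing t = new ThingImpl();
        // javac compiles this call as invokevirtual because Thing is a class here.
        System.out.println(t.name());
    }
}

// Version 2: change Thing to "interface Thing { String name(); }", make ThingImpl
// implement it, recompile ONLY those two files, and rerun Caller without recompiling it.
// The JVM now fails at link time with:
//     java.lang.IncompatibleClassChangeError: Found interface Thing, but class was expected
// which is exactly what happens when a Mahout compiled against Hadoop 1 meets Hadoop 2,
// where JobContext became an interface.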
Question 2:

# mahout hadoop jar jar.jar cmd.client.app.ClientAPP
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-0.98.8-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/mahout0.9/mahout-core-0.9-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/12/06 00:10:32 INFO common.AbstractJob: Command line arguments: {--booleanData=[true], --endPhase=[2147483647], --input=[hdfs://192.168.1.170:8020/user/root/userCF], --maxPrefsInItemSimilarity=[500], --maxPrefsPerUser=[10], --maxSimilaritiesPerItem=[100], --minPrefsPerUser=[1], --numRecommendations=[10], --output=[hdfs://192.168.1.170:8020/user/root/userCF/result/], --similarityClassname=[org.apache.mahout.math.hadoop.similarity.cooccurrence.measures.EuclideanDistanceSimilarity], --startPhase=[0], --tempDir=[hdfs://192.168.1.170:8020/tmp/1417795832292]}
14/12/06 00:10:32 INFO common.AbstractJob: Command line arguments: {--booleanData=[true], --endPhase=[2147483647], --input=[hdfs://192.168.1.170:8020/user/root/userCF], --minPrefsPerUser=[1], --output=[hdfs://192.168.1.170:8020/tmp/1417795832292/preparePreferenceMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[hdfs://192.168.1.170:8020/tmp/1417795832292]}
14/12/06 00:10:33 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/12/06 00:10:33 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
14/12/06 00:10:33 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/12/06 00:10:33 INFO client.RMProxy: Connecting to ResourceManager at crxy171.crxy/192.168.1.171:8032
14/12/06 00:10:35 INFO input.FileInputFormat: Total input paths to process : 1
14/12/06 00:10:35 INFO mapreduce.JobSubmitter: number of splits:1
14/12/06 00:10:35 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1417519292729_0030
14/12/06 00:10:35 INFO impl.YarnClientImpl: Submitted application application_1417519292729_0030
14/12/06 00:10:35 INFO mapreduce.Job: The url to track the job: http://crxy171.crxy:8088/proxy/application_1417519292729_0030/
14/12/06 00:10:35 INFO mapreduce.Job: Running job: job_1417519292729_0030
14/12/06 00:10:43 INFO mapreduce.Job: Job job_1417519292729_0030 running in uber mode : false
14/12/06 00:10:43 INFO mapreduce.Job:  map 0% reduce 0%
14/12/06 00:10:49 INFO mapreduce.Job:  map 100% reduce 0%
14/12/06 00:10:56 INFO mapreduce.Job:  map 100% reduce 100%
14/12/06 00:10:56 INFO mapreduce.Job: Job job_1417519292729_0030 completed successfully
14/12/06 00:10:56 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=54
        FILE: Number of bytes written=182573
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=326
        HDFS: Number of bytes written=187
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Rack-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=3962
        Total time spent by all reduces in occupied slots (ms)=4134
        Total time spent by all map tasks (ms)=3962
        Total time spent by all reduce tasks (ms)=4134
        Total vcore-seconds taken by all map tasks=3962
        Total vcore-seconds taken by all reduce tasks=4134
        Total megabyte-seconds taken by all map tasks=4057088
        Total megabyte-seconds taken by all reduce tasks=4233216
    Map-Reduce Framework
        Map input records=21
        Map output records=21
        Map output bytes=84
        Map output materialized bytes=46
        Input split bytes=116
        Combine input records=21
        Combine output records=7
        Reduce input groups=7
        Reduce shuffle bytes=46
        Reduce input records=7
        Reduce output records=7
        Spilled Records=14
        Shuffled Maps=1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=98
        CPU time spent (ms)=2330
        Physical memory (bytes) snapshot=460136448
        Virtual memory (bytes) snapshot=1795633152
        Total committed heap usage (bytes)=355467264
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=210
    File Output Format Counters
        Bytes Written=187
14/12/06 00:10:56 INFO client.RMProxy: Connecting to ResourceManager at crxy171.crxy/192.168.1.171:8032
14/12/06 00:10:56 INFO input.FileInputFormat: Total input paths to process : 1
14/12/06 00:10:57 INFO mapreduce.JobSubmitter: number of splits:1
14/12/06 00:10:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1417519292729_0031
14/12/06 00:10:57 INFO impl.YarnClientImpl: Submitted application application_1417519292729_0031
14/12/06 00:10:57 INFO mapreduce.Job: The url to track the job: http://crxy171.crxy:8088/proxy/application_1417519292729_0031/
14/12/06 00:10:57 INFO mapreduce.Job: Running job: job_1417519292729_0031
14/12/06 00:11:08 INFO mapreduce.Job: Job job_1417519292729_0031 running in uber mode : false
14/12/06 00:11:08 INFO mapreduce.Job:  map 0% reduce 0%
14/12/06 00:11:08 INFO mapreduce.Job: Job job_1417519292729_0031 failed with state FAILED due to: Application application_1417519292729_0031 failed 2 times due to AM Container for appattempt_1417519292729_0031_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
    at org.apache.hadoop.util.Shell.run(Shell.java:424)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java: )
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
14/12/06 00:11:08 INFO mapreduce.Job: Counters: 0
java.io.FileNotFoundException: File does not exist: /tmp/1417795832292/preparePreferenceMatrix/numUsers.bin
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1726)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1669)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1649)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1621)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:482)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1140)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1128)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1118)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:264)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:231)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:224)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1291)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
    at org.apache.mahout.common.HadoopUtil.readInt(HadoopUtil.java:339)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:172)
    at org.conan.mymahout.recommendation.ItemCFHadoop.toRunByClient(Unknown Source)
    at cmd.client.app.ClientAPP.main(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /tmp/1417795832292/preparePreferenceMatrix/numUsers.bin
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1726)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1669)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1649)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1621)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:482)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
Reference: http://qnalist.com/questions/4884816/how-to-execute-recommenderjob-without-preference-value

Someone there solved the same error: the field delimiter in his data file turned out to be a space where a comma was expected. My data file already used commas, yet it still reported the error.
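For context, RecommenderJob parses each input line as comma-separated userID,itemID[,preference], so a whitespace-delimited file will fail. If your data file has the wrong delimiter, a one-off converter like the hypothetical helper below (FixDelimiters is my own name, not part of Mahout) can normalize it before uploading to HDFS:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class FixDelimiters {
    public static void main(String[] args) throws IOException {
        // args[0] = whitespace-delimited input file, args[1] = comma-separated output file
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             PrintWriter out = new PrintWriter(
                     Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) continue;              // skip blank lines
                out.println(trimmed.replaceAll("\\s+", ",")); // "1 101 5.0" -> "1,101,5.0"
            }
        }
    }
}

Run it as java FixDelimiters ratings.txt ratings.csv, then put ratings.csv into the HDFS input directory.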
Workaround (Apache Hadoop):
Setup reference: "hadoop2.2 + mahout0.9" on Tuicool: http://www.tuicool.com/articles/ryU7Ff
Workaround (CDH):
Version: CDH 5.1.3. In my experience, the same technology often behaves differently under CDH than under Apache Hadoop, so on CDH, Mahout should also come from Cloudera's own packages.
Installing Mahout on CentOS: sudo yum install mahout
The package comes from Cloudera's repository.
vi /etc/hadoop/conf/hadoop-env.sh

Append the Mahout jars to HADOOP_CLASSPATH:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/lib/mahout/*:/usr/lib/mahout/lib/*

Now it's ready to run:
mahout hadoop jar jar.jar cmd.client.app.ClientAPP

or simply:

hadoop jar jar.jar cmd.client.app.ClientAPP
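The source of cmd.client.app.ClientAPP is not shown in this post; the stack trace only reveals that it reaches org.apache.mahout.cf.taste.hadoop.item.RecommenderJob. A minimal sketch of such a driver, with the job arguments copied from the log above (the class body itself is my assumption, not the author's code), could look like this:

package cmd.client.app;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.cf.taste.hadoop.item.RecommenderJob;

public class ClientAPP {
    public static void main(String[] args) throws Exception {
        // Arguments mirror the ones visible in the AbstractJob log lines above.
        String[] jobArgs = {
            "--input", "hdfs://192.168.1.170:8020/user/root/userCF",
            "--output", "hdfs://192.168.1.170:8020/user/root/userCF/result/",
            "--booleanData", "true",
            "--numRecommendations", "10",
            "--similarityClassname",
                "org.apache.mahout.math.hadoop.similarity.cooccurrence.measures.EuclideanDistanceSimilarity",
            "--tempDir", "hdfs://192.168.1.170:8020/tmp/" + System.currentTimeMillis()
        };
        // RecommenderJob extends AbstractJob, which implements Tool, so ToolRunner can drive it.
        System.exit(ToolRunner.run(new Configuration(), new RecommenderJob(), jobArgs));
    }
}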