solrUrl is not set, indexing will be skipped...
crawl started in: crwal
rootUrlDir = urls
threads = 10
depth = 2
solrUrl=null
topN = 2
Injector: starting at 2012-04-20 14:39:30
Injector: crawlDb: crwal/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
    at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... more
Caused by: java.lang.IllegalArgumentException: plugin.folders is not defined
    at org.apache.nutch.plugin.PluginManifestParser.parsePluginFolder(PluginManifestParser.java:78)
    at org.apache.nutch.plugin.PluginRepository.<init>(PluginRepository.java:72)
    at org.apache.nutch.plugin.PluginRepository.get(PluginRepository.java:99)
    at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:117)
    at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:70)
    ... more
12/04/20 10:14:44 INFO mapred.JobClient: map 0% reduce 0%
12/04/20 10:14:44 INFO mapred.JobClient: Job complete: job_local_0001
12/04/20 10:14:44 INFO mapred.JobClient: Counters: 0
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
    at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
First of all, please excuse the long error output. I pasted it in full so that anyone searching for these messages can find this post.
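When reading a trace like this, the last "Caused by" entry is usually the one that matters; here it is the IllegalArgumentException saying plugin.folders is not defined. As a minimal sketch (the class name RootCause is hypothetical and not part of Nutch or Hadoop), the same walk down the cause chain can be done in code:

```java
import java.lang.reflect.InvocationTargetException;

// Hypothetical helper: walks a Throwable's cause chain to the innermost exception.
final class RootCause {
    /** Returns the deepest cause in the chain, or t itself if it has no cause. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        // Stop at the end of the chain; also guard against a self-referencing cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Rebuild a nesting similar to the log, using the same exception types and messages.
        Throwable inner = new IllegalArgumentException("plugin.folders is not defined");
        Throwable outer = new RuntimeException("Error in configuring object",
                new InvocationTargetException(inner));
        // Prints: java.lang.IllegalArgumentException: plugin.folders is not defined
        System.out.println(RootCause.of(outer));
    }
}
```

Note that the "Job failed!" IOException is printed separately from the "Caused by" chain, so in practice the text log itself is the place to look for the innermost entry.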
To solve this problem, change the plugin.folders property in nutch-default.xml to:
<property>
<name>plugin.folders</name>
<value>./src/plugin</value>
<description>Directories where Nutch plugins are located. Each
element may be a relative or absolute path. If absolute, it is used
as is. If relative, it is searched for on the classpath.</description>
</property>
The <value> line is the one to change.
Good luck, everyone.
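To confirm the setting before launching a crawl, a small standalone check can help. The following is only a sketch using the JDK's XML parser, not Nutch's own configuration loading; the class name PluginFoldersCheck and the default path conf/nutch-default.xml are assumptions, and it only tests whether each folder exists relative to the working directory.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads a property from a Nutch/Hadoop-style XML config file.
final class PluginFoldersCheck {
    /** Returns the <value> of the <property> with the given <name>, or null if absent. */
    static String readProperty(File configFile, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(configFile);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            NodeList names = prop.getElementsByTagName("name");
            NodeList values = prop.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && name.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Assumed location; pass another path as the first argument if yours differs.
        File conf = new File(args.length > 0 ? args[0] : "conf/nutch-default.xml");
        String folders = readProperty(conf, "plugin.folders");
        if (folders == null) {
            System.out.println("plugin.folders is not defined in " + conf);
            return;
        }
        // The property may list several folders separated by commas.
        for (String folder : folders.split(",")) {
            File dir = new File(folder.trim());
            System.out.println(dir.getPath() + " -> "
                    + (dir.isDirectory() ? "found" : "missing")
                    + " (" + dir.getAbsolutePath() + ")");
        }
    }
}
```

Run it from the project root, since a relative value such as ./src/plugin is checked against the current working directory in this sketch.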
Addendum: the steps for running Nutch in Eclipse. It took me a day to get this straightened out, with thanks to my classmate North north. Ha ha.
http://wiki.apache.org/nutch/RunNutchInEclipse (the official English guide)
Preparation
1. Install the Subclipse plugin, the IvyDE plugin, and the Maven plugin.
2. Check out the code from https://svn.apache.org/repos/asf/nutch/trunk
3. Remove src as a source folder and use src/bin, src/java, src/test, src/testsource, src/plugin/xx/src/java, and src/plugin/xx/src/test as source folders instead.
4. Add the two JAR packages; the English guide explains which ones.
5. On the Libraries page, click Add Class Folder on the right and select Nutch's conf folder.
6. Still on the Libraries page, click Add Library on the right > IvyDE Managed Dependencies and select ivy/ivy.xml.
7. Run Ant on build.xml.
8. Refresh the Nutch project. nutch-site.xml and regex-urlfilter.xml are added under conf; fill in their configuration.
9. In nutch-default.xml, change the plugin.folders property to:
<property>
<name>plugin.folders</name>
<value>./src/plugin</value>
<description>Directories where Nutch plugins are located. Each
element may be a relative or absolute path. If absolute, it is used
as is. If relative, it is searched for on the classpath.</description>
</property>
This step is critical.
10. In the project root, create a folder named urls with a file seed.txt inside it; in seed.txt, write the URLs of the pages to crawl.
11. Recompile with build.xml (ant).
12. Run it.
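Step 10 can also be scripted. The sketch below creates urls/seed.txt under a given project root; the class name SeedFile and the seed URL are placeholders chosen for illustration, so substitute the pages you actually want to crawl.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: writes the seed list that the Injector reads from the urls folder.
final class SeedFile {
    /** Creates <projectRoot>/urls/seed.txt with one URL per line and returns its path. */
    static Path write(Path projectRoot, List<String> urls) throws IOException {
        Path dir = Files.createDirectories(projectRoot.resolve("urls"));
        return Files.write(dir.resolve("seed.txt"), urls, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder seed URL (an assumption, not from the original post).
        Path seed = write(Paths.get("."), Arrays.asList("http://nutch.apache.org/"));
        System.out.println("Wrote " + seed);
    }
}
```

Judging by the log at the top of this post, the run used org.apache.nutch.crawl.Crawl as the main class with program arguments along the lines of: urls -dir crwal -depth 2 -topN 2 -threads 10. Treat that as a reconstruction from the log output rather than the exact launch configuration.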