[Repost] How to configure Heritrix under Eclipse

Other posts already cover configuring Heritrix 1.14.4 in Eclipse, and much of this article draws on them, for example http://extjs2.javaeye.com/blog/699751
This article adds some further explanation of the configuration.

The steps for configuring Heritrix 1.14.4 in Eclipse are as follows:

1. Download heritrix-1.14.4.zip and heritrix-1.14.4-src.zip (the Windows packages) from http://sourceforge.net/projects/archive-crawler/

2. Create a Java project in Eclipse (it can be named Heritrix).

3. Copy the three folders com, org, and st from src/java in the unpacked heritrix-1.14.4-src.zip into the project's src folder.

4. Copy the conf folder from src in the unpacked heritrix-1.14.4-src.zip to the project root directory.

5. Copy the lib folder from the unpacked heritrix-1.14.4-src.zip to the project root directory.

6. Copy the tlds-alpha-by-domain.txt file from src/resources/org/archive/util in the unpacked heritrix-1.14.4-src.zip into the org.archive.util package of the project.

7. Copy the webapps folder from the unpacked heritrix-1.14.4.zip to the project root directory.
If the folder is not named webapps, make the corresponding change in the following method of Heritrix.java.

Java code:

/**
 * @throws IOException
 * @return Returns the directory under which reside the WAR files
 * we're to load into the servlet container.
 */
public static File getWarsdir() throws IOException {
    return getSubDir("webapps");
}
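The copy steps above are easy to get slightly wrong, so it can help to check the resulting layout before going further. The following is a minimal sketch, not part of Heritrix: the class name ProjectLayoutCheck is hypothetical, and it assumes the Eclipse source folder is called src and that the project root is passed as the first argument (or is the current directory).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Heritrix: lists expected project entries that are missing. */
class ProjectLayoutCheck {

    // Entries that steps 3 to 7 place in the project, relative to the project root.
    // Assumption: the Eclipse source folder is named "src".
    static final String[] EXPECTED = {
        "src/com",
        "src/org",
        "src/st",
        "src/org/archive/util/tlds-alpha-by-domain.txt",
        "conf",
        "lib",
        "webapps"
    };

    /** Returns the expected entries that do not exist under projectRoot. */
    static List<String> missingEntries(File projectRoot) {
        List<String> missing = new ArrayList<>();
        for (String entry : EXPECTED) {
            if (!new File(projectRoot, entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        List<String> missing = missingEntries(root);
        if (missing.isEmpty()) {
            System.out.println("All expected entries found under " + root.getAbsolutePath());
        } else {
            System.out.println("Missing entries: " + missing);
        }
    }
}
```

If webapps shows up as missing because the folder was copied under a different name, either rename the folder or change the name in getWarsdir() as described in step 7.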

8. Edit the configuration file: open heritrix.properties under conf and set the following entries.

Properties:

# Set the version
heritrix.version = 1.14.4

# Set the admin user name and password
heritrix.cmdline.admin = admin:admin

# Set the port
heritrix.cmdline.port = 8080
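To see how these three entries are read as key/value pairs, here is a small sketch using the standard java.util.Properties class. It is only an illustration under stated assumptions: the class HeritrixPropertiesSketch and its methods are hypothetical, the sample text is inlined rather than read from conf/heritrix.properties, and nothing here claims to be how Heritrix itself parses the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical sketch: parses the three settings from step 8 with java.util.Properties. */
class HeritrixPropertiesSketch {

    /** Parses properties text; in the project the text would come from conf/heritrix.properties. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // A StringReader does not fail in practice; rethrow unchecked to keep callers simple.
            throw new IllegalStateException(e);
        }
        return props;
    }

    /** Splits the "user:password" value of heritrix.cmdline.admin into its two parts. */
    static String[] adminCredentials(Properties props) {
        String value = props.getProperty("heritrix.cmdline.admin", "");
        int colon = value.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected user:password, got: " + value);
        }
        return new String[] {
            value.substring(0, colon).trim(),
            value.substring(colon + 1).trim()
        };
    }

    public static void main(String[] args) {
        String sample =
              "heritrix.version = 1.14.4\n"
            + "heritrix.cmdline.admin = admin:admin\n"
            + "heritrix.cmdline.port = 8080\n";
        Properties props = parse(sample);
        String[] admin = adminCredentials(props);
        int port = Integer.parseInt(props.getProperty("heritrix.cmdline.port").trim());
        System.out.println("version=" + props.getProperty("heritrix.version")
            + " user=" + admin[0] + " port=" + port);
    }
}
```

A value without a colon in heritrix.cmdline.admin makes adminCredentials throw, which is a cheap way to catch a mistyped entry.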

9. Add the jar files to the project: put every jar under lib on the project's build path.

10. After the Heritrix sources are imported, Eclipse reports an error saying it cannot find the class sun.net.www.protocol.file.FileURLConnection. The sun packages are protected: by default they are meant only for Sun's own software, so Eclipse flags references to them as errors. To downgrade the error to a warning, go to Window -> Preferences -> Java -> Compiler -> Errors/Warnings -> Deprecated and restricted API and change Forbidden reference (access rules) to Warning.

11. Add the configuration folder to the classpath. If Heritrix runs but the configuration page offers no options, this step resolves the issue. In the project, find org.archive.crawler.Heritrix.java, right-click it and choose Run As -> Run Configurations, select the Classpath tab, select User Entries, click Advanced, choose Add Folders, and add the conf folder.
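Step 11 puts the conf folder on the run classpath. One way to confirm that it took effect is to look up a file from conf as a classpath resource, using the same run configuration. The sketch below is an assumption-laden illustration: the class ClasspathResourceCheck is hypothetical, and it uses heritrix.properties as the probe only because step 8 places that file directly under conf.

```java
import java.net.URL;

/** Hypothetical helper: reports whether a named resource is visible on the classpath. */
class ClasspathResourceCheck {

    /** Returns the URL of the classpath resource, or null when it cannot be found. */
    static URL locate(String resourceName) {
        return ClasspathResourceCheck.class.getClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) {
        // Assumption: heritrix.properties sits directly under conf (see step 8).
        String name = args.length > 0 ? args[0] : "heritrix.properties";
        URL url = locate(name);
        if (url != null) {
            System.out.println(name + " found at " + url);
        } else {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

If the file is reported as not found, the conf folder is probably missing from User Entries in the run configuration.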

Click Run to start Heritrix. The console output looks like this:

Console output:

16:17:09.500 EVENT Starting Jetty/4.2.23
16:17:09.843 EVENT Started WebApplicationContext[/,Heritrix Console]
16:17:09.968 EVENT Started SocketListener on 127.0.0.1:8080
16:17:09.968 EVENT Started
Heritrix version: 1.14.4
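The log shows the socket listener on 127.0.0.1:8080, the port set in heritrix.properties. If the console page does not open in a browser, a plain TCP probe can tell whether anything is listening there at all. This is a minimal sketch with a hypothetical class name, ConsolePortProbe; it only tests that the port accepts connections and does not verify that the listener is Heritrix.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether a TCP port accepts connections. */
class ConsolePortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port taken from the log line "Started SocketListener on 127.0.0.1:8080".
        boolean up = isListening("127.0.0.1", 8080, 2000);
        System.out.println(up
            ? "Something is listening on 127.0.0.1:8080"
            : "Nothing is listening on 127.0.0.1:8080");
    }
}
```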

http://www.cnblogs.com/sl-shilong/articles/2829411.html

Problem encountered and fix:

The following statement in Heritrix.java: "import sun.net.www.protocol.file.FileURLConnection;"

produces this error:

"The type FileURLConnection is not accessible due to restriction on required library C:\Program Files\Java\jre6\lib\rt.jar"

How can I resolve this?

Addendum: the Heritrix version is 1.14.4.

Programming Xiao Qiang answered on 2012-03-07 11:31

The error is caused by a JRE access restriction. Right-click the Myheritrix project and select Build Path -> Configure Build Path..., then open the Libraries tab, remove the JRE System Library, and add it again. (Confirmed to work.)

Alternatively, select Window -> Preferences -> Java -> Compiler -> Errors/Warnings, find "Forbidden reference (access rules)" under Deprecated and restricted API, and change the default setting "Error" to "Warning" or "Ignore".
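Both fixes only change how Eclipse treats the reference at compile time; the error message itself points at rt.jar, which suggests the class is present in that JRE. A small runtime check can confirm whether a given class is loadable in the JRE used to run the project. The sketch below uses a hypothetical class name, RestrictedClassCheck, and loads the class by name so that the check itself does not trigger the access rule.

```java
/** Hypothetical helper: checks at runtime whether a class can be loaded by name. */
class RestrictedClassCheck {

    /** Returns true if the named class can be loaded, without initializing it. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, RestrictedClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the Eclipse error message.
        String name = "sun.net.www.protocol.file.FileURLConnection";
        System.out.println(name + (isLoadable(name) ? " is present" : " is absent")
            + " in this JRE");
    }
}
```

If the class is reported as absent, changing the Eclipse setting will not be enough, because Heritrix.java imports the class directly.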

