Use PHP to call the Lucene package for full-text retrieval

Source: Internet
Author: User

--------------------------------------------

http://www.chinaunix.net Author: z33 Published at: 17:43:53

/* Please keep the following information when reposting */
Author: Zhang Jie
URL: http://spaces.msn.com/members/newbdez33/
http://www.phpboom.com/



Due to work requirements, I needed to use PHP to run full-text searches over a large number of websites.
Lucene is currently the most popular library for full-text search.
It is a sub-project of Apache Jakarta and provides a simple, practical API.
With this API, you can run full-text retrieval over any text-based data (including databases).


Because PHP itself supports calling external Java classes, I first wrote a class in Java.
This class implements two methods by calling the Lucene API:

* public String createIndex(String indexDir_path, String dataDir_path)
* public String searchword(String ss, String index_path)

createIndex builds the index.
It takes two parameters, indexDir_path (the directory of the index files) and dataDir_path (the directory of the files to be indexed), and returns the list of indexed files.
searchword searches the index for the keyword passed in as ss; index_path is the directory of the index files. It returns all matching files.
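
To make the contract between PHP and the Java class concrete, here is a minimal sketch of a class with the same two method signatures. It is not the author's TxtFileIndexer and it does not use Lucene: the class name TxtFileIndexerSketch, the files.lst listing file, and the naive substring scan are all assumptions made for illustration, standing in for a real Lucene index. It also assumes Java 8 or later rather than the 1.4.2 SDK used in this article.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for testlucene.TxtFileIndexer: same two method
// signatures, but a plain file listing replaces the real Lucene index.
class TxtFileIndexerSketch {

    // Records every regular file under dataDirPath in a listing file inside
    // indexDirPath and returns the recorded paths, one per line.
    public String createIndex(String indexDirPath, String dataDirPath) throws IOException {
        List<String> files;
        try (Stream<Path> walk = Files.walk(Paths.get(dataDirPath))) {
            files = walk.filter(Files::isRegularFile)
                        .map(Path::toString)
                        .sorted()
                        .collect(Collectors.toList());
        }
        Path indexDir = Files.createDirectories(Paths.get(indexDirPath));
        Files.write(indexDir.resolve("files.lst"), files, StandardCharsets.UTF_8);
        return String.join("\n", files);
    }

    // Returns the recorded files whose text contains the keyword ss,
    // one path per line (an empty string when nothing matches).
    public String searchword(String ss, String indexPath) throws IOException {
        List<String> hits = new ArrayList<>();
        Path listing = Paths.get(indexPath, "files.lst");
        for (String file : Files.readAllLines(listing, StandardCharsets.UTF_8)) {
            String text = new String(Files.readAllBytes(Paths.get(file)), StandardCharsets.UTF_8);
            if (text.contains(ss)) {
                hits.add(file);
            }
        }
        return String.join("\n", hits);
    }
}
```

From PHP's point of view only the signatures matter: both methods take plain strings and return a plain string, which keeps the values passed across the bridge simple.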

The source code is very simple; for details, see txtfileindexer.java: http://newbdez33.googlepages.com/txtfileindexer.java

The PHP program calls these two methods to drive Lucene and so performs the full-text retrieval.
The PHP side works as follows.
First, create an instance of the TxtFileIndexer class we wrote:

$tf = new Java('testlucene.TxtFileIndexer');

Then call its methods the same way you would call methods on a normal PHP object. First, create the index:

$data_path = "F:/test/php_lucene/htdocs/data/manual"; // directory containing the content to index
$index_path = "F:/test/php_lucene/htdocs/data/search"; // directory where the generated index files are stored
$s = $tf->createIndex($index_path, $data_path); // call the Java class method
print $s; // print the returned result

Next, try a search:

$index_path = "F:/test/php_lucene/htdocs/data/search"; // directory where the generated index files are stored
$s = $tf->searchword("here is keyword for search", $index_path);
print $s;

Also pay attention to the Java class path, which can be set from PHP:

java_require("F:/test/php_lucene/htdocs/lib/"); // This is only an example; I put both my class and Lucene in this directory.

Simple, isn't it?

PHP source code: test.php: http://newbdez33.googlepages.com/test.php


Next, the environment configuration.
First, you need a Java SDK. I use version 1.4.2; other versions should work as well.
PHP5 is used here; PHP4 should also work.

The Java extension bundled with PHP5 never worked well for me, and calling Java through it was very inefficient, so I used the PHP/Java Bridge project instead.

1. Download JavaBridge
URL: http://sourceforge.net/projects/php-java-bridge/
The current version is php-java-bridge_3.0.8_j2ee.zip:
http://prdownloads.sourceforge.net/php-java-bridge/php-java-bridge_3.0.8_j2ee.zip?download

After unpacking, copy
JavaBridge\WEB-INF\cgi\java-x86-windows.dll
JavaBridge\WEB-INF\lib\JavaBridge.jar
to the c:\php\ext directory, and
rename java-x86-windows.dll to php_java.dll.


2. Modify php.ini (example)
extension=php_java.dll

[java]
java.class.path = "C:\php\ext\JavaBridge.jar;F:\test\php_lucene\htdocs"
java.java_home = "C:\j2sdk1.4.2_10"
java.library.path = "c:\php\ext;F:\test\php_lucene\htdocs"

3. Restart Apache.

4. Find some files to index.
You can change the paths of the index directory and the data directory in test.php.
Line 37 of TxtFileIndexer.java restricts indexing to HTML files only; modify it if necessary.
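
The original line 37 is not reproduced here, so the following is only a hypothetical sketch of what an extension-based restriction could look like in Java. The class name ExtensionFilter and its matchesName helper are assumptions for illustration, not names taken from TxtFileIndexer.java.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Locale;

// Hypothetical filter: accepts only regular files whose names end with one
// of the given suffixes (compared case-insensitively).
class ExtensionFilter implements FileFilter {
    private final String[] suffixes;

    ExtensionFilter(String... suffixes) {
        this.suffixes = suffixes;
    }

    // Name-only check, usable without touching the file system.
    boolean matchesName(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String suffix : suffixes) {
            if (lower.endsWith(suffix.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // Used by File.listFiles(FileFilter): skips directories and
    // files with other extensions.
    @Override
    public boolean accept(File file) {
        return file.isFile() && matchesName(file.getName());
    }
}
```

With a filter like this, `new File(dataDir).listFiles(new ExtensionFilter(".html"))` would return only the HTML files in a directory, and passing more suffixes, for example `new ExtensionFilter(".html", ".txt")`, would widen what gets indexed.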

Since JavaBridge currently also supports Linux and FreeBSD, you can set up the same
Linux or FreeBSD/apache2/php4/lucene/JavaBridge
environment.



This article may be updated at any time. You can also visit:
[Url = http://newbdez33.googlepages.com/php_?e=use PHP to call the lucenepackage for full-text search
