Using PHP to Call the Lucene Package for Full-Text Search

For work, I needed to use PHP to provide full-text search across a large number of web sites.
The most popular full-text search engine library today is Lucene.
It is an Apache project (originally a subproject of Apache Jakarta) and provides a simple, practical API.
With this API you can run full-text retrieval over any underlying text data, including data stored in a database.


Because PHP can call external Java classes through its Java extension, I first wrote a class in Java.
This class implements two methods by calling the Lucene API:

public String createIndex(String indexDir_path, String dataDir_path)
public String searchword(String ss, String index_path)

createIndex builds the index. Its two parameters are indexDir_path (the directory where the index files are stored) and dataDir_path (the directory of files to be indexed). It returns the list of indexed files as a string.
searchword queries the index. It takes the search keyword (ss) and index_path, the directory of the index files, and returns all matching files.

The source code is very simple; for reference, see TxtFileIndexer.java.

The PHP program calls these two methods, and through them Lucene, to perform the full-text search.
The PHP side works as follows.
First, create an instance of the TxtFileIndexer class written above:

$tf = new Java("TestLucene.TxtFileIndexer");

After that, its methods are called like those of any ordinary PHP object. Create the index first:

$data_path = "F:/test/php_lucene/htdocs/data/manual"; // Directory containing the files to index
$index_path = "F:/test/php_lucene/htdocs/data/search"; // Directory where the generated index files are stored
$s = $tf->createIndex($index_path, $data_path); // Call the method of the Java class
print $s; // Print the returned result

Then try a search:

$index_path = "F:/test/php_lucene/htdocs/data/search"; // Directory where the generated index files are stored
$s = $tf->searchword("here are keyword for search", $index_path);
print $s;

Also pay attention to the path of the Java classes, which you can set from PHP:

java_require("f:/test/php_lucene/htdocs/lib/"); // An example: my classes and Lucene are placed in this directory

That is all there is to it.

PHP source code: test.php


Next, the environment.
The Java SDK is required. I use version 1.4.2; other versions should also work.
I use PHP5. I have not fully tested PHP4, but it should work as well.

The Java extension that ships with PHP5 is not well tuned, and calling Java through it has been inefficient and very slow in the past, so I use the PHP/Java Bridge project instead.

1. Download JavaBridge
URL: http://sourceforge.net/projects/php-java-bridge/
The current version is
php-java-bridge_3.0.8_j2ee.zip

After unpacking, copy
JavaBridge\WEB-INF\cgi\java-x86-windows.dll
JavaBridge\WEB-INF\lib\JavaBridge.jar
to the C:\php\ext directory, and rename
java-x86-windows.dll to php_java.dll


2. Modify php.ini (example)
extension=php_java.dll

[java]
java.class.path = "C:\php\ext\JavaBridge.jar;F:\test\php_lucene\htdocs"
java.java_home = "C:\j2sdk1.4.2_10"
java.library.path = "C:\php\ext;F:\test\php_lucene\htdocs"

3. Restart Apache.

4. Find some files to index
In test.php, you can change the paths of the index directory and the data directory.
Line 37 of TxtFileIndexer.java restricts indexing to files with the .html suffix; change it if necessary.
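The original TxtFileIndexer.java is not reproduced here, so the following is only a rough sketch of how a suffix restriction like the one on line 37 could be made configurable. The class name SuffixFilterSketch and both helper methods are hypothetical, not part of the author's code or of Lucene, and the sketch assumes a current JDK (it uses generics, which JDK 1.4.2 does not support).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not the author's TxtFileIndexer and not a Lucene API.
public class SuffixFilterSketch {

    // True when the file name ends with one of the allowed suffixes (case-insensitive).
    static boolean hasAllowedSuffix(String fileName, List<String> suffixes) {
        String lower = fileName.toLowerCase();
        for (String suffix : suffixes) {
            if (lower.endsWith(suffix.toLowerCase())) {
                return true;
            }
        }
        return false;
    }

    // Names of the regular files directly inside dataDir that pass the filter, sorted by path.
    static List<String> listIndexableFiles(File dataDir, List<String> suffixes) {
        List<String> result = new ArrayList<>();
        File[] entries = dataDir.listFiles();
        if (entries == null) {
            return result; // dataDir is not a directory or cannot be read
        }
        Arrays.sort(entries);
        for (File entry : entries) {
            if (entry.isFile() && hasAllowedSuffix(entry.getName(), suffixes)) {
                result.add(entry.getName());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The data directory is taken from the first argument; "." is only a placeholder default.
        File dataDir = new File(args.length > 0 ? args[0] : ".");
        List<String> suffixes = Arrays.asList(".html", ".htm", ".txt");
        for (String name : listIndexableFiles(dataDir, suffixes)) {
            System.out.println(name);
        }
    }
}
```

Running it with the data directory as the only argument prints the files that would pass the filter, which is a quick way to check a suffix list before rebuilding the index.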

Since JavaBridge currently supports Linux and FreeBSD, you can build a complete
Linux or FreeBSD / Apache2 / PHP4 / Lucene / JavaBridge
runtime environment.
