About full-text search engines

Last Update:2016-12-03 Source: Internet

Author: User

Tags solr


Most websites we visit offer some kind of search. Some search the database directly, which is simple but limited. Others call a third-party Internet search interface, which is powerful.
Either way, search comes down to matching data in the background.

At present, Solr and Elasticsearch are the most widely used full-text search engines in the Java world. There are rumors online that Redis-search can also do full-text search, but from a quick look on Baidu, Redis-search is not mature yet.
Last month I experimented with Solr and Elasticsearch. Here is a summary of that experience.

First of all, Solr is powerful and quick to get started with. Basically you download it, start it, and it is ready to use.

Elasticsearch is similar, but it does not come with a visual web administration interface.

Solr can index almost anything: pictures, Office documents, PDF, XML, TXT, JSON. Elasticsearch appears to accept only JSON data.
As for performance, both are basically very fast. Elasticsearch is described online as offering near-real-time search; I have not verified this myself.

Solr has two ways to build indexes. One is the DataImport approach. With this approach you only need to edit configuration, mainly the managed-schema file.

The other is to build the index from your own program using the solr-solrj jar.

Here is some code:

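import java.io.File;
import java.io.IOException;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

/**
 * Index PDF, Word, TXT and other documents.
 *
 * @param myFile          the file object
 * @param fileName        the file name
 * @param fileType        the file type (content type); may be empty
 * @param tableName       name of the table the record belongs to
 * @param uniqueKey       unique id of the Solr document
 * @param primaryKeyValue primary key value of the record
 * @throws IOException
 * @throws SolrServerException
 */
public void indexFiles(File myFile, String fileName, String fileType, String tableName,
        String uniqueKey, String primaryKeyValue) throws IOException, SolrServerException {
    // Send the file to the extracting request handler.
    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");

    String contentType = "";
    if (StringUtil.isEmpty(fileType)) {
        // No type given: guess from the extension. lastIndexOf is used so that
        // names with several dots (report.v2.pdf) yield "pdf", not "v2.pdf".
        int dotPosition = fileName.lastIndexOf('.');
        String tmpFileType = fileName.substring(dotPosition + 1);
        contentType = "application/" + tmpFileType;
    } else {
        contentType = fileType;
    }
    up.addFile(myFile, contentType);

    ModifiableSolrParams p = new ModifiableSolrParams();
    p.add("literal.id", uniqueKey);
    p.add("literal.doc_id", primaryKeyValue);
    p.add("literal.doc_table_name", tableName);
    p.add("literal.doc_type", "file"); // the document type is "file"
    p.add("literal.doc_file_name", fileName);
    p.add("fmap.content", "doc_content"); // map extracted text to doc_content
    p.add("extractFormat", "text");
    p.add("captureAttr", "false");

    // StringUtil and SolrClient.getInstance() are kept from the original;
    // they appear to be the author's own helpers. solrClient is a field
    // of the enclosing class.
    solrClient = SolrClient.getInstance();
    up.setParams(p);
    // Commit right away, waiting for flush and for a new searcher.
    up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    solrClient.request(up);
}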
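The method above guesses a content type from the file extension when the caller does not pass one. A minimal, standalone sketch of that step is shown here. The class and method names are hypothetical, and the fallback to application/octet-stream is an assumption, not something the original code does. It takes the text after the last dot, so a name like report.v2.PDF still yields a sensible guess.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of SolrJ. */
final class ContentTypeGuess {

    private ContentTypeGuess() {}

    /**
     * Returns fileType if it is non-empty; otherwise "application/" plus
     * the lower-cased text after the LAST dot of fileName.
     */
    static String guess(String fileName, String fileType) {
        if (fileType != null && !fileType.isEmpty()) {
            return fileType;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            // No usable extension (assumed fallback): generic binary type.
            return "application/octet-stream";
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return "application/" + ext;
    }

    public static void main(String[] args) {
        System.out.println(guess("report.v2.PDF", "")); // application/pdf
        System.out.println(guess("notes", null));       // application/octet-stream
        System.out.println(guess("a.txt", "text/plain")); // text/plain
    }
}
```

Note that "application/" plus the extension is only a rough guess (application/txt, for example, is not a registered type), so passing an explicit type is safer when it is known.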

In addition, when the index is built from a program, you need to define the fields to be indexed. For example:

Define the field type and declare the IK Analyzer tokenizer for it (this adds Chinese support; the corresponding jar and configuration files need to be copied into the Solr deployment directory).

When querying from a program, you need to specify the field to query:

if (StringUtils.isEmpty(searchVo.getKeyword())) {
    // No keyword: match every document.
    query.setQuery("*:*");
} else {
    query.setQuery("my_content:" + searchVo.getKeyword());
}
if (StringUtils.isNoneEmpty(searchVo.getNodeCode())) {
    // Restrict the results to one table with a filter query.
    query.addFilterQuery("my_table_name:" + searchVo.getNodeCode());
}
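The query above concatenates the user's keyword straight into the query string, so a keyword containing characters such as : or * would be read as query syntax. The sketch below shows one way the field query could be built with escaping. The class and method names are hypothetical. In a real project SolrJ's own ClientUtils.escapeQueryChars is the better choice; the hand-written escape here only imitates it so that the example runs without the Solr jars.

```java
/** Hypothetical helper for illustration; not part of SolrJ. */
final class QueryStrings {

    // Characters with special meaning in the standard Lucene/Solr query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    private QueryStrings() {}

    /** Prefixes every special character and every whitespace with a backslash. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Returns the match-all query for an empty keyword, else field:keyword. */
    static String fieldQuery(String field, String keyword) {
        if (keyword == null || keyword.isEmpty()) {
            return "*:*";
        }
        return field + ":" + escape(keyword);
    }

    public static void main(String[] args) {
        System.out.println(fieldQuery("my_content", ""));
        System.out.println(fieldQuery("my_content", "solr"));
        System.out.println(fieldQuery("my_content", "a:b"));
    }
}
```

The returned string can be passed to query.setQuery(...) or query.addFilterQuery(...) in the same way as the concatenated strings above.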

I used to use *xx* wildcards for fuzzy matching, but real full-text search does not need them, because the search engine has already split the large text content into terms (tokenized it). If you are only searching a small amount of string data, index it with the string field type; there is no need to use a text type.

*xx* wildcard queries also work better against string fields. Note, however, that a string field has a length limit (in Lucene, a single indexed term cannot exceed 32,766 bytes).
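To illustrate why a tokenized text field does not need wildcards, here is a toy inverted index. It is a teaching sketch under simplifying assumptions, not how Solr is implemented: the class name is hypothetical, terms are split on non-word characters and lower-cased, and Chinese text would need a real segmenter such as IK Analyzer instead of this split.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index for illustration only. */
final class ToyIndex {

    // term -> sorted ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Tokenizes the text and records docId under every term. */
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Exact term lookup: one map access, no scan over the stored text. */
    Set<Integer> search(String term) {
        Set<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.<Integer>emptySet()
                            : Collections.unmodifiableSet(hits);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(1, "Solr is a full-text search engine");
        index.add(2, "Redis is a key-value store");
        System.out.println(index.search("search")); // [1]
        System.out.println(index.search("is"));     // [1, 2]
        System.out.println(index.search("sear"));   // []
    }
}
```

A whole word such as "search" is found directly, while the fragment "sear" is not a term and finds nothing. That is the case where a wildcard over an untokenized string value is the better fit.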
