Most websites we browse provide a search feature. Some search the database directly, which is simple but limited. Others call a third-party search interface or a dedicated search engine, which is far more powerful.
In either case, search comes down to matching data in the background.
At present, the most widely used full-text search engines in the Java world are Solr and Elasticsearch. There are claims online that Redis-search can also do full-text search. I looked it up briefly on Baidu, and Redis-search does not appear to be mature yet.
Last month I experimented with Solr and Elasticsearch. Here is a summary of that experience.
First of all, Solr is powerful and quick to get started with. You basically download it, start it, and it is ready to use.
Elasticsearch is similar, but it does not ship with a built-in visual web administration interface.
Solr can index almost anything: pictures, Office documents, PDF, XML, TXT, JSON. Elasticsearch appears to accept only JSON data.
As for performance, both are very fast for basic use. Elasticsearch is described online as offering near-real-time search; I have not verified this yet.
Solr has two ways to build indexes. One is the DataImport approach, where you only need to configure the managed-schema.xml file.
The other is to build the index programmatically with the solr-solrj jar.
Here is some code:
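import java.io.File;
import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

/**
 * Indexes PDF, Word, TXT and other documents.
 *
 * @param myFile          the file object
 * @param fileName        the file name
 * @param fileType        the content type; derived from the file name if empty
 * @param tableName       stored in the doc_table_name field
 * @param uniqueKey       used as the Solr document id
 * @param primaryKeyValue stored in the doc_id field
 * @throws IOException
 * @throws SolrServerException
 */
public void indexFiles(File myFile, String fileName, String fileType, String tableName,
        String uniqueKey, String primaryKeyValue) throws IOException, SolrServerException {
    // Send the file to the extracting request handler
    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
    String contentType = "";
    if (StringUtil.isEmpty(fileType)) {
        // No type given: build one from the extension (text after the last dot,
        // so that names such as "report.v2.pdf" yield "pdf")
        int dotPosition = fileName.lastIndexOf('.');
        String tmpFileType = fileName.substring(dotPosition + 1);
        contentType = "application/" + tmpFileType;
    } else {
        contentType = fileType;
    }
    up.addFile(myFile, contentType);
    ModifiableSolrParams p = new ModifiableSolrParams();
    p.add("literal.id", uniqueKey);
    p.add("literal.doc_id", primaryKeyValue);
    p.add("literal.doc_table_name", tableName);
    p.add("literal.doc_type", "file"); // document type is file
    p.add("literal.doc_file_name", fileName); // name of the indexed file
    p.add("fmap.content", "doc_content"); // map the extracted text to doc_content
    p.add("extractFormat", "text");
    p.add("captureAttr", "false");
    // solrClient is a field of the enclosing class
    solrClient = SolrClient.getInstance();
    up.setParams(p);
    // Commit immediately, waiting for flush and for a new searcher
    up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    solrClient.request(up);
}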
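The method above builds the content type as "application/" plus the file extension, which does not always produce a registered MIME type (a .txt file, for instance, is normally text/plain). As a rough sketch only, a small lookup table could be used instead. The class name ContentTypes, the method guess, and the choice of extensions are assumptions made for illustration and are not part of the original program.

```java
import java.util.Locale;
import java.util.Map;

final class ContentTypes {
    // Hand-picked table for illustration; extend it with the types you actually index.
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "pdf", "application/pdf",
        "doc", "application/msword",
        "docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        "txt", "text/plain",
        "xml", "application/xml",
        "json", "application/json");

    private static final String FALLBACK = "application/octet-stream";

    private ContentTypes() {}

    /** Returns a MIME type for the file name, or a generic binary type if unknown. */
    static String guess(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return FALLBACK; // no extension at all, or the name ends with a dot
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, FALLBACK);
    }
}
```

With such a helper, the branch that handles an empty fileType could call ContentTypes.guess(fileName) rather than concatenating strings.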
In addition, when the index is built programmatically, the fields to be indexed must be defined in the schema. For example:
<field name="my_content" type="text_ik" indexed="true" stored="true"/>
Then define the field type and declare the IKAnalyzer tokenizer. It provides Chinese word segmentation, and you need to copy the corresponding jar and configuration files into the Solr deployment directory.
<fieldType name="text_ik" class="solr.TextField">
<analyzer type="index" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
<analyzer type="query" class="org.wltea.analyzer.lucene.IKAnalyzer"/>
</fieldType>
When querying from the program, you need to specify the field to search:
if (StringUtils.isEmpty(searchVo.getKeyword())) {
    // No keyword: match all documents
    query.setQuery("*:*");
} else {
    query.setQuery("my_content:" + searchVo.getKeyword());
}
if (StringUtils.isNoneEmpty(searchVo.getNodeCode())) {
    // Restrict the results to one table with a filter query
    query.addFilterQuery("my_table_name:" + searchVo.getNodeCode());
}
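One thing to watch when concatenating a user keyword into "my_content:" is that characters such as ":" or "+" have a meaning in the query syntax. SolrJ provides ClientUtils.escapeQueryChars for this purpose. The sketch below is a simplified, dependency-free stand-in written only to show the idea; the class QueryText and its methods are hypothetical names, and the list of special characters is my understanding of the query syntax rather than something taken from the original program.

```java
final class QueryText {
    // Characters treated as special by the query parser (assumed list).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    private QueryText() {}

    /** Prefixes every special character and every whitespace character with a backslash. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds "field:keyword", or the match-all query when the keyword is empty. */
    static String fieldQuery(String field, String keyword) {
        if (keyword == null || keyword.isEmpty()) {
            return "*:*";
        }
        return field + ":" + escape(keyword);
    }
}
```

In the query code, query.setQuery(QueryText.fieldQuery("my_content", searchVo.getKeyword())) would then replace the if/else pair.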
I used to run fuzzy queries of the form *xx*, but real full-text search does not need them, because the search engine has already split the large text content into terms. If you are only searching a small amount of string data, index it as the string type; there is no need to use text.
Fuzzy queries with *xx* also work better against string fields. Note, however, that a string field has a length limit.
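To make the difference between a tokenized text field and a *xx* scan concrete, here is a toy inverted index. It is only a sketch of the general idea, not how Solr is implemented: the class ToyIndex and its methods are invented for illustration, and the tokenizer simply splits on non-alphanumeric characters, which would not work for Chinese text (that is what an analyzer such as IKAnalyzer is for).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ToyIndex {
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> docs = new ArrayList<>();

    /** Tokenizes the text once at index time and returns the new document id. */
    int add(String text) {
        int id = docs.size();
        docs.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    /** Whole-term lookup: a single map access, no scanning of document text. */
    Set<Integer> searchTerm(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    /** Substring match in the spirit of *xx*: every document has to be scanned. */
    Set<Integer> searchSubstring(String needle) {
        Set<Integer> hits = new TreeSet<>();
        String n = needle.toLowerCase(Locale.ROOT);
        for (int i = 0; i < docs.size(); i++) {
            if (docs.get(i).toLowerCase(Locale.ROOT).contains(n)) {
                hits.add(i);
            }
        }
        return hits;
    }

    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }
}
```

A term query finds a document only when a whole indexed term matches, so searching for part of a word returns nothing unless a substring scan is used. This is why short identifiers that need partial matching fit a string field better, while long text is better served by tokenized search.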