Back when I was building the Turntable network, I published its non-full-text search code, and interested readers can find it on my blog. This article covers how to implement full-text search. I recently spent a long time designing a new site, the Viewpoint network, where the requirements for full-text search are considerably higher, so I put a lot of time into studying it. You can try it out first: click My search. Without further ado, here is the code:
// Written against the Lucene 4.7 API. `analyzer` is a field of the enclosing class.
public Map<String, Object> articleSearchAlgorithms(SearchCondition condition, IndexSearcher searcher)
        throws ParseException, IOException {
    Map<String, Object> map = new HashMap<String, Object>();
    String[] filedsList = condition.getFiledsList();
    String keyWord = condition.getKeyWord();
    int currentPage = condition.getCurrentPage();
    int pageSize = condition.getPageSize();
    String sortField = condition.getSortField();
    boolean isDesc = condition.isDesc(); // passed to SortField as its "reverse" flag
    String sDate = condition.getSDate();
    String eDate = condition.getEDate();
    String classify = condition.getClassify();

    // Escape special characters in the keyword
    keyWord = escapeExprSpecialWord(keyWord);

    BooleanQuery q1 = new BooleanQuery();           // filter clauses (type, date range)
    BooleanQuery q2 = new BooleanQuery();           // keyword-matching clauses
    BooleanQuery booleanQuery = new BooleanQuery(); // combined Boolean query

    if (classify != null
            && (classify.equals("guanzhi") || classify.equals("opinion") || classify.equals("write"))) {
        String typeId = "1"; // default type
        if (classify.equals("guanzhi")) {
            typeId = "2";
        }
        if (classify.equals("opinion")) {
            typeId = "3";
        }
        Query termQuery = new TermQuery(new Term("typeId", typeId));
        q1.add(termQuery, BooleanClause.Occur.MUST);
    }

    if (sDate != null && eDate != null) { // these two parameters decide whether a range query is run
        Query rangeQuery = new TermRangeQuery("writingTime", new BytesRef(sDate), new BytesRef(eDate), true, true);
        q1.add(rangeQuery, BooleanClause.Occur.MUST);
    }

    Sort sort = new Sort(); // sort by relevance score unless a sort field is given
    sort.setSort(SortField.FIELD_SCORE);
    if (sortField != null) {
        sort.setSort(new SortField(sortField, SortField.Type.STRING, isDesc));
    }

    int start = (currentPage - 1) * pageSize;
    int hm = start + pageSize; // number of top hits needed to cover the current page
    TopFieldCollector res = TopFieldCollector.create(sort, hm, false, false, false, false);

    // Exact-match query
    Term t0 = new Term(filedsList[1], keyWord);
    TermQuery termQuery = new TermQuery(t0); // the two high-precision queries
    q2.add(termQuery, BooleanClause.Occur.SHOULD);

    // Prefix matching
    Term t1 = new Term(filedsList[1], keyWord);
    PrefixQuery prefixQuery = new PrefixQuery(t1);
    q2.add(prefixQuery, BooleanClause.Occur.SHOULD);

    // Phrase and similarity matching, suited to tokenized content
    for (int i = 0; i < filedsList.length; i++) { // multi-field term query
        if (i != 1) {
            PhraseQuery phraseQuery = new PhraseQuery();
            Term ts0 = new Term(filedsList[i], keyWord);
            phraseQuery.add(ts0);
            FuzzyQuery fQuery = new FuzzyQuery(new Term(filedsList[i], keyWord), 2); // similarity query
            q2.add(phraseQuery, BooleanClause.Occur.SHOULD);
            q2.add(fQuery, BooleanClause.Occur.SHOULD); // suffix-similarity matching has been removed
        }
    }

    MultiFieldQueryParser queryParser = new MultiFieldQueryParser(Version.LUCENE_47, filedsList, analyzer);
    queryParser.setDefaultOperator(QueryParser.AND_OPERATOR);
    Query query = queryParser.parse(keyWord);
    q2.add(query, BooleanClause.Occur.SHOULD);

    // These checks are required; without them the results differ
    if (q1 != null && q1.toString().length() > 0) {
        booleanQuery.add(q1, BooleanClause.Occur.MUST);
    }
    if (q2 != null && q2.toString().length() > 0) {
        booleanQuery.add(q2, BooleanClause.Occur.MUST);
    }

    searcher.search(booleanQuery, res);
    long amount = res.getTotalHits();
    TopDocs tds = res.topDocs(start, pageSize);
    map.put("amount", amount);
    map.put("tds", tds);
    map.put("query", booleanQuery);
    return map;
}
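The method above calls escapeExprSpecialWord, which is not shown in this article. As a rough idea of what such a helper might do, here is a minimal sketch that backslash-escapes the characters that have special meaning in Lucene's classic query syntax. The class name KeywordEscaper and its escape method are hypothetical stand-ins, not the author's actual implementation.

```java
// Hypothetical stand-in for the escapeExprSpecialWord helper, which the article does not show.
final class KeywordEscaper {

    // Characters with special meaning in Lucene's classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Returns the keyword with every special character prefixed by a backslash.
    static String escape(String keyword) {
        if (keyword == null || keyword.isEmpty()) {
            return keyword;
        }
        StringBuilder sb = new StringBuilder(keyword.length() + 8);
        for (int i = 0; i < keyword.length(); i++) {
            char c = keyword.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("C++ (lucene)")); // prints C\+\+ \(lucene\)
    }
}
```

Lucene ships a similar static helper, QueryParser.escape, which may be preferable to a hand-written one. Keep in mind that escaping only matters for text handed to the query parser; TermQuery and PrefixQuery take their term literally, backslashes included.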
Note: the search conditions used in the code above (SearchCondition) reflect the specific needs of the Viewpoint network. Adapt them to your own search conditions; no single set of conditions can suit every reader.
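If you are writing your own condition object, the paging fields are probably the part that carries over most directly. Below is a minimal sketch of such a class. The getter names mirror the ones the search method calls, but the class name SearchConditionSketch, the field types, and the clamping of out-of-range values are all assumptions made for illustration.

```java
// Hypothetical sketch of a search-condition object. Getter names mirror the ones the
// article's code calls; field types and the clamping rules are assumptions.
final class SearchConditionSketch {
    private final String keyWord;
    private final int currentPage;  // 1-based page number
    private final int pageSize;
    private final String sortField; // null means "sort by relevance score"
    private final boolean desc;

    SearchConditionSketch(String keyWord, int currentPage, int pageSize, String sortField, boolean desc) {
        this.keyWord = keyWord;
        this.currentPage = Math.max(1, currentPage); // assumption: clamp to the first page
        this.pageSize = Math.max(1, pageSize);       // assumption: at least one hit per page
        this.sortField = sortField;
        this.desc = desc;
    }

    String getKeyWord() { return keyWord; }
    int getCurrentPage() { return currentPage; }
    int getPageSize() { return pageSize; }
    String getSortField() { return sortField; }
    boolean isDesc() { return desc; }

    // Offset of the first hit on the current page: (currentPage - 1) * pageSize.
    int getStart() { return (currentPage - 1) * pageSize; }

    // Number of top hits the collector must keep so that the current page is covered.
    int getNumHits() { return getStart() + pageSize; }

    public static void main(String[] args) {
        SearchConditionSketch c = new SearchConditionSketch("lucene", 3, 10, null, false);
        System.out.println(c.getStart() + " " + c.getNumHits()); // prints "20 30"
    }
}
```

Clamping the page number avoids a negative start offset when a request arrives with page 0 or a negative page; whether to clamp or to reject such requests is a design choice for your own application.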
// `analyzer`, `simpleHTMLFormatter` and `writeDao` are fields of the enclosing class.
public Map<String, Object> searchArticle(SearchCondition condition) throws Exception {
    Map<String, Object> map = new HashMap<String, Object>();
    List<Write> list = new ArrayList<Write>();
    DirectoryReader reader = condition.getReader();
    String url = condition.getUrl();
    boolean isHighlight = condition.isHighlight();
    String keyWord = condition.getKeyWord();
    IndexSearcher searcher = getSearcher(reader, url);
    try {
        Map<String, Object> output = articleSearchAlgorithms(condition, searcher);
        if (output == null) {
            map.put("amount", 0L);
            map.put("source", null);
            return map;
        }
        map.put("amount", output.get("amount"));
        TopDocs tds = (TopDocs) output.get("tds");
        ScoreDoc[] sd = tds.scoreDocs;
        Query query = (Query) output.get("query");
        for (int i = 0; i < sd.length; i++) {
            Document doc = searcher.doc(sd[i].doc);
            String id = doc.get("id");

            /* ---------- start: title and content need to be handled together ---------- */
            String temp = doc.get("title");
            String title = temp; // not highlighted by default
            if (isHighlight) { // highlight the article title
                Highlighter highlighterTitle = new Highlighter(simpleHTMLFormatter, new QueryScorer(query));
                highlighterTitle.setTextFragmenter(new SimpleFragmenter(40)); // fragment length
                TokenStream ts = analyzer.tokenStream("title", new StringReader(temp));
                title = highlighterTitle.getBestFragment(ts, temp);
                if (title == null) {
                    // Works around a bug in the highlighter plugin: fall back to plain replacement
                    title = temp.replace(keyWord, "<span style='color:red'>" + keyWord + "</span>");
                }
            }

            String temp1 = HtmlEnDecode.htmlEncode(doc.get("content")); // escape with our own helper
            String content = temp1;
            if (isHighlight) { // highlight the content
                Highlighter highlighterContent = new Highlighter(simpleHTMLFormatter, new QueryScorer(query));
                highlighterContent.setTextFragmenter(new SimpleFragmenter(Constant.HIGHLIGHT_CONTENT_LENGTH)); // fragment length
                // temp1 = StringEscapeUtils.escapeHtml(temp1); // escaping Chinese characters makes highlighting fail
                TokenStream ts1 = analyzer.tokenStream("content", new StringReader(temp1));
                content = highlighterContent.getBestFragment(ts1, temp1);
                if (content == null) {
                    // Same workaround for the highlighter plugin bug
                    content = temp1.replace(keyWord, "<span style='color:red'>" + keyWord + "</span>");
                    content = subContent(content);              // truncate
                    content = HtmlEnDecode.htmlDecode(content); // HTML decode
                    content = SubStringHTML.sub(content, Constant.HIGHLIGHT_CONTENT_LENGTH);
                }
            }

            /* ---------- frequently changing data ---------- */
            Write write = writeDao.getArticle(Long.parseLong(id));
            if (write != null) {
                write.setTitle(title);
                write.setContent(content);
                Date writingTime = write.getWritingTime();
                String timeGap = DateUtil.dateGap(writingTime); // time elapsed since writing
                write.setTimeGap(timeGap);
                list.add(write);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    map.put("source", list);
    return map;
}
Note: the above is the concrete search code. Different application scenarios have different requirements, so encapsulate your objects, query your database, and so on according to your own needs. Nothing has been held back from the code, and it is fully usable.
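One detail of the search code that is easy to reuse elsewhere is the fallback taken when the highlighter returns null: the keyword is wrapped in a red span by plain string replacement. The sketch below isolates that step. The class name FallbackHighlighter is hypothetical, and unlike the String.replace call in the article it matches case-insensitively and skips empty keywords; both are deliberate variations, not the author's behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper isolating the "plain replacement" fallback used when the
// highlighter returns null. Case-insensitive matching is a variation on the article's code.
final class FallbackHighlighter {
    private static final String OPEN = "<span style='color:red'>";
    private static final String CLOSE = "</span>";

    static String highlight(String text, String keyword) {
        // An empty keyword would otherwise match between every character.
        if (text == null || keyword == null || keyword.isEmpty()) {
            return text;
        }
        // Pattern.quote treats the keyword as a literal, not as a regular expression.
        Pattern p = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        // "$0" re-inserts the matched text, so the original casing is preserved.
        String replacement = Matcher.quoteReplacement(OPEN) + "$0" + Matcher.quoteReplacement(CLOSE);
        return p.matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(highlight("Lucene search", "lucene"));
        // prints <span style='color:red'>Lucene</span> search
    }
}
```

As in the article's code, the text should already be HTML-encoded before this step, otherwise markup in the article body ends up mixed with the inserted span tags.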
If you have any questions, you can join QQ group 284205104. If the group is full, please look for the latest group on the Turntable network. Thank you for reading.
Implementing a full-text search algorithm for a search engine (based on Lucene)