Lucene, the open-source search engine development toolkit from Apache, provides not only core search functionality but also a number of add-on modules, such as the spell checker module.
The spell checker implementation is in the lucene-suggest-x.xx.x.jar package, under org.apache.lucene.search.spell. The core of the spell checking feature consists of 3 classes:
SpellChecker, DirectSpellChecker, and WordBreakSpellChecker.
The 3 classes check spelling in different ways, with the following differences:
SpellChecker: the original spell checker. A dedicated spelling index must be built first (from a txt dictionary file, or from a field of an existing index) before spelling can be checked;
For an analysis of the SpellChecker source, see the following website: HTTP://WWW.TUICOOL.COM/ARTICLES/NAIBJM
DirectSpellChecker: an improved spell checker that checks spelling directly against an existing index, with no separate spelling index to build (Solr checks spelling this way by default);
WordBreakSpellChecker: also needs no separate index and works against the existing index. It offers suggestions by breaking a term into several words or by combining adjacent terms into one word.
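To make the difference concrete: WordBreakSpellChecker produces its suggestions by splitting one term into several words or by joining adjacent terms into one. The sketch below illustrates only that idea in plain Java. It is not Lucene's implementation; the class name WordBreakSketch, both method names, and the in-memory set standing in for the index terms are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Conceptual sketch only; this is NOT Lucene's WordBreakSpellChecker code.
// A plain Set of known terms stands in for the terms of an existing index.
final class WordBreakSketch {

    // Returns every two-way split of `word` whose halves are both known terms.
    static List<String> breakWord(String word, Set<String> knownTerms) {
        List<String> suggestions = new ArrayList<>();
        for (int i = 1; i < word.length(); i++) {
            String left = word.substring(0, i);
            String right = word.substring(i);
            if (knownTerms.contains(left) && knownTerms.contains(right)) {
                suggestions.add(left + " " + right);
            }
        }
        return suggestions;
    }

    // Returns the joined form of two adjacent words if it is a known term, else null.
    static String combineWords(String first, String second, Set<String> knownTerms) {
        String joined = first + second;
        return knownTerms.contains(joined) ? joined : null;
    }

    public static void main(String[] args) {
        Set<String> terms = Set.of("spell", "checker", "spellchecker", "search");
        System.out.println(breakWord("spellchecker", terms));
        System.out.println(combineWords("spell", "checker", terms));
    }
}
```

The real class works against an IndexReader and also weighs term frequencies, which this sketch leaves out.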
Using SpellChecker:
There are three ways to build the spelling index:
PlainTextDictionary: builds the index from a txt file
LuceneDictionary: builds the index from one of the fields of an existing index
HighFrequencyDictionary: builds the index from a field of an existing index, but only includes terms that meet a minimum occurrence rate
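The occurrence-rate condition of HighFrequencyDictionary can be pictured as a simple filter over document frequencies. The sketch below is a plain-Java illustration under stated assumptions: the class HighFrequencySketch, the method frequentTerms, and the sample counts are hypothetical, and the exact comparison used inside Lucene (here: document frequency at least thresh times the number of documents) is an assumption rather than a copy of the library code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual sketch only; this is NOT Lucene's HighFrequencyDictionary code.
final class HighFrequencySketch {

    // Keeps the terms whose document frequency is at least thresh * totalDocs.
    // thresh is a fraction in [0, 1]; the result is sorted alphabetically.
    static List<String> frequentTerms(Map<String, Integer> docFreq, int totalDocs, float thresh) {
        int minDocs = (int) (thresh * totalDocs); // assumed rule, see lead-in
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : new TreeMap<>(docFreq).entrySet()) {
            if (entry.getValue() >= minDocs) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical document frequencies for three terms in a 100-document index.
        Map<String, Integer> docFreq = Map.of("beijing", 40, "shanghai", 25, "beijink", 1);
        System.out.println(frequentTerms(docFreq, 100, 0.25f));
    }
}
```

The point of such a threshold is to keep rare terms, which are often misspellings themselves, out of the spelling index.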
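    import java.io.File;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.search.spell.Dictionary;
    import org.apache.lucene.search.spell.LuceneDictionary;
    import org.apache.lucene.search.spell.PlainTextDictionary;
    import org.apache.lucene.search.spell.SpellChecker;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    // New (spelling) index directory
    String spellIndexPath = "D:\\newpath";
    // An existing index directory
    String oriIndexPath = "D:\\oripath";
    // Dictionary file
    String dicFilePath = "D:\\txt\\dic.txt";
    // Placeholder: name of the field in the existing index to take terms from
    String fieldName = "fieldName";

    // Directory that holds the spelling index
    Directory directory = FSDirectory.open(new File(spellIndexPath).toPath());

    SpellChecker spellChecker = new SpellChecker(directory);

    // The following steps initialize the spelling index
    IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(oriIndexPath).toPath()));
    // Use a field of the existing index
    Dictionary dictionary = new LuceneDictionary(reader, fieldName);
    // Or use the txt dictionary file
    // Dictionary dictionary = new PlainTextDictionary(new File(dicFilePath).toPath());
    IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
    spellChecker.indexDictionary(dictionary, config, true);

    String queryWord = "Beijink";
    int numSug = 10;
    // Spell check
    String[] suggestions = spellChecker.suggestSimilar(queryWord, numSug);

    reader.close();
    spellChecker.close();
    directory.close();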
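As background for what suggestSimilar returns: candidates are ranked by a string distance, which as far as I know is Levenshtein-based by default in SpellChecker. The sketch below shows that ranking idea in plain Java, without Lucene. The class EditDistanceSketch, its methods, and the small word list are hypothetical, and the real SpellChecker first narrows candidates through its n-gram index, which this sketch skips.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Conceptual sketch only; this is NOT Lucene's SpellChecker code.
final class EditDistanceSketch {

    // Classic Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns up to numSug dictionary words, closest to `word` first.
    static List<String> suggest(String word, List<String> dictionary, int numSug) {
        List<String> sorted = new ArrayList<>(dictionary);
        sorted.sort(Comparator.comparingInt((String candidate) -> distance(word, candidate)));
        return new ArrayList<>(sorted.subList(0, Math.min(numSug, sorted.size())));
    }

    public static void main(String[] args) {
        List<String> dictionary = List.of("shanghai", "beijing", "nanjing");
        System.out.println(suggest("beijink", dictionary, 2));
    }
}
```

Here "beijing" comes first for the query "beijink" because a single substitution separates the two words.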
Using DirectSpellChecker:
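    import java.io.File;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.spell.DirectSpellChecker;
    import org.apache.lucene.search.spell.SuggestWord;
    import org.apache.lucene.store.FSDirectory;

    DirectSpellChecker checker = new DirectSpellChecker();
    String readerPath = "D:\\path"; // an existing index directory
    IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(readerPath).toPath()));
    Term term = new Term("fieldName", "queryText"); // field to check against, and the word to check
    int numSug = 10;
    SuggestWord[] suggestions = checker.suggestSimilar(term, numSug, reader);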