E.net series (6)

Source: Internet
Author: User
Search, part 2

This article introduces the query types in Lucene and their QueryParser shorthand forms, illustrated with test cases.

After reading this article, you will understand Lucene's basic query types; the accompanying test code shows each one in use.

Specific query types

If you know SQL, you may wonder what Lucene's query syntax tree looks like. This section briefly introduces the query types that Lucene supports directly.

1. TermQuery
TermQuery searches for one specific term. It was already introduced in the example at the beginning of this series and is often used to look up keyword fields.
[Test]
public void Keyword()
{
    IndexSearcher searcher = new IndexSearcher(directory);
    Term t = new Term("ISBN", "1930110995");
    Query query = new TermQuery(t);
    Hits hits = searcher.Search(query);
    Assert.AreEqual(1, hits.Length(), "JUnit in Action");
}
Note that Lucene does not enforce the uniqueness of keyword fields; the user has to ensure it.

TermQuery and QueryParser
When the expression passed to QueryParser's Parse method consists of a single word, it is automatically converted to a TermQuery.

2. RangeQuery
RangeQuery searches a range of terms and is typically used for dates. Here is an example:
namespace dotLucene.inAction.BasicSearch
{
    public class RangeQueryTest : LiaTestCase
    {
        private Term begin, end;

        [SetUp]
        protected override void Init()
        {
            begin = new Term("pubmonth", "200004");
            end = new Term("pubmonth", "200206");
            base.Init();
        }

        [Test]
        public void Inclusive()
        {
            RangeQuery query = new RangeQuery(begin, end, true);
            IndexSearcher searcher = new IndexSearcher(directory);
            Hits hits = searcher.Search(query);
            Assert.AreEqual(1, hits.Length());
        }

        [Test]
        public void Exclusive()
        {
            RangeQuery query = new RangeQuery(begin, end, false);
            IndexSearcher searcher = new IndexSearcher(directory);
            Hits hits = searcher.Search(query);
            Assert.AreEqual(0, hits.Length());
        }
    }
}
The third parameter of RangeQuery indicates whether the start and end terms themselves are included in the range.

RangeQuery and QueryParser
[Test]
public void TestQueryParser()
{
    Query query = QueryParser.Parse("pubmonth:[200004 TO 200206]", "subject", new SimpleAnalyzer());
    Assert.IsTrue(query is RangeQuery);
    IndexSearcher searcher = new IndexSearcher(directory);
    Hits hits = searcher.Search(query);

    query = QueryParser.Parse("{200004 TO 200206}", "pubmonth", new SimpleAnalyzer());
    hits = searcher.Search(query);
    Assert.AreEqual(0, hits.Length(), "JDwA in 200206");
}
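The inclusive and exclusive behavior tested above can be sketched without Lucene. The following is a minimal illustration, assuming that range bounds are compared as plain strings in ordinal order, which is why a fixed-width YYYYMM value such as pubmonth sorts chronologically. The class and method names (RangeSketch, InRange) are hypothetical and not part of the Lucene API.

```csharp
using System;

public static class RangeSketch
{
    // Returns true when value lies between lower and upper.
    // Comparison is ordinal (character by character), so fixed-width
    // YYYYMM strings compare in the same order as the dates they encode.
    public static bool InRange(string value, string lower, string upper, bool inclusive)
    {
        int toLower = string.CompareOrdinal(value, lower);
        int toUpper = string.CompareOrdinal(value, upper);
        return inclusive
            ? toLower >= 0 && toUpper <= 0   // bounds themselves match
            : toLower > 0 && toUpper < 0;    // bounds themselves are excluded
    }
}
```

With the bounds from the test, a value of "200206" is in range when inclusive is true and out of range when it is false, which is consistent with the hit counts of 1 and 0 asserted above.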
In query strings, Lucene uses [] for an inclusive range and {} for an exclusive one.

3. PrefixQuery
PrefixQuery matches terms that begin with a given prefix. It is often used for category (catalog) lookups.
[Test]
public void TestPrefixQuery()
{
    PrefixQuery query = new PrefixQuery(new Term("category", "/Computers"));
    IndexSearcher searcher = new IndexSearcher(directory);
    Hits hits = searcher.Search(query);
    Assert.AreEqual(2, hits.Length());

    query = new PrefixQuery(new Term("category", "/Computers/JUnit"));
    hits = searcher.Search(query);
    Assert.AreEqual(1, hits.Length(), "JUnit in Action");
}
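Prefix matching is case-sensitive against the indexed terms, which matters as soon as a query string is lowercased before it reaches the index. The sketch below is a rough model of that behavior using ordinary string operations; PrefixSketch and CountWithPrefix are hypothetical names, and the category values passed in are assumed sample data rather than the actual index contents.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixSketch
{
    // Counts the terms that start with the given prefix.
    // StringComparison.Ordinal makes the comparison case-sensitive.
    public static int CountWithPrefix(IEnumerable<string> terms, string prefix)
    {
        int count = 0;
        foreach (string term in terms)
        {
            if (term.StartsWith(prefix, StringComparison.Ordinal))
            {
                count++;
            }
        }
        return count;
    }
}
```

For the assumed terms "/Computers/JUnit" and "/Computers/ant", the prefix "/Computers" matches both, "/Computers/JUnit" matches one, and the lowercased "/computers" matches none.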
PrefixQuery and QueryParser
[Test]
public void TestQueryParser()
{
    QueryParser qp = new QueryParser("category", new SimpleAnalyzer());
    qp.SetLowercaseWildcardTerms(false);
    Query query = qp.Parse("/Computers*");
    Console.Out.WriteLine("query = {0}", query.ToString());
    IndexSearcher searcher = new IndexSearcher(directory);
    Hits hits = searcher.Search(query);
    Assert.AreEqual(2, hits.Length());

    query = qp.Parse("/Computers/JUnit*");
    hits = searcher.Search(query);
    Assert.AreEqual(1, hits.Length(), "JUnit in Action");
}
Note that this test uses a QueryParser instance rather than the static methods of the QueryParser class. An instance lets us change some of QueryParser's defaults. In the example above, the category values start with a capital letter, but by default QueryParser lowercases every query term that contains *, turning /Computers* into /computers*, so the terms beginning with /Computers are never found. We therefore turn this default off with qp.SetLowercaseWildcardTerms(false).

4. BooleanQuery
BooleanQuery combines several conditions. The two examples below test an AND combination and an OR combination.
[Test]
public void And()
{
    TermQuery searchingBooks = new TermQuery(new Term("subject", "JUnit"));
    RangeQuery currentBooks = new RangeQuery(
        new Term("pubmonth", "200301"),
        new Term("pubmonth", "200312"),
        true);
    BooleanQuery currentSearchingBooks = new BooleanQuery();
    currentSearchingBooks.Add(searchingBooks, true, false);
    currentSearchingBooks.Add(currentBooks, true, false);
    IndexSearcher searcher = new IndexSearcher(directory);
    Hits hits = searcher.Search(currentSearchingBooks);
    AssertHitsIncludeTitle(hits, "JUnit in Action");
}
[Test]
public void Or()
{
    TermQuery methodologyBooks = new TermQuery(
        new Term("category", "/Computers/JUnit"));
    TermQuery easternPhilosophyBooks = new TermQuery(
        new Term("category", "/Computers/ant"));
    BooleanQuery enlightenmentBooks = new BooleanQuery();
    enlightenmentBooks.Add(methodologyBooks, false, false);
    enlightenmentBooks.Add(easternPhilosophyBooks, false, false);
    IndexSearcher searcher = new IndexSearcher(directory);
    Hits hits = searcher.Search(enlightenmentBooks);
    Console.Out.WriteLine("OR = " + enlightenmentBooks);
    AssertHitsIncludeTitle(hits, "Java Development with Ant");
    AssertHitsIncludeTitle(hits, "JUnit in Action");
}
When is the combination an AND, and when is it an OR? The key lies in the parameters of the BooleanQuery object's Add method. Parameter 1 is the query clause to add. Parameter 2, required, indicates whether the clause must match: true means it must, false means it is optional. Parameter 3, prohibited, indicates whether the clause must not match: true means that documents matching it are excluded, false means that matching is allowed. This gives three useful combinations:
required = true, prohibited = false: the clause must match (AND).
required = false, prohibited = false: the clause is optional (OR).
required = false, prohibited = true: the clause must not match (NOT).

BooleanQuery and QueryParser
[Test]
public void TestQueryParser()
{
    Query query = QueryParser.Parse("pubmonth:[200301 TO 200312] AND JUnit", "subject", new SimpleAnalyzer());
    IndexSearcher searcher = new IndexSearcher(directory);
    Hits hits = searcher.Search(query);
    Assert.AreEqual(1, hits.Length());

    query = QueryParser.Parse("/Computers/JUnit OR /Computers/ant", "category", new WhitespaceAnalyzer());
    hits = searcher.Search(query);
    Assert.AreEqual(2, hits.Length());
}
Note that AND and OR must be written in uppercase. To express "A and not B", write A AND -B, or +A -B. By default, QueryParser treats a space between terms as OR. You can change this default on a QueryParser instance:
[Test]
public void TestQueryParserDefaultAnd()
{
    QueryParser qp = new QueryParser("subject", new SimpleAnalyzer());
    qp.SetOperator(QueryParser.DEFAULT_OPERATOR_AND);
    Query query = qp.Parse("pubmonth:[200301 TO 200312] JUnit");
    IndexSearcher searcher = new IndexSearcher(directory);
    Hits hits = searcher.Search(query);
    Assert.AreEqual(1, hits.Length());
}
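The way the required and prohibited flags of BooleanQuery's Add method combine can be modeled in a few lines. This is a simplified sketch, not Lucene's implementation: a document is represented as a plain string, each clause is a predicate over that string, and the names Clause and BooleanSketch are hypothetical. The sketch rejects a clause that is both required and prohibited, since that combination has no sensible meaning.

```csharp
using System;
using System.Collections.Generic;

public sealed class Clause
{
    public readonly Func<string, bool> Matches;
    public readonly bool Required;
    public readonly bool Prohibited;

    public Clause(Func<string, bool> matches, bool required, bool prohibited)
    {
        if (required && prohibited)
        {
            throw new ArgumentException("A clause cannot be both required and prohibited.");
        }
        Matches = matches;
        Required = required;
        Prohibited = prohibited;
    }
}

public static class BooleanSketch
{
    // A document matches when every required clause matches, no prohibited
    // clause matches, and, if there are no required clauses at all, at least
    // one optional clause matches.
    public static bool IsMatch(string doc, IList<Clause> clauses)
    {
        bool hasRequired = false;
        bool anyOptionalMatched = false;
        foreach (Clause clause in clauses)
        {
            bool hit = clause.Matches(doc);
            if (clause.Prohibited)
            {
                if (hit) return false;      // NOT: an excluded document
            }
            else if (clause.Required)
            {
                hasRequired = true;
                if (!hit) return false;     // AND: every required clause must hit
            }
            else if (hit)
            {
                anyOptionalMatched = true;  // OR: one optional hit is enough
            }
        }
        return hasRequired || anyOptionalMatched;
    }
}
```

Two clauses added with (true, false) behave like the And test above, and two clauses added with (false, false) behave like the Or test.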
5. PhraseQuery
PhraseQuery searches for a phrase. Its central concept is slop, the allowed positional displacement between the terms. Slop affects the score of a result: a slop of 0 means an exact phrase match. The examples below make this easy to see. Users do not need to understand how slop is computed, but a large slop hurts query performance, so keep the value small in practice. With enough slop, PhraseQuery ignores the order of the terms; this raises the hit rate but also costs considerable performance. SpanNearQuery can be used to control the order of the terms and improve performance.
[SetUp]
protected void Init()
{
    // Set up sample document
    RAMDirectory directory = new RAMDirectory();
    IndexWriter writer = new IndexWriter(directory, new WhitespaceAnalyzer(), true);
    Document doc = new Document();
    doc.Add(Field.Text("field", "The quick brown fox jumped over the lazy dog"));
    writer.AddDocument(doc);
    writer.Close();
    searcher = new IndexSearcher(directory);
}

// Returns true if the phrase matches the sample document with the given slop.
private bool Matched(string[] phrase, int slop)
{
    PhraseQuery query = new PhraseQuery();
    query.SetSlop(slop);
    for (int i = 0; i < phrase.Length; i++)
    {
        query.Add(new Term("field", phrase[i]));
    }
    Hits hits = searcher.Search(query);
    return hits.Length() > 0;
}

[Test]
public void SlopComparison()
{
    string[] phrase = new string[] { "quick", "fox" };
    Assert.IsFalse(Matched(phrase, 0), "exact phrase not found");
    Assert.IsTrue(Matched(phrase, 1), "close enough");
}

[Test]
public void Reverse()
{
    string[] phrase = new string[] { "fox", "quick" };
    Assert.IsFalse(Matched(phrase, 2), "exact phrase not found");
    Assert.IsTrue(Matched(phrase, 3), "close enough");
}

[Test]
public void Multiple()
{
    Assert.IsFalse(Matched(new string[] { "quick", "jumped", "lazy" }, 3), "not close enough");
    Assert.IsTrue(Matched(new string[] { "quick", "jumped", "lazy" }, 4), "just enough");
    Assert.IsFalse(Matched(new string[] { "lazy", "jumped", "quick" }, 7), "almost but not quite");
    Assert.IsTrue(Matched(new string[] { "lazy", "jumped", "quick" }, 8), "bingo");
}
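The slop thresholds in these tests can be reproduced with a small model. The sketch below is an assumption-laden simplification, not Lucene's scorer: it assumes each phrase term occurs exactly once in the text, subtracts each term's offset in the phrase from its position in the text, and takes the spread between the largest and smallest result as the slop needed. SlopSketch and RequiredSlop are hypothetical names.

```csharp
using System;

public static class SlopSketch
{
    // Returns the smallest slop at which the phrase matches under this
    // simplified model, or -1 if a phrase term is missing from the text.
    public static int RequiredSlop(string text, string[] phrase)
    {
        if (phrase.Length == 0) return 0;

        string[] tokens = text.Split(' ');
        int min = int.MaxValue;
        int max = int.MinValue;
        for (int i = 0; i < phrase.Length; i++)
        {
            // Position of the first (assumed only) occurrence of the term.
            int position = Array.IndexOf(tokens, phrase[i]);
            if (position < 0) return -1;

            // Shift by the term's offset inside the phrase: for an exact
            // phrase match all shifted values are equal.
            int shifted = position - i;
            min = Math.Min(min, shifted);
            max = Math.Max(max, shifted);
        }
        return max - min;
    }
}
```

For the sample sentence, this model yields 1 for "quick fox", 3 for "fox quick", 4 for "quick jumped lazy", and 8 for "lazy jumped quick", the same thresholds that the tests above assert.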
PhraseQuery and QueryParser
When you use QueryParser for a phrase query, you can set the slop in either of two ways:
[Test]
public void TestQueryParser()
{
    Query q1 = QueryParser.Parse("\"quick fox\"", "field", new SimpleAnalyzer());
    Hits hits1 = searcher.Search(q1);
    Assert.AreEqual(hits1.Length(), 0);

    Query q2 = QueryParser.Parse("\"quick fox\"~1", "field", new SimpleAnalyzer()); // Method 1
    Hits hits2 = searcher.Search(q2);
    Assert.AreEqual(hits2.Length(), 1);

    QueryParser qp = new QueryParser("field", new SimpleAnalyzer());
    qp.SetPhraseSlop(1); // Method 2
    Query q3 = qp.Parse("\"quick fox\"");
    Assert.AreEqual("\"quick fox\"~1", q3.ToString("field"), "sloppy, implicitly");
    Hits hits3 = searcher.Search(q3);
    Assert.AreEqual(hits3.Length(), 1);
}

6. WildcardQuery
WildcardQuery performs wildcard searches. Note that all matching terms (wild, mild, and mildew) receive the same score, while child does not match.
[Test]
public void Wildcard()
{
    IndexSingleFieldDocs(new Field[]
    {
        Field.Text("contents", "wild"),
        Field.Text("contents", "child"),
        Field.Text("contents", "mild"),
        Field.Text("contents", "mildew")
    });
    IndexSearcher searcher = new IndexSearcher(directory);
    Query query = new WildcardQuery(new Term("contents", "?ild*"));
    Hits hits = searcher.Search(query);
    Assert.AreEqual(3, hits.Length(), "child no match");
    Assert.AreEqual(hits.Score(0), hits.Score(1), 0.0, "score the same");
    Assert.AreEqual(hits.Score(1), hits.Score(2), 0.0, "score the same");
}
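The wildcard semantics used here (? stands for exactly one character, * for zero or more) can be illustrated by translating the pattern into a regular expression. This is only a sketch of the matching rule, not of how Lucene enumerates terms, and it does not handle escaped wildcard characters; WildcardSketch and IsMatch are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class WildcardSketch
{
    // Translates a wildcard pattern into an anchored regular expression:
    // '?' becomes '.', '*' becomes '.*', everything else is taken literally.
    public static bool IsMatch(string pattern, string term)
    {
        string regex = "^"
            + Regex.Escape(pattern).Replace("\\?", ".").Replace("\\*", ".*")
            + "\\z";
        return Regex.IsMatch(term, regex);
    }
}
```

Under this rule, ?ild* matches wild, mild, and mildew but not child, because child has two characters before "ild".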

WildcardQuery and QueryParser
For performance reasons, QueryParser does not allow a wildcard at the beginning of a term.
Likewise, for performance reasons, a query term whose only wildcard is a trailing * is converted to a PrefixQuery.
[Test, ExpectedException(typeof(ParseException))]
public void TestQueryParserException()
{
    Query query = QueryParser.Parse("?ild*", "contents", new WhitespaceAnalyzer());
}

[Test]
public void TestQueryParserTailAsterisk()
{
    Query query = QueryParser.Parse("mild*", "contents", new WhitespaceAnalyzer());
    Assert.IsTrue(query is PrefixQuery);
    Assert.IsFalse(query is WildcardQuery);
}

[Test]
public void TestQueryParser()
{
    Query query = QueryParser.Parse("mi?d*", "contents", new WhitespaceAnalyzer());
    Hits hits = searcher.Search(query);
    Assert.AreEqual(2, hits.Length());
}

7. FuzzyQuery
FuzzyQuery matches terms that are similar to the search term. Unlike WildcardQuery, the matching documents receive different scores.
[Test]
public void Fuzzy()
{
    Query query = new FuzzyQuery(new Term("contents", "wuzza"));
    Hits hits = searcher.Search(query);
    Assert.AreEqual(2, hits.Length(), "both close enough");
    Assert.IsTrue(hits.Score(0) != hits.Score(1), "wuzzy closer than fuzzy");
    Assert.AreEqual("wuzzy", hits.Doc(0).Get("contents"), "wuzza bear");
}
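Why "wuzzy" ranks above "fuzzy" for the search term "wuzza" can be seen from the edit distance between the strings. The sketch below computes the classic Levenshtein distance; it is an illustration of the idea of closeness only, and it does not reproduce the exact similarity formula or score that Lucene derives from it. EditDistanceSketch is a hypothetical name.

```csharp
using System;

public static class EditDistanceSketch
{
    // Classic dynamic-programming Levenshtein distance: the minimum number
    // of single-character insertions, deletions, and substitutions needed
    // to turn a into b.
    public static int Levenshtein(string a, string b)
    {
        int[,] d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }
}
```

The distance from "wuzza" to "wuzzy" is 1 and to "fuzzy" is 2, so both are close, but "wuzzy" is closer.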

FuzzyQuery and QueryParser
Note the difference from the slop syntax of PhraseQuery: for a phrase, ~ is followed by a number, whereas for a fuzzy query ~ simply follows the term.
[Test]
public void TestQueryParser()
{
    Query query = QueryParser.Parse("wuzza~", "contents", new SimpleAnalyzer());
    Hits hits = searcher.Search(query);
    Assert.AreEqual(2, hits.Length(), "both close enough");
}
