Xapian Study Notes 4: Faceted Search

Source: Internet
Author: User

------------------------
1. What is faceted search?

Faceted search dynamically aggregates the documents matched by a user's query according to specific attributes. It is useful in many settings, especially on e-commerce sites: the user enters a query, and the server returns classification information for the matched documents along with the results. For example, if the user searches for "computer", the server returns all documents that match the keyword "computer" and groups them by the categories they belong to, such as tablets, laptops, and desktops. This grouping is multidimensional: the matched documents may also belong to different merchants, and they are grouped by merchant as well. In short, the goal of faceted search is to give users a basis for filtering, so that they can quickly find what they want.

It has the following advantages:

  • High information integration: users see an integrated view of the query results. The information is not flat but multidimensional.
  • Predictable results: before you click a category, you already know how many results it contains.
  • No limit on selection levels: you can add or remove different aggregation category restrictions.
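
The idea of counting matched documents per facet value can be sketched without Xapian at all. The following is a minimal illustration only: the Product struct, its fields, and count_facet() are hypothetical names invented for this example, and the "hits" are a plain in-memory list rather than a real search result.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory record for a matched product (illustrative only).
struct Product {
    std::string name;
    std::string category;
    std::string merchant;
};

// Facet counts: facet value -> number of matched documents with that value.
using FacetCounts = std::map<std::string, int>;

// Count the matched products per value of one attribute.
// 'field' selects the attribute, e.g. &Product::category.
FacetCounts count_facet(const std::vector<Product>& hits,
                        std::string Product::*field) {
    FacetCounts counts;
    for (const Product& p : hits) {
        ++counts[p.*field];  // operator[] starts missing keys at 0
    }
    return counts;
}
```

Calling count_facet(hits, &Product::category) and count_facet(hits, &Product::merchant) on the same hit list gives one set of counts per dimension, which is the multidimensional grouping described above.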

2. How to create a faceted search in Xapian

Each document in Xapian can carry a set of values. Store each field you want to aggregate on in one of these values, giving each field a unique slot number, with the Xapian::Document::add_value() method. For example, for a library database you might store "price" in slot 0, "author" in slot 1, "publisher" in slot 2, and "publication type" in slot 3. At query time you can then aggregate the matched documents by these values, for example grouping them into the price ranges 100-200, 200-400, 400-500, and so on. Note that numeric fields must be encoded with Xapian::sortable_serialise() so that they sort correctly.
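
Why numeric fields need a special encoding can be seen with plain strings, since values are compared byte by byte. The sketch below is only an illustration of the problem: encode_naive() and encode_padded() are hypothetical helpers, and the zero-padded form is a simplified stand-in, not the encoding that Xapian actually produces.

```cpp
#include <cstdio>
#include <string>

// Naive encoding: the decimal text of the number.
// Byte-wise string comparison does not follow numeric order here.
std::string encode_naive(unsigned price) {
    return std::to_string(price);
}

// Simplified stand-in for an order-preserving encoding:
// fixed width, zero padded. This is NOT Xapian's real format.
std::string encode_padded(unsigned price) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%010u", price);
    return std::string(buf);
}
```

With the naive form, "100" compares less than "20" because '1' sorts before '2', so a price of 100 would appear before a price of 20. The padded form restores numeric order for non-negative integers; in a real index, Xapian's own encoding function plays this role and also handles floating-point values.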

3. How to query a faceted search in Xapian

For example, to run a faceted query on price and author, use the Xapian::Enquire::add_matchspy() method to attach one Xapian::ValueCountMatchSpy object per slot. A ValueCountMatchSpy counts how often each value occurs in the matched documents. The general code is as follows:

   Xapian::ValueCountMatchSpy spy0(0);   // slot 0: price
   Xapian::ValueCountMatchSpy spy1(1);   // slot 1: author
   Xapian::Enquire enq(db);
   enq.add_matchspy(&spy0);
   enq.add_matchspy(&spy1);
   enq.set_query(query);
   Xapian::MSet mset = enq.get_mset(0, 10, 10000);

The third argument, 10000, tells Xapian to check at least 10000 matching documents when counting facets, even though only the first 10 results are returned. The spy objects store the facet information, which can be read as follows:

   Xapian::TermIterator i;
   for (i = spy0.values_begin(); i != spy0.values_end(); ++i) {
      cout << *i << ": " << i.get_termfreq() << endl;
   }
   for (i = spy1.values_begin(); i != spy1.values_end(); ++i) {
      cout << *i << ": " << i.get_termfreq() << endl;
   }

*i is the facet value, that is, the value stored with the add_value() method above. If it is a numeric value, it must be decoded with Xapian::sortable_unserialise(); otherwise the output may be garbled. get_termfreq() returns the number of matched documents that have this value, for example that a given price matched 30 products.
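
The spy reports one count per distinct value, so grouping prices into ranges such as 100-200, 200-400, and 400-500 is a separate step. The following is a minimal sketch of that step under stated assumptions: PriceRange and count_by_range() are hypothetical names, the ranges are treated as half-open intervals, and the input map stands for prices that have already been decoded and paired with their counts.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Half-open price range [low, high); the struct is illustrative only.
struct PriceRange {
    double low;
    double high;
};

// value_counts: decoded price -> number of matched documents at that price.
// Returns one total per range, in the same order as 'ranges'.
// Prices that fall outside every range are ignored.
std::vector<int> count_by_range(const std::map<double, int>& value_counts,
                                const std::vector<PriceRange>& ranges) {
    std::vector<int> totals(ranges.size(), 0);
    for (const auto& entry : value_counts) {
        for (std::size_t r = 0; r < ranges.size(); ++r) {
            if (entry.first >= ranges[r].low && entry.first < ranges[r].high) {
                totals[r] += entry.second;
                break;  // each price belongs to at most one range
            }
        }
    }
    return totals;
}
```

In a Xapian program, the map would be filled inside the iteration loop shown earlier, using the decoded value as the key and the reported frequency as the count.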

4. Reference

  • http://xapian.org/docs/facets.html
  • http://en.wikipedia.org/wiki/Faceted_search
