Using Haystack in Python to implement full-text search for Django

Source: Internet
Author: User

Preface

Django is a full-featured web framework for Python. With a few plug-ins, you can easily add search to a website.

The search engine used here is Whoosh, a small and simple full-text search engine implemented in pure Python.

Searching Chinese text requires Chinese word segmentation, which is handled here by jieba.
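To see why segmentation is needed: Chinese text has no spaces between words, so something has to decide where each word starts and ends before the text can be indexed. The sketch below uses greedy forward maximum matching against a tiny made-up dictionary. It is not jieba's algorithm, and `TOY_DICT` and `segment` are names invented for this illustration.

```python
# Toy forward-maximum-matching segmenter (illustration only, not jieba).
TOY_DICT = {"全文", "搜索", "引擎", "搜索引擎", "中文", "分词"}  # hypothetical vocabulary
MAX_WORD_LEN = max(len(w) for w in TOY_DICT)


def segment(text):
    """Split text greedily, preferring the longest dictionary word at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in TOY_DICT:
                words.append(candidate)
                i += size
                break
    return words


print(segment("中文全文搜索引擎"))  # → ['中文', '全文', '搜索引擎']
```

A real segmenter such as jieba ships a large dictionary and resolves ambiguous splits far better than this greedy rule.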

Using Whoosh directly in a Django project means handling a number of low-level details yourself. With the Haystack search framework, you can add search to Django conveniently without dealing with index creation, query parsing, and similar details.
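For a sense of what index creation and query parsing involve, here is a minimal sketch of an inverted index, the core data structure behind full-text search. The function names `build_index` and `search` are hypothetical, and a real engine such as Whoosh adds scoring, analysis, and on-disk storage on top of this idea.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word of the query (AND search)."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    1: "Django web framework",
    2: "Whoosh search engine in pure Python",
    3: "Django search with Haystack",
}
index = build_index(docs)
print(sorted(search(index, "django search")))  # → [3]
```

Note that splitting on whitespace only works for languages that separate words with spaces, which is why Chinese text needs a segmenter first.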

Haystack supports multiple search engines: not only Whoosh but also Solr, Elasticsearch, and others. Because the search code talks to Haystack rather than to the engine, you can switch engines without modifying that code.
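As a rough sketch of what switching engines looks like, only the connection settings change. The engine paths below are the stock backends shipped with django-haystack; the URLs, the index name, the `SEARCH_ENGINE` environment variable, and the `haystack_connections` helper are assumptions made for this example.

```python
import os

# Engine paths as shipped with django-haystack; URLs and index name are placeholders.
ENGINES = {
    "whoosh": {
        "ENGINE": "haystack.backends.whoosh_backend.WhooshEngine",
        # In a real settings.py this would normally be built from BASE_DIR.
        "PATH": os.path.join(os.getcwd(), "whoosh_index"),
    },
    "solr": {
        "ENGINE": "haystack.backends.solr_backend.SolrEngine",
        "URL": "http://127.0.0.1:8983/solr",
    },
    "elasticsearch": {
        "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchEngine",
        "URL": "http://127.0.0.1:9200/",
        "INDEX_NAME": "haystack",
    },
}


def haystack_connections(name):
    """Build the HAYSTACK_CONNECTIONS setting for the chosen engine."""
    return {"default": dict(ENGINES[name])}


# SEARCH_ENGINE is a hypothetical environment variable used only in this sketch.
HAYSTACK_CONNECTIONS = haystack_connections(os.environ.get("SEARCH_ENGINE", "whoosh"))
```

The index classes, templates, and views stay the same whichever entry is selected.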

Configure search

1. Install related packages

pip install django-haystack
pip install whoosh
pip install jieba

2. Configure django settings

Modify the settings.py file and add the haystack application:

INSTALLED_APPS = (
    ...
    'haystack',  # Put haystack at the end
)

Append the haystack configuration in settings:

HAYSTACK_CONNECTIONS = {
    'default': {
        # whoosh_cn_backend is the modified backend created in step 8
        'ENGINE': 'haystack.backends.whoosh_cn_backend.WhooshEngine',
        'PATH': os.path.join(BASE_DIR, 'whoosh_index'),
    }
}

# Add this setting so that the index is updated automatically whenever the database changes
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'

3. Add a url

In the project-level urls.py, configure the URL path of the search function:

from django.conf.urls import include, url

urlpatterns = [
    ...
    url(r'^search/', include('haystack.urls')),
]

4. Add an index under the application directory

In the application's directory, create a file named search_indexes.py.

from haystack import indexes
# Change this import to your own model
from .models import GoodsInfo


# Change the class name to <model class name> + Index;
# for the GoodsInfo model, the class is GoodsInfoIndex
class GoodsInfoIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)

    def get_model(self):
        # Change this to your own model
        return GoodsInfo

    def index_queryset(self, using=None):
        return self.get_model().objects.all()

Note:

1) Adjust the three places marked with comments above to match your own model.

2) This file specifies how existing data is indexed. Returning your Django model from get_model is enough to have it indexed; you do not need to handle database reads, index creation, or other details yourself.

3) text = indexes.CharField(...) declares the document field that is built from the model's fields. use_template=True means that a template file, specified in the next step, lists the specific fields.

5. Specify the index template file

In the project's templates/search/indexes/<application name>/ directory, create a file named <model name>_text.txt.

For example, if the model class is GoodsInfo, create goodsinfo_text.txt (all lowercase). This file specifies which model fields are indexed. Write the following into it, replacing only the field placeholders with your own field names and leaving object unchanged:

{{ object.field1 }}
{{ object.field2 }}
{{ object.field3 }}
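The naming rule for the template file is easy to get wrong, so here is a small sketch that derives the path and content from the app name, model name, and field list. The helper `index_template` and the names "goods", "gtitle", and "gcontent" are made up for illustration; only the path pattern comes from the steps above.

```python
import os


def index_template(app_label, model_name, field_names, templates_dir="templates"):
    """Return (path, content) for a Haystack index template file."""
    path = os.path.join(
        templates_dir, "search", "indexes", app_label,
        "%s_text.txt" % model_name.lower(),  # the file name is all lowercase
    )
    content = "\n".join("{{ object.%s }}" % name for name in field_names) + "\n"
    return path, content


# "goods", "gtitle" and "gcontent" are hypothetical names.
path, content = index_template("goods", "GoodsInfo", ["gtitle", "gcontent"])
print(path)
print(content)
```

The "_text" part of the file name corresponds to the text field declared in search_indexes.py.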

6. Specify the search result page

Create a search.html page under templates/search.

<!DOCTYPE html>

7. Use the jieba Chinese word segmenter

In Haystack's installation folder (here /home/python/.virtualenvs/django_py2/lib/python2.7/site-packages/haystack/backends), create a file named ChineseAnalyzer.py with the following content:

import jieba
from whoosh.analysis import Tokenizer, Token


class ChineseTokenizer(Tokenizer):
    def __call__(self, value, positions=False, chars=False,
                 keeporiginal=False, removestops=True,
                 start_pos=0, start_char=0, mode='', **kwargs):
        t = Token(positions, chars, removestops=removestops, mode=mode,
                  **kwargs)
        # Full mode (cut_all=True) yields every word jieba can find
        seglist = jieba.cut(value, cut_all=True)
        for w in seglist:
            t.original = t.text = w
            t.boost = 1.0
            if positions:
                t.pos = start_pos + value.find(w)
            if chars:
                t.startchar = start_char + value.find(w)
                t.endchar = start_char + value.find(w) + len(w)
            yield t


def ChineseAnalyzer():
    return ChineseTokenizer()
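One thing to be aware of in the tokenizer: value.find(w) always returns the first occurrence of a word, so a word that appears twice is reported at the same position both times. If accurate offsets matter, a running cursor can be used instead. The sketch below assumes the tokens arrive in order and do not overlap, which is not the case for jieba's full mode used above; `token_offsets` is a hypothetical helper, not part of jieba or Whoosh.

```python
def token_offsets(text, tokens):
    """Yield (token, start, end) for tokens that appear in order without overlapping."""
    cursor = 0
    for token in tokens:
        start = text.find(token, cursor)
        if start == -1:
            raise ValueError("token %r not found after offset %d" % (token, cursor))
        end = start + len(token)
        yield token, start, end
        cursor = end  # continue searching after this token


text = "搜索引擎的搜索"
tokens = ["搜索", "引擎", "的", "搜索"]
print(list(token_offsets(text, tokens)))
# → [('搜索', 0, 2), ('引擎', 2, 4), ('的', 4, 5), ('搜索', 5, 7)]
```

With plain text.find("搜索"), both occurrences of the word would be reported at offset 0.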

8. Switch the Whoosh backend to Chinese word segmentation

Copy the whoosh_backend.py file in the backends directory above, name the copy whoosh_cn_backend.py, open it, and make the following changes:

At the top of the file, import the Chinese analyzer added in the previous step:

from .ChineseAnalyzer import ChineseAnalyzer

Then search the whole file for

analyzer=StemmingAnalyzer()

and change every occurrence to

analyzer=ChineseAnalyzer()

There are about two or three occurrences in total.
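The same change can be scripted instead of edited by hand. The following is a minimal sketch that works on the file's text; `patch_backend_source` is a hypothetical helper, and it assumes the import statements in the backend file are each written on a single line.

```python
def patch_backend_source(source):
    """Return backend source that uses ChineseAnalyzer instead of StemmingAnalyzer."""
    import_line = "from .ChineseAnalyzer import ChineseAnalyzer"
    lines = source.splitlines()

    # Insert after the last `from __future__` import if there is one (those must
    # stay first), otherwise before the first import statement.
    future = [i for i, line in enumerate(lines) if line.startswith("from __future__")]
    if future:
        position = future[-1] + 1
    else:
        position = next(
            (i for i, line in enumerate(lines) if line.startswith(("import ", "from "))),
            0,
        )
    lines.insert(position, import_line)

    patched = "\n".join(lines) + "\n"
    # Only calls are replaced; the original StemmingAnalyzer import is left alone.
    return patched.replace("StemmingAnalyzer()", "ChineseAnalyzer()")


sample = (
    "from __future__ import unicode_literals\n"
    "from whoosh.analysis import StemmingAnalyzer\n"
    "field = TEXT(stored=True, analyzer=StemmingAnalyzer())\n"
)
print(patch_backend_source(sample))
```

To apply it, read the text of whoosh_backend.py, pass it through the function, and write the result to whoosh_cn_backend.py in the same directory.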

9. Generate the index

Manually generate an index:

python manage.py rebuild_index

10. Add a search entry point

Add the search box to the webpage:

<form method="get" action="/search/" target="_blank">
    <input type="text" name="q">
    <input type="submit" value="Search">
</form>
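When the form is submitted, the browser requests the search URL with the text in the q parameter, percent-encoded as UTF-8. The sketch below (Python 3) reproduces that URL, which can be handy for building search links by hand or in tests; `search_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(query, base="/search/"):
    """Build the URL the search form requests for the given query text."""
    return base + "?" + urlencode({"q": query})


url = search_url("手机")
print(url)  # → /search/?q=%E6%89%8B%E6%9C%BA

# Decoding the query string recovers the original text.
print(parse_qs(urlsplit(url).query)["q"][0])  # → 手机
```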

Rich Customization

The steps above give only a basic search engine that can be set up quickly. Haystack offers many more customization options for specific requirements.

See the official documentation: http://django-haystack.readthedocs.io/en/master/

Custom Search view

In the configuration above, search requests are routed to haystack.urls. To customize the search view and add more functionality, you can change this.

The content of haystack.urls is actually very simple:

from django.conf.urls import url
from haystack.views import SearchView

urlpatterns = [
    url(r'^$', SearchView(), name='haystack_search'),
]

So we can write a view that inherits from SearchView and route the search URL to this custom view instead.

from haystack.views import SearchView


class MySearchView(SearchView):
    # Override the relevant attributes or methods
    template = 'search_result.html'

Read the source code or documentation of SearchView to learn what each method does.

In the example above, the template attribute is overridden to change which template renders the search results page.

Highlight

In the search results template, you can use the highlight tag, which must be loaded first with {% load highlight %}:

{% highlight <text_block> with <query> [css_class "class_name"] [html_tag "span"] [max_length 200] %}

text_block is the full text and query is the keyword to highlight. The optional parameters that follow set the CSS class name and HTML tag used for highlighted keywords and the maximum length of the highlighted excerpt.
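To make the parameters concrete, here is a simplified sketch of what such a tag does with them. It is not Haystack's implementation, which is more involved; the `highlight` function below is written only for illustration, with defaults chosen to mirror the tag's optional parameters.

```python
import html
import re


def highlight(text_block, query, css_class="highlighted", html_tag="span", max_length=200):
    """Wrap each query word found in text_block in an HTML tag (simplified sketch)."""
    snippet = text_block[:max_length]
    words = [re.escape(w) for w in query.split()]
    if not words:
        return html.escape(snippet)
    # One capturing group makes re.split return matches at the odd positions.
    pattern = re.compile("(%s)" % "|".join(words), re.IGNORECASE)
    parts = []
    for i, part in enumerate(pattern.split(snippet)):
        escaped = html.escape(part)
        if i % 2 == 1:
            escaped = '<%s class="%s">%s</%s>' % (html_tag, css_class, escaped, html_tag)
        parts.append(escaped)
    return "".join(parts)


print(highlight("Django search with Haystack", "search"))
# → Django <span class="highlighted">search</span> with Haystack
```

Styling the chosen CSS class in the page's stylesheet is what makes the matched keywords stand out.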

The source code for highlighting is in haystack/templatetags/highlight.py and haystack/utils/highlighting.py.

Summary

This article described how to use Haystack to add full-text search to a Django project. I hope it is helpful to you.
