Implementing full-text search in Python's Flask framework

Getting started with the full-text search engine

Unfortunately, full-text search support in relational databases is not standardized. Each database implements full-text search in its own way, and SQLAlchemy does not provide a good abstraction for it.

We are currently using SQLite as our database, so we could bypass SQLAlchemy and create a full-text index with the facilities SQLite provides. But that is not a good idea, because if one day we switch to another database we would have to rewrite our full-text search for that database.

So our solution is to let our existing database handle the regular data, and to create a dedicated database for full-text search.


There are only a few open-source full-text search engines. As far as I know, the only one that has a Flask extension is Whoosh, a full-text search engine written in Python. The advantage of a pure Python engine is that it runs anywhere a Python interpreter is available. The downside is that its search performance is not as good as that of engines written in C or C++. In my opinion, the ideal solution would be a Flask extension that can connect to most search engines and, as Flask-SQLAlchemy does for databases, leaves us free to choose among them, but nothing like that seems to exist for full-text search right now. Django developers have a great extension that supports most full-text search engines, called django-haystack. Hopefully someday someone will provide a similar extension for Flask.

But for now, we will implement full-text search with Whoosh. We will use the Flask-WhooshAlchemy extension, which integrates a Whoosh database with the Flask-SQLAlchemy module.

If you haven't installed the Flask-WhooshAlchemy extension in your virtual environment yet, install it now.

Windows users install it with the following command:

flask\Scripts\pip install Flask-WhooshAlchemy

All other users install it with the following command:

flask/bin/pip install Flask-WhooshAlchemy

Configuration

Configuring Flask-WhooshAlchemy is very simple. All we need to do is tell the extension the name of the full-text search database (file config.py):

WHOOSH_BASE = os.path.join(basedir, 'search.db')
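
For readers who do not have the rest of config.py at hand, here is a minimal sketch of how the path could be built. It assumes that basedir is the directory containing config.py, which the snippet above does not show; the fallback to the current directory exists only so the sketch also runs in an interactive session.

```python
# config.py (sketch): where the full-text index is stored
import os

# Assumption: basedir is the folder that contains this file.
# In an interactive session there is no __file__, so fall back to the
# current working directory.
basedir = os.path.abspath(os.path.dirname(globals().get('__file__', '.')))

# Location handed to Flask-WhooshAlchemy for its index files.
WHOOSH_BASE = os.path.join(basedir, 'search.db')

print(os.path.basename(WHOOSH_BASE))  # search.db
```

Building the path from basedir rather than hard-coding it keeps the index next to the application regardless of the directory the server is started from.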

Modifying the model

Since Flask-WhooshAlchemy integrates with Flask-SQLAlchemy, we indicate which data is to be indexed in the appropriate model class (file app/models.py):

from app import app, db
# older Flask releases exposed this module as flask.ext.whooshalchemy
import flask_whooshalchemy as whooshalchemy

class Post(db.Model):
    # columns that are included in the full-text index
    __searchable__ = ['body']

    id = db.Column(db.Integer, primary_key = True)
    body = db.Column(db.String(140))  # length assumed
    timestamp = db.Column(db.DateTime)
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))

    def __repr__(self):
        return '<Post %r>' % (self.body)

# create the full-text index for the Post model
whooshalchemy.whoosh_index(app, Post)

The model has a new __searchable__ field, which is a list of all the database fields that should be included in the search index. In our case we only need to index the body field of our posts.

In this module we also have to initialize the full-text index for the model by calling the whoosh_index function.

This change does not affect the structure of our relational database, so the database does not need to be migrated.
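
To make the role of __searchable__ more concrete, here is a toy inverted index written in plain Python. It is only an illustration of the idea, not how Whoosh or Flask-WhooshAlchemy are implemented; the Post class, the nickname field and the build_index helper are hypothetical names made up for this sketch.

```python
import re
from collections import defaultdict

class Post:
    # Same idea as in the model: only the listed attributes are indexed.
    __searchable__ = ['body']

    def __init__(self, id, body, nickname):
        self.id = id
        self.body = body
        self.nickname = nickname  # hypothetical extra field, never indexed

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r'\w+', text.lower())

def build_index(posts):
    """Map every word found in a searchable field to the ids of the posts containing it."""
    index = defaultdict(set)
    for post in posts:
        for field in post.__searchable__:
            for word in tokenize(getattr(post, field)):
                index[word].add(post.id)
    return index

posts = [Post(1, 'my first post', 'john'), Post(2, 'my second post', 'susan')]
index = build_index(posts)

print(sorted(index['post']))    # [1, 2]
print(sorted(index['second']))  # [2]
print('john' in index)          # False, nickname is not searchable
```

Because the index lives outside the relational tables, adding or removing a field from __searchable__ changes what can be found but leaves the table definition alone.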

Unfortunately, the blog posts that were already in the database before the full-text search engine was added are not indexed. To keep the database and the full-text search engine in sync, we will delete all existing blog posts from the database and start over. First we start the Python interpreter. Windows users:

flask\Scripts\python

All other users:

flask/bin/python

Then we delete all the blog posts at the Python prompt:

>>> from app.models import Post
>>> from app import db
>>> for post in Post.query.all():
...     db.session.delete(post)
...
>>> db.session.commit()

Search

Now let's start searching. First, let's add a few blog posts to the database. We have two ways to do this. We can run the application and add posts through the web browser like a regular user, or we can add them directly from the Python prompt.

From the Python prompt we can do it as follows:

>>> from app.models import User, Post
>>> from app import db
>>> import datetime
>>> u = User.query.get(1)
>>> p = Post(body='My first post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> p = Post(body='My second post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> p = Post(body='My third and last post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> db.session.commit()

The Flask-WhooshAlchemy extension is very convenient because it hooks into Flask-SQLAlchemy commits automatically. We do not need to maintain the full-text index ourselves, because the extension does it for us transparently.
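
The idea of updating the index as part of the commit can be sketched with a toy session object. This is a conceptual illustration only and says nothing about the extension's internals; IndexedSession and its attributes are hypothetical names.

```python
class IndexedSession:
    """Toy session: rows and their index entries become visible together at commit."""

    def __init__(self):
        self.rows = {}      # row id -> body text (stands in for the database)
        self.index = {}     # word -> set of row ids (stands in for the full-text index)
        self._pending = []  # rows added but not yet committed

    def add(self, row_id, body):
        self._pending.append((row_id, body))

    def commit(self):
        # Store each pending row and index its words in the same step,
        # so the two can never drift apart.
        for row_id, body in self._pending:
            self.rows[row_id] = body
            for word in body.lower().split():
                self.index.setdefault(word, set()).add(row_id)
        self._pending = []

session = IndexedSession()
session.add(1, 'My first post')
print(session.index)                   # {} because nothing is indexed before the commit
session.commit()
print(sorted(session.index['first']))  # [1]
```

This also shows why the posts that existed before the index was introduced had to be deleted and added again: they were never part of a commit that the index could observe.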


Now that we have some articles in the full-text index, we can search for them:

>>> Post.query.whoosh_search('post').all()
[<Post ...>, <Post ...>, <Post ...>]
>>> Post.query.whoosh_search('second').all()
[<Post ...>]
>>> Post.query.whoosh_search('second OR last').all()
[<Post ...>, <Post ...>]

As the examples above show, queries are not limited to single words. In fact, Whoosh supports a rich and powerful search query language.
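
To illustrate what a query such as 'second OR last' means, here is a small plain-Python sketch that evaluates a tiny subset of such a language: words separated by spaces must all match, and groups separated by the keyword OR are alternatives. It is an assumption-laden toy and not Whoosh's parser; in particular, a real search engine ranks hits by relevance, while this sketch simply returns ids in ascending order.

```python
import re

# Hypothetical data set mirroring the three posts added above.
posts = {1: 'My first post', 2: 'My second post', 3: 'My third and last post'}

def words(text):
    """Return the set of lowercase words in a text."""
    return set(re.findall(r'\w+', text.lower()))

def search(query):
    """Return ids of posts matching a query of the form 'a b OR c d'."""
    # Each alternative is the set of words that must all be present.
    alternatives = [words(part) for part in re.split(r'\s+OR\s+', query.strip())]
    hits = []
    for post_id, body in sorted(posts.items()):
        body_words = words(body)
        if any(alt and alt <= body_words for alt in alternatives):
            hits.append(post_id)
    return hits

print(search('post'))            # [1, 2, 3]
print(search('second'))          # [2]
print(search('second OR last'))  # [2, 3]
print(search('third post'))      # [3]
```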

Integrating full-text search into the application

To make the search function available to our application's users, we need to make a few more small changes.

Configuration

As far as configuration is concerned, we just need to specify the maximum number of search results to return (file config.py):

MAX_SEARCH_RESULTS = 50

Search Form

We need to add a search box to the navigation bar at the top of the page. Putting the search box there is convenient because the navigation bar is shared by all pages, so the search box will be available everywhere.

First we add a search form class (file app/forms.py):

class SearchForm(Form):
    search = TextField('search', validators = [Required()])

Then we need to create a search form object and make it available to all templates, because we are going to put the search form in the navigation bar that is common to all pages. The simplest way to do this is to create the form in the before_request handler and store it in Flask's global object g (file app/views.py):

@app.before_request
def before_request():
    g.user = current_user
    if g.user.is_authenticated():
        g.user.last_seen = datetime.utcnow()
        db.session.add(g.user)
        db.session.commit()
        # the search form is only needed for logged-in users
        g.search_form = SearchForm()

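Then we add the form to our template (file app/templates/base.html):

Microblog:
<a href="{{ url_for('index') }}">Home</a>
{% if g.user.is_authenticated() %}
| <a href="...">Your Profile</a>
| <form action="{{ url_for('search') }}" method="post" name="search">{{ g.search_form.hidden_tag() }}{{ g.search_form.search }}<input type="submit" value="Search"></form>
| <a href="...">Logout</a>
{% endif %}

Note that we only show the search box when a user is logged in. Likewise, the before_request handler creates the form only when a user is logged in, because our application does not show any content to unauthenticated users.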

Search view function

Above we set the action field of the form so that it sends all search requests to the search view function. This is where we will handle the submitted search query (file app/views.py):

@app.route('/search', methods = ['POST'])
@login_required
def search():
    if not g.search_form.validate_on_submit():
        return redirect(url_for('index'))
    return redirect(url_for('search_results', query = g.search_form.search.data))

This function does not do much. It just collects the search query from the form and then redirects to another page, passing the query as an argument. The reason the search is not performed directly here is that if the user then hits the refresh button, the browser will pop up a warning that the form data is about to be resubmitted. Responding to a POST request with a redirect avoids this warning, because after the redirect the browser's refresh button reloads the redirected page instead.
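
The redirect-after-POST flow described above can be sketched without Flask. The two handler functions and the '/index' path below are hypothetical stand-ins for the view functions; the only point is that the POST handler answers with a redirect whose URL carries the query, so the results page is reached with an ordinary GET request that is safe to refresh.

```python
from urllib.parse import quote, unquote

def handle_search_post(form_data):
    """Toy POST handler: never renders results, only answers with a redirect."""
    query = form_data.get('search', '').strip()
    if not query:
        # Invalid form: send the user back to the home page.
        return 302, '/index'
    # Valid form: put the query into the URL of the results page.
    return 302, '/search_results/' + quote(query, safe='')

def handle_results_get(path):
    """Toy GET handler: recovers the query from the URL."""
    return unquote(path[len('/search_results/'):])

status, location = handle_search_post({'search': 'second OR last'})
print(status, location)              # 302 /search_results/second%20OR%20last
print(handle_results_get(location))  # second OR last
```

A side benefit of carrying the query in the URL is that a results page can be bookmarked or shared like any other link.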


Search results page

Once a query string has been received, the form POST handler passes it to the search_results handler through a page redirect (file app/views.py):

from config import MAX_SEARCH_RESULTS

@app.route('/search_results/<query>')
@login_required
def search_results(query):
    results = Post.query.whoosh_search(query, MAX_SEARCH_RESULTS).all()
    return render_template('search_results.html',
        query = query,
        results = results)

The search results view function sends the query to Whoosh, passing the maximum number of search results as an argument. We do not want to render a page with a potentially large number of hits, so we only show the first 50.
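
The effect of the limit can be shown in isolation with a short sketch. The first_results helper and the generated hit list are hypothetical; the sketch only demonstrates that at most MAX_SEARCH_RESULTS items are kept and that the remaining hits are never materialized.

```python
from itertools import islice

MAX_SEARCH_RESULTS = 50  # same value as in config.py

def first_results(hits, limit=MAX_SEARCH_RESULTS):
    """Return at most `limit` items from an iterable of hits."""
    return list(islice(hits, limit))

# Hypothetical query that matches 1000 posts, produced lazily.
all_hits = ('post %d' % n for n in range(1, 1001))

page = first_results(all_hits)
print(len(page))          # 50
print(page[0], page[-1])  # post 1 post 50
```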


The final piece is the search results template (file app/templates/search_results.html):

{% extends "base.html" %}

{% block content %}
Search results for "{{query}}":
{% for post in results %}
    {% include 'post.html' %}
{% endfor %}
{% endblock %}

Here we can reuse our post.html sub-template, so we do not have to worry about how individual posts or other page elements are formatted, because all of that is handled in the sub-template.

Postscript

We have now completed a very important and often overlooked feature, one that any good web application must have.

The source code for the updated microblog application can be downloaded here:

Microblog-0.10.zip
