Implementing full-text search in the Python Flask framework

Source: Internet
Author: User
Tags form post sqlite virtual environment

Getting started with full-text search engines

Unfortunately, support for full-text search in relational databases is not standardized. Each database implements full-text search in its own way, and SQLAlchemy does not offer a good abstraction over it.

We are currently using SQLite as our database, so we could bypass SQLAlchemy and build a full-text index with the facilities that SQLite provides. That is not a good idea, though, because if we switch to another database one day we would have to rewrite our full-text search code for that database.
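To illustrate what the database-specific route looks like, here is a minimal sketch that uses the full-text search modules shipped with many SQLite builds (FTS5, with FTS4 as a fallback). The table name post_index and the helper functions are made up for this example, and the sketch assumes that the SQLite library behind your Python interpreter was compiled with one of these modules. Nothing in it carries over to another database, which is exactly the drawback described above.

```python
import sqlite3


def create_index(conn):
    """Create a full-text virtual table, trying FTS5 first and then FTS4."""
    for module in ("fts5", "fts4"):
        try:
            conn.execute("CREATE VIRTUAL TABLE post_index USING %s(body)" % module)
            return module
        except sqlite3.OperationalError:
            continue  # this module is not compiled into the SQLite library
    raise RuntimeError("this SQLite build has no full-text search module")


def fts_search(conn, query):
    """Return the bodies of all rows that match the full-text query."""
    rows = conn.execute(
        "SELECT body FROM post_index WHERE post_index MATCH ? ORDER BY rowid",
        (query,),
    )
    return [body for (body,) in rows]


conn = sqlite3.connect(":memory:")
fts_module = create_index(conn)
conn.executemany(
    "INSERT INTO post_index (body) VALUES (?)",
    [("my first post",), ("my second post",), ("my third and last post",)],
)

print(fts_module)                          # which module was available
print(fts_search(conn, "second"))          # rows containing "second"
print(fts_search(conn, "second OR last"))  # rows containing either word
```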

So our solution is to let our existing database handle regular data, and to create a separate, dedicated database for full-text search.


There are only a few open source full-text search engines. As far as I can tell, only one of them has a Flask extension: Whoosh, a full-text search engine written in Python. The advantage of a pure Python engine is that it runs anywhere a Python interpreter is available. The disadvantage is that its search performance does not match that of engines written in C or C++. In my opinion, the ideal solution would be a Flask extension that can connect to several search engines and lets us switch between them freely, much as Flask-SQLAlchemy does for databases, but for now nothing like that seems to exist for full-text search. Django developers have a great extension of this kind that supports most full-text search engines, called django-haystack. Hopefully one day someone will provide a similar extension for Flask.


For now, we will implement our full-text search with Whoosh. We will use the Flask-WhooshAlchemy extension, which integrates a Whoosh database with Flask-SQLAlchemy models.

If you haven't installed Flask-WhooshAlchemy in your virtual environment yet, install it now.

Windows users install with the following command:

flask\Scripts\pip install Flask-WhooshAlchemy

Other users install with the following command:


flask/bin/pip install Flask-WhooshAlchemy

Configuration

Configuring Flask-WhooshAlchemy is very simple. All we need to do is tell the extension the name of the full-text search database (file config.py):

WHOOSH_BASE = os.path.join(basedir, 'search.db')

Modifying the models

Since Flask-WhooshAlchemy integrates with Flask-SQLAlchemy, we specify which data should be indexed in the appropriate model class (file app/models.py):

from app import app, db
import flask.ext.whooshalchemy as whooshalchemy

class Post(db.Model):
    # fields to include in the full-text index
    __searchable__ = ['body']

    id = db.Column(db.Integer, primary_key = True)
    body = db.Column(db.String(140))
    timestamp = db.Column(db.DateTime)
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))

    def __repr__(self):
        return '<Post %r>' % (self.body)

# create the full-text index for the Post model
whooshalchemy.whoosh_index(app, Post)

The model has a new __searchable__ field, a list of all the database fields that should be included in the search index. In our case we only need to index the body field of our posts.

In this module we must also initialize the full-text index by calling the whoosh_index function.
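To make the idea of a searchable field concrete, the sketch below builds a tiny inverted index in plain Python: a mapping from each word to the ids of the records whose indexed fields contain it. This is only an illustration of the general technique, not how Whoosh is implemented. The build_index helper and the dictionary-based records are hypothetical stand-ins for the Post model.

```python
import re
from collections import defaultdict

SEARCHABLE = ["body"]  # plays the role of __searchable__ on the model


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(records, fields=SEARCHABLE):
    """Map every word found in the given fields to the ids that contain it."""
    index = defaultdict(set)
    for record in records:
        for field in fields:
            for word in tokenize(record[field]):
                index[word].add(record["id"])
    return index


posts = [
    {"id": 1, "body": "my first post", "nickname": "john"},
    {"id": 2, "body": "my second post", "nickname": "john"},
    {"id": 3, "body": "my third and last post", "nickname": "susan"},
]

index = build_index(posts)
print(sorted(index["post"]))    # ids of posts whose body contains "post"
print(sorted(index["second"]))  # ids of posts whose body contains "second"
print(sorted(index["john"]))    # empty, because nickname is not indexed
```

Only the fields named in SEARCHABLE end up in the index, which is why a search for a nickname finds nothing here.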

This change does not affect the format of our relational database, so we do not need to record a new database migration.

Unfortunately, any blog posts that were in the database before the full-text search engine was added are not indexed. To keep the database and the full-text search engine synchronized, we will delete all existing blog posts from the database and start over. First we start the Python interpreter. For Windows users:

flask\Scripts\python

For users of other operating systems:


flask/bin/python

Then, at the Python prompt, delete all blog posts:

>>> from app.models import Post
>>> from app import db
>>> for post in Post.query.all():
...     db.session.delete(post)
...
>>> db.session.commit()

Search

Now we are ready to start searching. First, let's add a few blog posts to the database. There are two ways to do this. We can start the application and add posts through the web browser as regular users would, or we can add them directly from the Python prompt.

To add them from the Python prompt:

>>> from app.models import User, Post
>>> from app import db
>>> import datetime
>>> u = User.query.get(1)
>>> p = Post(body='my first post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> p = Post(body='my second post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> p = Post(body='my third and last post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> db.session.commit()

The Flask-WhooshAlchemy extension is convenient because it hooks into Flask-SQLAlchemy commits automatically. We do not need to maintain the full-text index ourselves, because the extension does it for us.
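How an index can be kept current on every commit is easier to picture with a small stand-in. In the sketch below, a toy session records pending inserts and deletes and notifies its listeners once commit() has applied them, and a listener then updates a word index. All class and function names are invented for illustration, and the actual mechanism inside Flask-WhooshAlchemy may differ in its details.

```python
from collections import defaultdict


class ToySession:
    """A stand-in for a database session that reports committed changes."""

    def __init__(self):
        self.rows = {}       # committed rows: id -> body
        self.pending = []    # uncommitted (operation, id, body) tuples
        self.listeners = []  # callbacks invoked after each commit

    def add(self, row_id, body):
        self.pending.append(("insert", row_id, body))

    def delete(self, row_id):
        self.pending.append(("delete", row_id, None))

    def commit(self):
        changes, self.pending = self.pending, []
        for operation, row_id, body in changes:
            if operation == "insert":
                self.rows[row_id] = body
            else:
                self.rows.pop(row_id, None)
        for listener in self.listeners:
            listener(changes)


index = defaultdict(set)  # word -> ids of rows containing it


def update_index(changes):
    """Apply committed inserts and deletes to the word index."""
    for operation, row_id, body in changes:
        if operation == "insert":
            for word in body.lower().split():
                index[word].add(row_id)
        else:
            for ids in index.values():
                ids.discard(row_id)


session = ToySession()
session.listeners.append(update_index)

session.add(1, "my first post")
session.add(2, "my second post")
print(sorted(index["post"]))  # nothing is indexed before the commit
session.commit()
print(sorted(index["post"]))  # both rows are indexed after the commit

session.delete(1)
session.commit()
print(sorted(index["post"]))  # the deleted row is gone from the index
```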


Now that we have some posts in the full-text index, we can search:

>>> Post.query.whoosh_search('post').all()
[<Post u'my second post'>, <Post u'my first post'>, <Post u'my third and last post'>]
>>> Post.query.whoosh_search('second').all()
[<Post u'my second post'>]
>>> Post.query.whoosh_search('second OR last').all()
[<Post u'my second post'>, <Post u'my third and last post'>]

As you can see from the examples above, queries are not limited to single words. In fact, Whoosh supports a rich and powerful search query language.
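As a rough picture of how a query such as 'second OR last' can be evaluated, the sketch below handles a tiny subset of a query language: words separated by spaces must all be present, and the keyword OR separates alternatives. It is a hand-written illustration over an in-memory list of posts, not Whoosh's parser, which supports far more. The helper names are hypothetical.

```python
import re

POSTS = {
    1: "my first post",
    2: "my second post",
    3: "my third and last post",
}


def words(text):
    """Return the set of lowercase words in the text."""
    return set(re.findall(r"\w+", text.lower()))


def matches(body, query):
    """True if the body satisfies the query.

    The query is split on the keyword OR; within each alternative,
    every word must be present in the body.
    """
    body_words = words(body)
    for alternative in re.split(r"\s+OR\s+", query.strip()):
        required = words(alternative)
        if required and required <= body_words:
            return True
    return False


def run_query(query):
    """Return the ids of all posts matching the query, in id order."""
    return [post_id for post_id, body in sorted(POSTS.items())
            if matches(body, query)]


print(run_query("post"))            # every post contains "post"
print(run_query("second OR last"))  # posts containing either word
print(run_query("third post"))      # posts containing both words
```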

Integrating full-text search into the application

To make the search capability available to the users of our application, we need to add a few small changes.

Configuration

As far as configuration goes, we just need to specify the maximum number of search results to return (file config.py):

MAX_SEARCH_RESULTS = 50

Search form

We are going to add a search box to the navigation bar at the top of the page. Putting the search box there is convenient because the navigation bar is shared by all pages, so search is available everywhere.

First we add a search form class (file app/forms.py):

class SearchForm(Form):
    search = TextField('search', validators = [Required()])

Then we need to create a search form object and make it available to all templates, since the search form goes in the navigation bar that is common to all pages. The easiest way to do this is to create the form in the before_request handler and store it in Flask's global object g (file app/views.py):

@app.before_request
def before_request():
    g.user = current_user
    if g.user.is_authenticated():
        g.user.last_seen = datetime.utcnow()
        db.session.add(g.user)
        db.session.commit()
        # the search form is only needed for logged-in users
        g.search_form = SearchForm()

Then we add the form to our template (file app/templates/base.html):


<div>Microblog:
    <a href="{{ url_for('index') }}">Home</a>
    {% if g.user.is_authenticated() %}
    | <a href="{{ url_for('user', nickname = g.user.nickname) }}">Your Profile</a>
    | <form style="display: inline;" action="{{ url_for('search') }}" method="post" name="search">{{ g.search_form.hidden_tag() }}{{ g.search_form.search(size=20) }}<input type="submit" value="Search"></form>
    | <a href="{{ url_for('logout') }}">Logout</a>
    {% endif %}
</div>

Note that the search form is only displayed when a user is logged in. Likewise, the before_request handler creates the form only when a user is logged in, because our application does not show any content to users who are not authenticated.

Search view function

The action field of the form above was set to send all search requests to the search view function. This is where we will issue our full-text queries (file app/views.py):

@app.route('/search', methods = ['POST'])
@login_required
def search():
    if not g.search_form.validate_on_submit():
        return redirect(url_for('index'))
    return redirect(url_for('search_results', query = g.search_form.search.data))

This function does not do much. It just collects the search query from the form and then redirects to another page, passing the query as an argument. The reason the search is not performed here directly is that if the user then hits the refresh button, the browser shows a warning that form data is about to be resubmitted. Responding to a POST request with a redirect avoids this warning, because after the redirect the refresh button simply reloads the redirected page.
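The redirect-after-POST behavior described above can be demonstrated without Flask. The following sketch is a bare WSGI application that is called directly with hand-built request environments instead of running behind a server. The call helper and the response text are assumptions made for the demonstration. The POST handler never renders a page; it only answers with a redirect whose URL carries the query.

```python
import io
from urllib.parse import parse_qs, quote, unquote, urlencode


def application(environ, start_response):
    method = environ["REQUEST_METHOD"]
    path = environ["PATH_INFO"]
    if method == "POST" and path == "/search":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        form = parse_qs(environ["wsgi.input"].read(length).decode("utf-8"))
        query = form.get("search", [""])[0].strip()
        # Redirect instead of rendering; an empty query goes back to the index.
        if query:
            location = "/search_results/" + quote(query, safe="")
        else:
            location = "/"
        start_response("302 Found", [("Location", location)])
        return [b""]
    if method == "GET" and path.startswith("/search_results/"):
        query = path[len("/search_results/"):]
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [("Results for: " + query).encode("utf-8")]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def call(method, url, form=None):
    """Invoke the application as a server would; return (status, headers, body)."""
    body = urlencode(form or {}).encode("utf-8")
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": unquote(url),  # servers pass the decoded path to the app
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    payload = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], payload.decode("utf-8")


status, headers, _ = call("POST", "/search", {"search": "second OR last"})
print(status, headers["Location"])

# The browser follows the redirect with a GET, which is safe to refresh.
status, _, text = call("GET", headers["Location"])
print(status, text)
```

Because the page the user ends up on was fetched with GET, reloading it repeats only the GET and never resubmits the form.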


Search Results Page

Once the query string has been received, the form POST handler passes it to the search_results handler through a page redirect (file app/views.py):

from config import MAX_SEARCH_RESULTS

@app.route('/search_results/<query>')
@login_required
def search_results(query):
    # ask Whoosh for at most MAX_SEARCH_RESULTS matching posts
    results = Post.query.whoosh_search(query, MAX_SEARCH_RESULTS).all()
    return render_template('search_results.html',
        query = query,
        results = results)

The search results view function sends the query to Whoosh, passing the maximum number of search results as an argument, since we don't want to present a potentially huge page of results. We are happy showing just the first 50.
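The effect of a result limit can be sketched with a simple scoring function: each post is scored by how many times the query words occur in it, and only the best few posts are returned. The scoring rule is a deliberate simplification chosen for this example and is not the ranking that Whoosh applies. MAX_SEARCH_RESULTS is set to 2 here only to make the cut-off visible.

```python
import heapq
import re

MAX_SEARCH_RESULTS = 2  # the application uses 50; 2 keeps the demo short

POSTS = [
    "my first post",
    "a post about my second post",
    "my third and last post post post",
    "nothing relevant here",
]


def score(body, query):
    """Count how many times the query words occur in the body."""
    body_words = re.findall(r"\w+", body.lower())
    query_words = set(re.findall(r"\w+", query.lower()))
    return sum(1 for word in body_words if word in query_words)


def limited_search(query, limit=MAX_SEARCH_RESULTS):
    """Return at most `limit` matching posts, best score first."""
    scored = [(score(body, query), body) for body in POSTS]
    matching = [item for item in scored if item[0] > 0]
    best = heapq.nlargest(limit, matching, key=lambda item: item[0])
    return [body for _, body in best]


print(limited_search("post"))            # only the two best matches
print(limited_search("post", limit=10))  # all three matching posts
```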


The final piece is the search results template (file app/templates/search_results.html):

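<!-- extend base layout -->
{% extends "base.html" %}

{% block content %}
<h1>Search results for "{{ query }}":</h1>
{% for post in results %}
    {% include 'post.html' %}
{% endfor %}
{% endblock %}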

Here, once again, we can reuse our post.html sub-template, so we don't have to worry about how each post and its formatting elements are rendered, because the sub-template handles all of that in a generic way.

Postscript

With this we have completed another very important, though often overlooked, feature that any good web application should have.

The source code for the updated microblog application is available here:

Microblog-0.10.zip
