Full-text search in the Python Flask framework

Source: Internet
Author: User
This article describes how to implement full-text search in the Python Flask framework. This basic web feature is quite simple to implement.

Unfortunately, the full-text search support of relational databases is not standardized. Different databases use their own methods for full-text retrieval, and SQLAlchemy does not provide a good abstraction for full-text retrieval.

We currently use SQLite as our database, so we could bypass SQLAlchemy and create a full-text search index with the tools SQLite provides. But that is not a good idea, because if we switch to another database one day, we would have to rewrite the full-text search for that database.

So our solution is to let the existing database handle regular data, and to create a dedicated database for full-text search.
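
To make the idea of a dedicated full-text store more concrete, here is a minimal sketch of an inverted index kept apart from the regular records. It is a toy illustration only and not how Whoosh works internally; the names `records`, `tokenize`, `build_index`, and `lookup` are made up for this example.

```python
import re
from collections import defaultdict

# The "regular" data store: record id -> text (a stand-in for database rows).
records = {
    1: "my first post",
    2: "my second post",
    3: "my third and last post",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(rows):
    """Build a separate structure mapping each word to the ids containing it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for word in tokenize(text):
            index[word].add(row_id)
    return index

def lookup(index, word):
    """Return the sorted ids of all records that contain the word."""
    return sorted(index.get(word.lower(), set()))

index = build_index(records)
print(lookup(index, "post"))
print(lookup(index, "last"))
```

The records stay where they are; only the word-to-id mapping lives in the separate structure, which is why the search store can be replaced without touching the main database.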


There are only a few open-source full-text search engines. As far as I know, only one of them, Whoosh, has a Flask extension; it is a full-text search engine written in Python. The advantage of a pure-Python engine is that it runs anywhere a Python interpreter is available. The disadvantage is that its search performance is not as good as that of engines written in C or C++. In my mind, the ideal solution would be a Flask extension that can connect to most search engines and databases, much as Flask-SQLAlchemy lets us freely use most databases, but no such extension seems to exist right now. Django developers have a great extension that supports most full-text search engines, called django-haystack. I hope that one day someone will provide a similar extension for Flask.


For now we will use Whoosh to implement our full-text search. We will use the Flask-WhooshAlchemy extension, which integrates a Whoosh index with Flask-SQLAlchemy models.

If you have not yet installed the Flask-WhooshAlchemy extension in your virtual environment, install it now.

Windows users should run the following command:

Flask\Scripts\pip install Flask-WhooshAlchemy

Users of other operating systems should run:

Flask/bin/pip install Flask-WhooshAlchemy

Configuration

Configuring Flask-WhooshAlchemy is quite simple. We only need to tell the extension the name of the full-text search database (file config.py):

WHOOSH_BASE = os.path.join(basedir, 'search.db')
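
The setting above relies on the `os` module and on a `basedir` variable that are not shown here. As a minimal sketch, assuming `basedir` is the project directory (approximated below by the current working directory, which is an assumption of this example), the surrounding lines could look like this:

```python
import os

# Assumption for this sketch: the project directory is the current working
# directory. A real config.py would typically derive it from its own location.
basedir = os.path.abspath(os.getcwd())

# Location of the full-text search database, next to the other project files.
WHOOSH_BASE = os.path.join(basedir, 'search.db')
print(WHOOSH_BASE)
```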

Model changes

Because Flask-WhooshAlchemy works together with Flask-SQLAlchemy, we need to specify which data should be indexed in the appropriate model class (file app/models.py):

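from app import app, db
# flask.ext.* imports were removed in Flask 1.0; import the package directly
import flask_whooshalchemy as whooshalchemy

class Post(db.Model):
    # database fields that are added to the full-text index
    __searchable__ = ['body']

    id = db.Column(db.Integer, primary_key = True)
    body = db.Column(db.String(140))
    timestamp = db.Column(db.DateTime)
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))

    def __repr__(self):
        # the model has no "text" attribute, so show the body instead
        return '<Post %r>' % (self.body)

# initialize the full-text index for this model
whooshalchemy.whoosh_index(app, Post)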

The model has a new __searchable__ field, which is a list of all the database fields that should be included in the search index. In our project, we only need the body field of each post.

In this module we must also call the whoosh_index function to initialize the full-text index for the model.

This change does not affect the format of our relational database, so we do not need to change the database schema.
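
To illustrate what a list such as `__searchable__` makes possible, here is a small sketch in plain Python with no database involved. The helper `searchable_text` is hypothetical and is not part of Flask-WhooshAlchemy; it only shows how code could read the list and pick out the fields to index.

```python
class Post:
    """Plain-Python stand-in for the model; no database involved."""
    __searchable__ = ['body']

    def __init__(self, id, body, user_id):
        self.id = id
        self.body = body
        self.user_id = user_id

def searchable_text(obj):
    """Collect the value of every attribute listed in __searchable__."""
    fields = getattr(type(obj), '__searchable__', [])
    return {name: getattr(obj, name) for name in fields}

post = Post(id=1, body='my first post', user_id=7)
print(searchable_text(post))
```

Only `body` is picked up; `id` and `user_id` are ignored because they are not listed.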

Unfortunately, all the blog posts that were in the database before the full-text search engine was added are not indexed. To keep the database and the full-text index in sync, we will delete all existing blog posts and start again. First, open the Python interpreter. Windows users:

Flask\Scripts\python

Users of other operating systems:

Flask/bin/python

Then delete all blog posts from the Python command prompt:

>>> from app.models import Post
>>> from app import db
>>> for post in Post.query.all():
...     db.session.delete(post)
...
>>> db.session.commit()

Search

Now let's start searching. First, let's add a few blog posts to the database. There are two ways to do this: you can run the application and add posts through the web browser, or you can add them directly from the Python command line.

Use the following method to add from the command line:

>>> from app.models import User, Post
>>> from app import db
>>> import datetime
>>> u = User.query.get(1)
>>> p = Post(body='my first post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> p = Post(body='my second post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> p = Post(body='my third and last post', timestamp=datetime.datetime.utcnow(), author=u)
>>> db.session.add(p)
>>> db.session.commit()

The Flask-WhooshAlchemy extension is convenient because it hooks into Flask-SQLAlchemy commits automatically. We do not need to maintain the full-text index ourselves; the extension does it for us transparently.
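
The idea of indexing on commit can be pictured with a toy model. The sketch below is not the extension's code; `ToySession`, `index_batch`, and `search_index` are invented for this example and only show how a callback that runs after each commit keeps an index up to date without any extra work from the caller.

```python
class ToySession:
    """Minimal stand-in for a database session with an after-commit hook."""

    def __init__(self):
        self.pending = []        # objects added but not yet committed
        self.committed = []      # objects that have been committed
        self.after_commit = []   # callbacks called with each committed batch

    def add(self, obj):
        self.pending.append(obj)

    def commit(self):
        batch, self.pending = self.pending, []
        self.committed.extend(batch)
        for callback in self.after_commit:
            callback(batch)

search_index = {}  # word -> set of post bodies containing that word

def index_batch(batch):
    """Index every newly committed text, word by word."""
    for body in batch:
        for word in body.lower().split():
            search_index.setdefault(word, set()).add(body)

session = ToySession()
session.after_commit.append(index_batch)

session.add('my first post')
session.add('my second post')
print(sorted(search_index))          # still empty before the commit
session.commit()
print(sorted(search_index['post']))  # both posts are indexed after the commit
```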


Now that we have some posts in the full-text index, we can search for them:

>>> Post.query.whoosh_search('post').all()
[<Post ...>, <Post ...>, <Post ...>]
>>> Post.query.whoosh_search('second').all()
[<Post ...>]
>>> Post.query.whoosh_search('second OR last').all()
[<Post ...>, <Post ...>]

The example above shows that a query does not need to be limited to a single word. In fact, Whoosh provides a rich and powerful search query language.
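
As a rough illustration of what a multi-word query means, the following toy evaluator handles only two cases: alternatives separated by the keyword OR, and several words that must all be present. It is a sketch written for this article, with made-up names (`posts`, `words`, `matches`, `search`), and it covers only a tiny fraction of what Whoosh's real query language offers.

```python
import re

posts = ['my first post', 'my second post', 'my third and last post']

def words(text):
    """Return the set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(query, text):
    """True if the text satisfies the query.

    Alternatives are separated by the keyword OR; within one alternative,
    every word must be present in the text.
    """
    present = words(text)
    for alternative in query.split(' OR '):
        required = words(alternative)
        if required and required <= present:
            return True
    return False

def search(query):
    """Return the posts that satisfy the query, in their original order."""
    return [p for p in posts if matches(query, p)]

print(search('second'))
print(search('second OR last'))
print(search('my last'))
```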

Integrating full-text search into the application

To make the search function available to our application's users, we need to make a few small changes.

Configuration

In terms of configuration, we only need to specify the maximum number of search results to return (file config.py):

MAX_SEARCH_RESULTS = 50

Search form

We need to add a search box to the navigation bar at the top of the page. Putting the search box there works well, because all pages share the navigation bar and therefore every page gets a search box.

First, we add a search form class (file app/forms.py):

class SearchForm(Form):
    search = TextField('search', validators = [Required()])

Then we need to create a search form object and make it available to all templates, because the search form goes in the navigation bar that is common to all pages. The simplest way to do this is to create the form in the before_request handler and store it in Flask's global variable g (file app/views.py):

@app.before_request
def before_request():
    g.user = current_user
    if g.user.is_authenticated():
        g.user.last_seen = datetime.utcnow()
        db.session.add(g.user)
        db.session.commit()
        g.search_form = SearchForm()

Then we add the form to our template (file app/templates/base.html):

Microblog: Home {% if g.user.is_authenticated() %} | Your Profile | | Logout {% endif %}

Note: the search box is displayed only when a user is logged in. Likewise, the before_request handler creates the form only when a user is logged in. This is because our application does not show any content to unauthenticated users.

Search view function

The form's action field, set above, sends all search requests to the search view function. This is where we want to issue our full-text queries (file app/views.py):

@app.route('/search', methods = ['POST'])
@login_required
def search():
    if not g.search_form.validate_on_submit():
        return redirect(url_for('index'))
    return redirect(url_for('search_results', query = g.search_form.search.data))

This function does not do much. It just collects the search query from the form and redirects to another page, passing the query as an argument. The search is not performed directly here for a reason: if a user hit the refresh button on a page returned by a POST request, the browser would pop up a warning that the form data will be submitted again. When the response to a POST request is a redirect, this warning is avoided, because the refresh button simply reloads the redirected page.
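
Because the redirect carries the query inside the URL, a query containing spaces or slashes has to be escaped. In the application, url_for builds the URL for us; the sketch below uses only the Python 3 standard library and a hypothetical helper, `search_results_path`, to show roughly what such a URL can look like. The exact escaping Flask applies may differ.

```python
from urllib.parse import quote, unquote

def search_results_path(query):
    """Build the path the browser is redirected to for a given query."""
    # safe='' also encodes '/', so the query stays a single path segment.
    return '/search_results/' + quote(query, safe='')

path = search_results_path('second OR last')
print(path)

# Decoding the last path segment gives back the original query.
print(unquote(path.rsplit('/', 1)[1]))
```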


Search result page

Once the query has been received, the form's POST handler passes it to the search_results handler through a page redirect (file app/views.py):

# MAX_SEARCH_RESULTS is defined in config.py
from config import MAX_SEARCH_RESULTS

@app.route('/search_results/<query>')
@login_required
def search_results(query):
    results = Post.query.whoosh_search(query, MAX_SEARCH_RESULTS).all()
    return render_template('search_results.html',
        query = query,
        results = results)

The search results view sends the query to Whoosh and passes the maximum number of results as an argument. We do not want to render a potentially huge page of results, so we only show the first 50.
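
Capping the number of results amounts to ranking the matches and keeping only the best ones. The following sketch shows the idea with a deliberately naive score (how often the query word occurs); Whoosh uses its own scoring, and the names `posts`, `score`, and `top_results` as well as the small limit are assumptions made for this example.

```python
MAX_SEARCH_RESULTS = 2  # kept small here so the cut-off is visible

posts = [
    'my first post',
    'post about a post',
    'my third and last post',
    'unrelated',
]

def score(query, text):
    """Count how many times the query word occurs in the text."""
    return text.lower().split().count(query.lower())

def top_results(query, limit):
    """Return at most `limit` matching posts, highest score first."""
    scored = [(score(query, p), p) for p in posts]
    matching = [(s, p) for s, p in scored if s > 0]
    # sorted() is stable, so posts with equal scores keep their original order
    best = sorted(matching, key=lambda pair: pair[0], reverse=True)[:limit]
    return [p for _, p in best]

print(top_results('post', MAX_SEARCH_RESULTS))
```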


The last piece is the search results template (file app/templates/search_results.html):

{% extends "base.html" %}

{% block content %}
Search results for "{{query}}":
{% for post in results %}
    {% include 'post.html' %}
{% endfor %}
{% endblock %}

Here we can reuse our post.html sub-template, so we don't have to worry about rendering individual posts or other page elements again, because all of that is handled generically in the sub-template.

Postscript

We have now completed a very important and often overlooked feature, one that any good web application should have.

You can find the source code of the updated microblog application here:

Microblog-0.10.zip
