Index selection for PostgreSQL

Source: Internet
Author: User
Tags: postgresql

When you add an index in PostgreSQL to speed up full-text search or fuzzy queries, there are generally two options: the GiST index type and the GIN index type. The official documentation gives the following guidance:

There are substantial performance differences between the two index types, so it is important to understand their characteristics.

A GiST index is lossy, meaning that the index may produce false matches, and it is necessary to check the actual table row to eliminate such false matches. (PostgreSQL does this automatically when needed.) GiST indexes are lossy because each document is represented in the index by a fixed-length signature. The signature is generated by hashing each word into a single bit in an n-bit string, with all these bits OR-ed together to produce an n-bit document signature. When two words hash to the same bit position there will be a false match. If all words in the query have matches (real or false) then the table row must be retrieved to see if the match is correct.

Lossiness causes performance degradation due to unnecessary fetches of table records that turn out to be false matches. Since random access to table records is slow, this limits the usefulness of GiST indexes. The likelihood of false matches depends on several factors, in particular the number of unique words, so using dictionaries to reduce this number is recommended.

GIN indexes are not lossy for standard queries, but their performance depends logarithmically on the number of unique words. (However, GIN indexes store only the words (lexemes) of tsvector values, and not their weight labels. Thus a table row recheck is needed when using a query that involves weights.)

In choosing which index type to use, GiST or GIN, consider these performance differences:

  • GIN index lookups are about three times faster than GiST
  • GIN indexes take about three times longer to build than GiST
  • GIN indexes are moderately slower to update than GiST indexes, but about ten times slower if fast-update support is disabled (see Section 54.3.1 for details)
  • GIN indexes are two-to-three times larger than GiST indexes

As a rule of thumb, GIN indexes are best for static data because lookups are faster. For dynamic data, GiST indexes are faster to update. Specifically, GiST indexes are very good for dynamic data and fast if the number of unique words (lexemes) is under 100,000, while GIN indexes will handle 100,000+ lexemes better but are slower to update.

Note that GIN index build time can often be improved by increasing maintenance_work_mem, while GiST index build time is not sensitive to that parameter.
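To make the idea of a lossy signature concrete, here is a minimal Python sketch of the scheme the quoted text describes: each word is hashed to one bit, the bits are OR-ed into a document signature, and any index hit is rechecked against the real words. It is only an illustration. The hash function (SHA-256), the 64-bit default width, and all function names are assumptions made for this example and do not reflect how PostgreSQL implements GiST internally.

```python
import hashlib

SIGNATURE_BITS = 64  # assumed width for illustration; not PostgreSQL's real value


def word_bit(word, n_bits=SIGNATURE_BITS):
    """Hash a word to a single bit position in an n-bit string."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return 1 << (int.from_bytes(digest[:8], "big") % n_bits)


def signature(words, n_bits=SIGNATURE_BITS):
    """OR the bits of all words together into one document signature."""
    sig = 0
    for word in words:
        sig |= word_bit(word, n_bits)
    return sig


def index_match(doc_sig, query_words, n_bits=SIGNATURE_BITS):
    """True if every query bit is set in the document signature.

    This can be a false match: a set bit may come from a different word.
    """
    query_sig = signature(query_words, n_bits)
    return (doc_sig & query_sig) == query_sig


def search(documents, query_words, n_bits=SIGNATURE_BITS):
    """Return (real_matches, false_matches) as two lists of document ids.

    The recheck against the actual words plays the role of fetching the
    table row to eliminate false matches.
    """
    real, false = [], []
    for doc_id, words in documents.items():
        if index_match(signature(words, n_bits), query_words, n_bits):
            if set(query_words) <= set(words):
                real.append(doc_id)
            else:
                false.append(doc_id)
    return real, false


docs = {
    1: ["fat", "cat", "sat"],
    2: ["fat", "rat"],
    3: ["dog", "barks"],
}

# With a 1-bit signature every word collides, so every document is an index
# hit and two of the three turn out to be false matches after the recheck.
print(search(docs, ["cat"], n_bits=1))  # ([1], [2, 3])

# A wider signature makes collisions less likely, but never impossible.
print(search(docs, ["cat"])[0])  # [1]
```

The sketch never misses a real match; it can only return extra candidates, and each extra candidate costs a wasted row fetch. That is the performance penalty the quoted passage attributes to lossiness.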

Reference: http://www.postgresql.org/docs/9.2/static/textsearch-indexes.html
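The rule of thumb quoted above can also be written down as a small decision helper. The following Python sketch is a rough encoding under stated assumptions: the function names, the index naming pattern, and the choice of GIN for dynamic data with 100,000 or more lexemes are illustrative readings of the guidance, not an official algorithm. The generated statement follows the standard `CREATE INDEX ... USING gin|gist (to_tsvector(...))` form; table and column names are hypothetical and are not quoted or validated here.

```python
LEXEME_THRESHOLD = 100_000  # the "100,000 unique words" figure from the quoted text


def suggest_index_type(unique_lexemes, mostly_static):
    """Suggest 'gin' or 'gist' following the quoted rule of thumb.

    Assumption: for dynamic data at or above the threshold we pick GIN,
    accepting slower updates, because the text says GIN handles
    100,000+ lexemes better.
    """
    if mostly_static:
        return "gin"  # lookups are faster, and updates are rare
    if unique_lexemes < LEXEME_THRESHOLD:
        return "gist"  # faster to update, still fast with few lexemes
    return "gin"


def create_index_sql(table, column, index_type, config="english"):
    """Build a CREATE INDEX statement for a full-text expression index.

    The identifiers are inserted as-is, so only pass trusted names.
    """
    if index_type not in ("gin", "gist"):
        raise ValueError("index_type must be 'gin' or 'gist'")
    return (
        f"CREATE INDEX {table}_{column}_{index_type}_idx ON {table} "
        f"USING {index_type} (to_tsvector('{config}', {column}));"
    )


# Hypothetical table "docs" with a text column "body".
kind = suggest_index_type(unique_lexemes=50_000, mostly_static=False)
print(create_index_sql("docs", "body", kind))
```

Treat the output as a starting point only: the factors of three and ten in the quoted text are approximate, so measuring on your own data is still the safer way to decide.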

    • This article is from: Linux Learning Tutorial Network
