In PostgreSQL, when you want an index to speed up full-text search or fuzzy queries, there are generally two options: the GiST type and the GIN type. The official documentation gives the following guidance:
There are substantial performance differences between the two index types, so it is important to understand their characteristics.

A GiST index is lossy, meaning that the index may produce false matches, and it is necessary to check the actual table row to eliminate such false matches. (PostgreSQL does this automatically when needed.) GiST indexes are lossy because each document is represented in the index by a fixed-length signature. The signature is generated by hashing each word into a single bit in an n-bit string, with all these bits OR-ed together to produce an n-bit document signature. When two words hash to the same bit position there will be a false match. If all words in the query have matches (real or false) then the table row must be retrieved to see if the match is correct.

Lossiness causes performance degradation due to unnecessary fetches of table records that turn out to be false matches. Since random access to table records is slow, this limits the usefulness of GiST indexes. The likelihood of false matches depends on several factors, in particular the number of unique words, so using dictionaries to reduce this number is recommended.

GIN indexes are not lossy for standard queries, but their performance depends logarithmically on the number of unique words. (However, GIN indexes store only the words (lexemes) of tsvector values, and not their weight labels. Thus a table row recheck is needed when using a query that involves weights.)

In choosing which index type to use, GiST or GIN, consider these performance differences:

- GIN index lookups are about three times faster than GiST
- GIN indexes take about three times longer to build than GiST
- GIN indexes are moderately slower to update than GiST indexes, but about ten times slower if fast-update support is disabled (see Section 54.3.1 for details)
- GIN indexes are two-to-three times larger than GiST indexes

As a rule of thumb, GIN indexes are best for static data because lookups are faster. For dynamic data, GiST indexes are faster to update. Specifically, GiST indexes are very good for dynamic data and fast if the number of unique words (lexemes) is under 100,000, while GIN indexes will handle 100,000+ lexemes better but are slower to update.

Note that GIN index build time can often be improved by increasing maintenance_work_mem, while GiST index build time is not sensitive to that parameter.
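
To make the "lossy signature" idea in the quoted passage more concrete, here is a minimal Python sketch of hashing each word to one bit, OR-ing the bits into a document signature, and rechecking the real document to drop false matches. It is only an illustration: the function names, the 64-bit width, and the use of MD5 as the hash are assumptions made for this example, not PostgreSQL's actual implementation.

```python
import hashlib

SIGNATURE_BITS = 64  # illustrative width only; not PostgreSQL's real signature length


def word_bit(word, nbits=SIGNATURE_BITS):
    """Map a word to a single bit position (MD5 is a stand-in hash, an assumption)."""
    digest = hashlib.md5(word.encode("utf-8")).digest()
    return 1 << (int.from_bytes(digest[:8], "big") % nbits)


def signature(words, nbits=SIGNATURE_BITS):
    """OR the per-word bits together into one n-bit document signature."""
    sig = 0
    for w in words:
        sig |= word_bit(w, nbits)
    return sig


def may_match(doc_sig, query_words, nbits=SIGNATURE_BITS):
    """True if every query word's bit is set; this can be a real or a false match."""
    q = signature(query_words, nbits)
    return (doc_sig & q) == q


def search(docs, query_words, nbits=SIGNATURE_BITS):
    """Filter by signature first, then recheck the actual words to drop false matches."""
    candidates = [d for d in docs if may_match(signature(d, nbits), query_words, nbits)]
    return [d for d in candidates if set(query_words) <= set(d)]


if __name__ == "__main__":
    docs = [["fat", "cat"], ["lazy", "dog"]]
    # With a 1-bit signature every word collides, so the filter alone is useless...
    print(may_match(signature(docs[0], 1), ["dog"], 1))
    # ...but the recheck against the real document still returns only true matches.
    print(search(docs, ["dog"], nbits=1))
```

Shrinking the signature to a single bit forces every word onto the same position, which shows the extreme case: the signature test passes for every document, and only the recheck step (the "fetch the table row" cost the documentation describes) keeps the result correct. A wider signature or fewer unique words makes such collisions less likely.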
Reference: http://www.postgresql.org/docs/9.2/static/textsearch-indexes.html
- This article is from: Linux Learning Tutorial Network
Index selection for PostgreSQL