Efficiency issues with SELECT in PostgreSQL

Source: Internet
Author: User
Tags mysql query psql rand

On the understanding, I see a question like this:

MySQL Query select * FROM table where ID in (hundreds of or thousands of IDs) how to improve efficiency? Modify the e-commerce website, a product attribute table, hundreds of thousands of records, 80M, index only the primary key ID, do such a query how to improve efficiency?
SELECT * FROM table where ID in (hundreds of or thousands of IDs)
These IDs are not regular, scattered ....

Look at the answer, feel a lot of unreliable, but accreditations, so on my computer wrote a few query test. I use Postgresql9.4, but I feel that MySQL should also be similar, first create a simple table, only a simple 3 columns, under the question many people mentioned the need to look at the size of the table, in fact, the problem is not related to the size of the table, only the size of the index, because the index is built on the int, so only and The number of records concerned.
 Table" Public. T9 "Column |Type|Modifiers--------+----------------+-----------C1| integer        |C2| character( -)|C3| character( $)|Indexes: "I1"UNIQUE, Btree (C1)

INSERT into T9 values (Generate_series (1000,500000,1), repeat (' a ', ' n '), repeat (' B ', 180));

Then generate some random numbers on Mac with Jot,linux on Shuf

 for ((i=0;i<100000; i++))  Do  1600000 >>rand. file  Done

Then generate a query statement based on Rand.file:

Select *  fromT9whereC1inch (494613,575087,363588,527650,251670,343456,426858,202886,254037,...1);

Generate 3 SQL files respectively, in the number of variables are 100,1000 and 10,000, execute these 3 SQL files, see the time

?  Try Psql study-f Test_100.sql-o/dev/nullLOG:  2.879  ms?   -F Test_1000.sql-o/dev/nullLOG  :11.974  ms  ? -F Test_10000.sql-o/dev/nullLOG  :355.689 ms

You can see that the data time will vary greatly only when the data in the in is 10,000, but it is only done in more than 300 Ms.

So if you follow some of the answers, build a temporary table, then use in subquery, and hopefully you can join two tables at this time? For the sake of simplicity, I directly join with two tables.

Drop Tablet_tmp;Create TableT_tmp (IDint);Insert  intoT_tmp (ID)Values(494613),(575087),(363588),(345980),
... (1)
;SelectT9.* fromT9, T_tmpwhereT9.c1=T_tmp.id;

How about the time?

- - /Dev/nullLOG:  2.078  mslog:  1.233  mslog:  224.112  mslog:  322.108 Ms

Except for the drop and create time, still spent a lot of time, the premise is still my SSD disk, so write log time will be much faster. Why is it so slow? Take a look at the explain, this time the data volume is large, directly go to merge Join

What about the efficiency of the 1000 rows of data?

- - Exp . out log:  2.476  mslog:  0.967  mslog:  2.391  msLOG:  8.780 ms

100 rows of data are as follows:

?  Try Psql study-f Test_create_100.sql-o/dev/nullLOG:  2.020  mslog:    1.028  mslog:  1.074  mslog:  1.912 ms

You can see that the way to create a table with 100 values and 1000 values is not much better than writing all the variables directly in the in, explain is using NLJ. But in the case of a larger amount of data (according to the original problem, where the number in fact is not predictable) the efficiency will be lower, coupled with additional table maintenance costs and redundant SQL statements, the DBA certainly do not like, or trust the database, be assured that the bold directly in the list to solve these problems.

Efficiency issues with SELECT in PostgreSQL

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.