Discussion on the efficiency of SELECT ... IN on PostgreSQL

Source: Internet
Author: User
Tags mysql query postgresql psql rand

I came across the following question:

How can the MySQL query SELECT * FROM table WHERE id IN (hundreds or thousands of IDs) be made more efficient?

An e-commerce website has a product attribute table with several hundred thousand records, about 80 MB, indexed only on the primary key ID. How can a query like this be made more efficient?

SELECT * FROM table WHERE id IN (hundreds or thousands of IDs)

These IDs follow no pattern and are scattered ....

Many of the answers looked unreliable to me, so I wrote a few test queries on my own machine. I am using PostgreSQL 9.4, but I expect MySQL to behave about the same. First, create a simple table with only three columns. Many people answering the question said you need to look at the size of the table. In fact, the problem has nothing to do with the size of the table, only with the size of the index, and because the index is on an int column, that in turn depends only on the number of records.

Table "public.t9"
 Column |      Type      | Modifiers
--------+----------------+-----------
 c1     |                |
 c2     | character(...) |
 c3     | character(...) |
Indexes:

Then generate some random numbers, using jot on a Mac or shuf on Linux:

for ((i=0;i<100000;i++))
do
jot -r 1 1000 600000 >>rand.file
done

Then generate the query statements from rand.file:

select * from t9 where c1 in (
494613,
575087,
363588,
527650,
251670,
343456,
426858,
202886,
254037,
...
1
);
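
The article does not show how the query file was built from the random numbers. As a minimal sketch, assuming the IDs are plain integers, the statement could be assembled as below. The helper names build_in_query and random_ids are hypothetical; the table t9, the column c1 and the 1000 to 600000 range come from the article.

```python
import random


def build_in_query(ids, table="t9", column="c1"):
    """Return a SELECT ... IN (...) statement listing the given integer IDs."""
    if not ids:
        # An empty IN list is not valid SQL, so refuse it early.
        raise ValueError("need at least one id")
    # int() validates each value and keeps anything non-numeric out of the SQL text.
    values = ",\n".join(str(int(i)) for i in ids)
    return f"select * from {table} where {column} in (\n{values}\n);\n"


def random_ids(count, low=1000, high=600000, seed=None):
    """Draw `count` random integers in [low, high], like repeated `jot -r 1 1000 600000`."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(count)]
```

For example, build_in_query(random_ids(100)) returns the text of a 100-value query, which can be written to a file and run with psql -f. Like the jot loop, this draws with replacement, so duplicate IDs are possible.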

Generate three SQL files whose IN lists contain 100, 1,000 and 10,000 values respectively, run all three, and look at the time taken:

try psql study -f test_100.sql -o /dev/null
LOG:  duration: 2.879 ms
try psql study -f test_1000.sql -o
LOG:  duration: 11.974 ms
try psql study -f test_10000.sql -o /dev/null

You can see that the time changes noticeably only when the IN list reaches 10,000 values, and even then the query completes in a little over 300 ms.

What if, as some answers suggest, you build a temporary table and then use an IN subquery, hoping the database will join the two tables? For simplicity, I just join the two tables directly.

drop table t_tmp;
create table t_tmp (id int);
insert into t_tmp (id) values
(494613),
(575087),
(363588),
(345980),
...
(1);
select t9.* from t9, t_tmp where t9.c1 = t_tmp.id;
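
A script of this shape can be generated rather than typed by hand. The sketch below is one way to do it, not the author's tooling: the helper name build_temp_table_script is hypothetical, the join condition t9.c1 = t_tmp.id is assumed here, and "if exists" is added so the first run does not fail on a missing table.

```python
def build_temp_table_script(ids, tmp_table="t_tmp", table="t9", column="c1"):
    """Return a script that loads the IDs into a scratch table and joins it to the main table."""
    if not ids:
        raise ValueError("need at least one id")
    # One "(id)" row per value, sent as a single multi-row INSERT.
    rows = ",\n".join(f"({int(i)})" for i in ids)
    return (
        f"drop table if exists {tmp_table};\n"
        f"create table {tmp_table} (id int);\n"
        f"insert into {tmp_table} (id) values\n{rows};\n"
        # Assumed join condition: match the scratch table's id against the indexed column.
        f"select {table}.* from {table}, {tmp_table} where {table}.{column} = {tmp_table}.id;\n"
    )
```

The script contains four statements (drop, create, insert, select), so a run with statement timing enabled should report four durations.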

What about the time?

try psql study -f test_create_10000.sql -o /dev/null
LOG:  duration: 2.078 ms
LOG:  duration: 1.233 ms
LOG:  duration: 224.112 ms

Even after subtracting the drop and create time, it still takes 500+ ms, and that is on an SSD, so log writes are already fast. Why is it so slow? EXPLAIN shows that with this much data the planner goes straight to a merge join.

What about the efficiency with 1,000 rows?

try psql study -f test_create_1000.sql -o exp.out
LOG:  duration: 2.476 ms
LOG:  duration: 0.967 ms
LOG:  duration: 2.391 ms

The figures for 100 rows are as follows:

try psql study -f test_create_100.sql -o /dev/null
LOG:  duration: 2.020 ms
LOG:  duration: 1.028 ms
LOG:  duration: 1.074 ms
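
Comparing the runs means adding up per-statement durations, sometimes leaving out the setup statements. A small sketch of that arithmetic follows, assuming the log output has been captured as text. The helper names durations_ms and total_ms are hypothetical, and the pattern only relies on the "duration: N ms" shape seen in the output above.

```python
import re

# Matches lines such as "LOG:  duration: 2.879 ms" and the looser "log:duration:2.879 Ms".
DURATION_RE = re.compile(r"duration:\s*([0-9]+(?:\.[0-9]+)?)\s*ms", re.IGNORECASE)


def durations_ms(log_text):
    """Return every statement duration found in the log text, in milliseconds."""
    return [float(m) for m in DURATION_RE.findall(log_text)]


def total_ms(log_text, skip=0):
    """Sum the durations, skipping the first `skip` statements (for example drop and create)."""
    return sum(durations_ms(log_text)[skip:])
```

Calling total_ms(log, skip=2) leaves out the first two statements, which is the "without drop and create" figure discussed in the text.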

You can see that with 100 and 1,000 values, the CREATE TABLE approach is no better than writing all the values directly into the IN list; EXPLAIN shows it using a nested loop join (NLJ). With larger amounts of data (and per the original question, the number of values in the IN list cannot be predicted in advance), its efficiency only gets worse. Add the extra table maintenance cost and the redundant SQL statements, and DBAs will certainly not like it. So trust the database, and feel free to use an IN list to solve this kind of problem.

That covers the efficiency of SELECT ... IN on PostgreSQL. I hope it is helpful to everyone!
