Exploring the efficiency of select ... in in PostgreSQL

Last Update:2016-04-15 Source: Internet

Author: User

Tags psql



This question came up on Zhihu:

How can a MySQL query of the form select * from table where id in (hundreds or thousands of IDs) be made more efficient?

An e-commerce website has a product attribute table with hundreds of thousands of records, about 80 MB, indexed only on the primary key id. How can the efficiency of a query like this be improved?

select * from table where id in (hundreds or thousands of IDs)

The IDs follow no pattern and are scattered ....

After reading the answers, I felt many of them were unreliable but had nothing concrete to say in reply, so I wrote a few queries on my own computer to test it. I used PostgreSQL 9.4, but I think MySQL should behave similarly. First, create a simple table with just three columns. Many people answering the question said the table size needs to be checked. In fact, this problem has nothing to do with the table size; it depends only on the index size. Because the index is built on an int column, its size depends only on the number of records.

Table "public.t9"
 Column |      Type      | Modifiers
--------+----------------+-----------
 c1     | integer        |
 c2     | character(100) |
 c3     | character(200) |
Indexes:
    "i1" UNIQUE, btree (c1)

insert into t9 values(generate_series(1000,500000,1),repeat('a',90),repeat('b',180));

Next, generate some random numbers. I used jot on a Mac; on Linux, shuf can be used instead.

for ((i=0;i<100000;i++))
do
    jot -r 1 1000 600000 >>rand.file
done
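The loop above starts one process per number. As a rough sketch of an alternative, the helper below asks a single tool for all the numbers at once. The function name gen_random_ids is made up for illustration, and the awk fallback is an addition for systems that have neither shuf nor jot.

```shell
#!/usr/bin/env bash
# gen_random_ids COUNT LOW HIGH
# Hypothetical helper: prints COUNT random integers in [LOW, HIGH], one per line.
gen_random_ids() {
    local count=$1 low=$2 high=$3
    if command -v shuf >/dev/null 2>&1; then
        # GNU coreutils: -r allows repeated values, -n sets how many to print.
        shuf -r -n "$count" -i "${low}-${high}"
    elif command -v jot >/dev/null 2>&1; then
        # BSD/macOS: jot -r COUNT LOW HIGH prints COUNT random integers.
        jot -r "$count" "$low" "$high"
    else
        # Fallback (an assumption, not from the article): awk's rand().
        awk -v n="$count" -v lo="$low" -v hi="$high" \
            'BEGIN { srand(); for (i = 0; i < n; i++) print lo + int(rand() * (hi - lo + 1)) }'
    fi
}

# Possible usage, mirroring the loop in the article:
#   gen_random_ids 100000 1000 600000 > rand.file
```

Like the original loop, this can produce duplicate values and values above 500000 that do not exist in t9.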

Then the query statement is generated from rand.file:

select * from t9 where c1 in (494613,575087,363588,527650,251670,343456,426858,202886,254037,...1);
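The article does not show how the query text was produced from rand.file. One possible way, sketched under the assumption that rand.file holds one integer per line, is to take the first N lines and join them with commas. The helper name make_in_query is hypothetical.

```shell
#!/usr/bin/env bash
# make_in_query FILE COUNT
# Hypothetical helper: prints a select ... in (...) statement built from the
# first COUNT lines of FILE (one integer per line).
make_in_query() {
    local file=$1 count=$2
    local ids
    # head keeps the first COUNT ids; paste joins the lines with commas.
    ids=$(head -n "$count" "$file" | paste -sd, -)
    printf 'select * from t9 where c1 in (%s);\n' "$ids"
}

# Write one SQL file per list size, but only if rand.file is present.
if [ -f rand.file ]; then
    for n in 100 1000 10000; do
        make_in_query rand.file "$n" > "test_${n}.sql"
    done
fi
```

The file names follow the test_100.sql, test_1000.sql, and test_10000.sql names that appear in the timing runs.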

Three SQL files are generated, with 100, 1000, and 10000 values in the IN list respectively. Run the three files and check the time.

try psql study -f test_100.sql -o /dev/null
LOG: duration: 2.879 ms
try psql study -f test_1000.sql -o /dev/null
LOG: duration: 11.974 ms
try psql study -f test_10000.sql -o /dev/null
LOG: duration: 355.689 ms

The time only changes substantially once the IN list reaches 10,000 values, and even then the query completes in a little over 300 ms.

Some answers suggest creating a temporary table and then using an IN subquery, in the hope that the two tables can be joined. For simplicity, I joined the two tables directly.

drop table t_tmp;
create table t_tmp(id int);
insert into t_tmp (id) values(494613),(575087),(363588),(345980),...(1);
select t9.* from t9, t_tmp
where t9.c1 = t_tmp.id;
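A script of this shape could be generated from the same rand.file. The sketch below is one way to do it, not necessarily how the author did. The helper name make_tmp_table_script is hypothetical, and it writes drop table if exists rather than the plain drop table shown above, so that a first run does not fail when t_tmp is missing.

```shell
#!/usr/bin/env bash
# make_tmp_table_script FILE COUNT
# Hypothetical helper: prints the drop/create/insert/select script for the
# first COUNT ids in FILE (one integer per line).
make_tmp_table_script() {
    local file=$1 count=$2
    local values
    # Wrap each id in parentheses, then join the rows with commas.
    values=$(head -n "$count" "$file" | sed 's/.*/(&)/' | paste -sd, -)
    # "if exists" is a deviation from the article's script (see lead-in).
    printf 'drop table if exists t_tmp;\n'
    printf 'create table t_tmp(id int);\n'
    printf 'insert into t_tmp (id) values %s;\n' "$values"
    printf 'select t9.* from t9, t_tmp where t9.c1 = t_tmp.id;\n'
}

# Possible usage:
#   make_tmp_table_script rand.file 10000 > test_create_10000.sql
```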

How long does it take?

try psql study -f test_create_10000.sql -o /dev/null
LOG: duration: 2.078 ms
LOG: duration: 1.233 ms
LOG: duration: 224.112 ms
LOG: duration: 322.108 ms

Even leaving out the drop and create times, this still takes more than 500 ms (224.112 ms for the insert plus 322.108 ms for the join). Note that I ran this on an SSD, so writing the log is already fast. Why is it so slow? The explain command shows that with this much data the planner goes straight to a merge join.
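To look at the plan yourself, one option is to prefix the select in an existing script with explain and feed the result to psql. The sketch below does only the text rewriting. The helper name add_explain is hypothetical, it matches only lines that begin with a lowercase select, and the plan you see may differ from the one described here depending on your data and PostgreSQL version.

```shell
#!/usr/bin/env bash
# add_explain
# Hypothetical helper: reads SQL on stdin and puts "explain " in front of
# every line that starts with a lowercase "select ". Other lines pass through.
add_explain() {
    sed 's/^select /explain select /'
}

# Possible usage (assumes the "study" database and script from the article):
#   add_explain < test_create_10000.sql | psql study
```

The drop, create, and insert statements in the script still run as usual; only the select is replaced by its plan.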

How does it perform with 1000 rows?

try psql study -f test_create_1000.sql -o exp.out
LOG: duration: 2.476 ms
LOG: duration: 0.967 ms
LOG: duration: 2.391 ms
LOG: duration: 8.780 ms

The results for 100 rows are as follows:

try psql study -f test_create_100.sql -o /dev/null
LOG: duration: 2.020 ms
LOG: duration: 1.028 ms
LOG: duration: 1.074 ms
LOG: duration: 1.912 ms

With 100 and 1000 values, the create-table approach is no better than writing all the values directly in the IN list, and explain shows that a nested loop join (NLJ) is used. When the data volume is larger (in the original question the number of values in the IN list is unpredictable), it only gets less efficient. On top of that, it adds table maintenance costs and extra SQL statements, which DBAs certainly will not like. Trust the database and use an IN list to solve this kind of problem with confidence.

That covers the efficiency of select ... in in PostgreSQL. I hope it helps!


