Exploring the efficiency of SELECT ... IN in PostgreSQL
I came across this question on Zhihu:
How can a MySQL query of the form select * from table where id in (hundreds or thousands of IDs) be made more efficient?
An e-commerce website has a product attribute table with hundreds of thousands of records, about 80 MB in size, and the only index is the primary key id. How can the efficiency of a query like this be improved?
Select * from table where id in (hundreds or thousands of IDS)
The IDs follow no pattern and are scattered ....
After reading the answers, I found many of them unreliable, but rather than argue I wrote a few queries on my own computer to test it. I used PostgreSQL 9.4, but I expect MySQL to behave similarly. First, create a simple table with only three columns. Many answers to the question mention checking the table size. In fact, the table size is irrelevant here; only the index size matters. Because the index is built on an int column, its size depends only on the number of records.
         Table "public.t9"
 Column |      Type      | Modifiers
--------+----------------+-----------
 c1     | integer        |
 c2     | character(100) |
 c3     | character(200) |
Indexes:
    "i1" UNIQUE, btree (c1)

insert into t9 values(generate_series(1000,500000,1),repeat('a',90),repeat('b',180));
Next, generate some random numbers. I used jot on the Mac; on Linux, shuf can be used instead.
for ((i=0;i<100000;i++))
do
    jot -r 1 1000 600000 >>rand.file
done
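The loop above starts one process per number. As a rough alternative, the sketch below produces the whole file in a single awk call, which behaves the same way on macOS and Linux. The function name `gen_random_ids` is hypothetical and not part of the original test. As with `jot -r`, values may repeat.

```shell
# Hypothetical helper: print COUNT random integers in [LO, HI], one per line.
# Arguments: $1 = COUNT, $2 = LO, $3 = HI.
gen_random_ids() {
  awk -v n="$1" -v lo="$2" -v hi="$3" 'BEGIN {
    srand()                      # seed from the current time
    for (i = 0; i < n; i++)
      print lo + int(rand() * (hi - lo + 1))
  }'
}

# Same count and range as the loop above:
# gen_random_ids 100000 1000 600000 > rand.file
```

Since the first argument of jot is the repetition count, `jot -r 100000 1000 600000 > rand.file` should give an equivalent file on the Mac without the loop.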
Then generate the query statements from rand.file:
select * from t9 where c1 in (494613,575087,363588,527650,251670,343456,426858,202886,254037,...1);
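The original does not show how the statement was assembled from rand.file. One possible way, sketched here with standard tools, is to take the first N lines, join them with commas, and wrap the result in the select. The helper name `build_in_query` is an assumption made for illustration.

```shell
# Hypothetical helper: read IDs (one per line) on stdin and print a single
# select statement with the first N of them in the IN list. $1 = N.
build_in_query() {
  head -n "$1" |
    paste -sd, - |                                          # join lines with commas
    sed 's/^/select * from t9 where c1 in (/; s/$/);/'      # wrap in the query
}

# One file per IN-list size:
# build_in_query 100   < rand.file > test_100.sql
# build_in_query 1000  < rand.file > test_1000.sql
# build_in_query 10000 < rand.file > test_10000.sql
```

For example, feeding the lines 5, 7, 9 and asking for three values prints `select * from t9 where c1 in (5,7,9);`.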
Three SQL files are generated, with 100, 1000, and 10000 values in the IN list respectively. Run these three SQL files and check the time.
try psql study -f test_100.sql -o /dev/null
LOG: duration: 2.879 ms
try psql study -f test_1000.sql -o /dev/null
LOG: duration: 11.974 ms
try psql study -f test_10000.sql -o /dev/null
LOG: duration: 355.689 ms
The time only changes significantly when the number of values in the IN list reaches 10,000, and even then the query completes in a little over 300 ms.
Some answers suggest creating a temporary table and then using an IN subquery. What about joining the two tables instead? For simplicity, I joined the two tables directly.
drop table t_tmp;
create table t_tmp(id int);
insert into t_tmp (id) values(494613),(575087),(363588),(345980),...(1);
select t9.* from t9, t_tmp
where t9.c1 = t_tmp.id;
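A script of this shape can be generated from rand.file as well. The following is only a sketch of one way to do it: each ID is wrapped in parentheses and the pieces are joined into a single multi-row insert. The helper name `build_join_script` is hypothetical.

```shell
# Hypothetical helper: read IDs (one per line) on stdin and print the
# drop / create / insert / join script for the first N of them. $1 = N.
build_join_script() {
  echo 'drop table t_tmp;'
  echo 'create table t_tmp(id int);'
  head -n "$1" |
    sed 's/.*/(&)/' |                                       # 494613 -> (494613)
    paste -sd, - |                                          # join rows with commas
    sed 's/^/insert into t_tmp (id) values/; s/$/;/'
  echo 'select t9.* from t9, t_tmp where t9.c1 = t_tmp.id;'
}

# build_join_script 10000 < rand.file > test_create_10000.sql
```

The four statements are printed in the same order as in the script above, so the four durations logged by a run correspond to drop, create, insert, and select.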
How long does this take?
try psql study -f test_create_10000.sql -o /dev/null
LOG: duration: 2.078 ms
LOG: duration: 1.233 ms
LOG: duration: 224.112 ms
LOG: duration: 322.108 ms
Even excluding the drop and create time, it still takes more than 500 ms (224 ms for the insert plus 322 ms for the join). Note that I am using an SSD, so the time spent writing the log is much shorter than it would otherwise be. Why is it so slow? Looking at the plan with the explain command shows that, with this much data, a merge join is used directly.
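The original does not show the exact explain invocation. As a minimal sketch, the hypothetical helper `explain_query` below prefixes a statement with `explain (analyze, buffers)`, so that psql prints the plan with actual timings instead of the result rows. The join method appears in the plan as a node such as Merge Join or Nested Loop.

```shell
# Hypothetical helper: prefix the SQL statement read from stdin with
# explain. With the analyze option PostgreSQL also executes the statement
# and reports actual row counts and timings for each plan node.
explain_query() {
  printf 'explain (analyze, buffers)\n'
  cat
}

# Assumes t_tmp has already been created and filled:
# echo 'select t9.* from t9, t_tmp where t9.c1 = t_tmp.id;' | explain_query | psql study
```

Because the analyze option runs the statement for real, the reported time includes the join itself and can be compared with the logged durations.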
How does it perform with 1000 rows?
try psql study -f test_create_1000.sql -o exp.out
LOG: duration: 2.476 ms
LOG: duration: 0.967 ms
LOG: duration: 2.391 ms
LOG: duration: 8.780 ms
With 100 rows, the results are as follows:
try psql study -f test_create_100.sql -o /dev/null
LOG: duration: 2.020 ms
LOG: duration: 1.028 ms
LOG: duration: 1.074 ms
LOG: duration: 1.912 ms
With 100 and 1000 values, the create-table approach is no better than writing all the values directly in the IN list, and explain shows that a nested loop join (NLJ) is used. With larger data volumes (and according to the original question, the number of values in the IN list is unpredictable), it is only less efficient. In addition, it brings extra table maintenance cost and extra SQL statements, which DBAs will certainly not like. Trust the database, and use the IN list with confidence to solve this kind of problem.
That covers the efficiency of SELECT ... IN in PostgreSQL. I hope it helps!