Mr. Huang took an SQL statement and asked me if I could optimize it. After reading it, the data volume is not large and the query is not complex. The record optimization process is as follows:
DB: PostgreSQL 9.1
OS: CentOS 6
Count (d_personal_report_view )~ 9 K entries. The field type involved in the amount is numeric (9, 2 ).
Original SQL:
select distinct c.doctor_name,c.department_name,c.hospital_id,c.hospital_name,c.part_hospital_name,t.* from d_personal_report_view c,( select dc.part_hospital_id,dc.department_id,dc.doctor_id, coalesce(sum(coalesce(dc.surgery_amount, '0.0')), '0.0') as surgery_amount,sum(coalesce(dc.surgery_quantity, 0)) as surgery_quantity, coalesce(sum(coalesce(dc.repair_amount, '0.0')), '0.0') as repair_amount,sum(coalesce(dc.repair_quantity, 0)) as repair_quantity, coalesce(sum(coalesce(dc.orthod_amount, '0.0')), '0.0') as orthod_amount,sum(coalesce(dc.orthod_quantity, 0)) as orthod_quantity, coalesce(sum(coalesce(dc.radiation_amount, '0.0')), '0.0') as radiation_amount,sum(coalesce(dc.radiation_quantity, 0)) as radiation_quantity, coalesce(sum(coalesce(dc.teethcln_amount, '0.0')), '0.0') as teethcln_amount,sum(coalesce(dc.teethcln_quantity, 0)) as teethcln_quantity, coalesce(sum(coalesce(dc.crop_amount, '0.0')), '0.0') as crop_amount,sum(coalesce(dc.crop_quantity, 0)) as crop_quantity, coalesce(sum(coalesce(dc.assay_amount, '0.0')), '0.0') as assay_amount,sum(coalesce(dc.assay_quantity, 0)) as assay_quantity, coalesce(sum(coalesce(dc.drugs_amount, '0.0')), '0.0') as drugs_amount,sum(coalesce(dc.drugs_quantity, 0)) as drugs_quantity, coalesce(sum(coalesce(dc.regist_amount, '0.0')), '0.0') as regist_amount,sum(coalesce(dc.regist_quantity, 0)) as regist_quantity, coalesce(sum(coalesce(dc.others_amount, '0.0')), '0.0') as others_amount,sum(coalesce(dc.others_quantity, 0)) as others_quantity, coalesce(sum(surgery_amount + repair_amount + orthod_amount + radiation_amount + teethcln_amount + crop_amount + assay_amount + drugs_amount + regist_amount + others_amount), '0.0') as totalRowAmount, coalesce(sum(surgery_quantity + repair_quantity + orthod_quantity + radiation_quantity + teethcln_quantity + crop_quantity + assay_quantity + drugs_quantity + regist_quantity + others_quantity), 0) as totalRowQuantity from d_personal_report_view dcwhere 1=1 group by dc.part_hospital_id,dc.department_id,dc.doctor_id order by dc.part_hospital_id,dc.doctor_idlimit 10 offset 0) as t where c.part_hospital_id=t.part_hospital_id and c.department_id=t.department_id and c.doctor_id=t.doctor_id order by t.part_hospital_id,t.department_id,t.doctor_id;
I. Analysis
The results show that 10 different data records are used for report presentation. The same table is used for two associations without any query conditions, and the index is not very effective. Analyze this SQL statement first.
Ii. Optimization process
1. Cancel the association between the two tables and obtain the optimized SQL statement only once:
select dc.doctor_name,dc.department_name,dc.hospital_name,dc.part_hospital_name,dc.part_hospital_id,dc.department_id,dc.doctor_id, coalesce(sum(coalesce(dc.surgery_amount, '0.0')), '0.0') as surgery_amount,sum(coalesce(dc.surgery_quantity, 0)) as surgery_quantity, coalesce(sum(coalesce(dc.repair_amount, '0.0')), '0.0') as repair_amount,sum(coalesce(dc.repair_quantity, 0)) as repair_quantity, coalesce(sum(coalesce(dc.orthod_amount, '0.0')), '0.0') as orthod_amount,sum(coalesce(dc.orthod_quantity, 0)) as orthod_quantity, coalesce(sum(coalesce(dc.radiation_amount, '0.0')), '0.0') as radiation_amount,sum(coalesce(dc.radiation_quantity, 0)) as radiation_quantity, coalesce(sum(coalesce(dc.teethcln_amount, '0.0')), '0.0') as teethcln_amount,sum(coalesce(dc.teethcln_quantity, 0)) as teethcln_quantity, coalesce(sum(coalesce(dc.crop_amount, '0.0')), '0.0') as crop_amount,sum(coalesce(dc.crop_quantity, 0)) as crop_quantity, coalesce(sum(coalesce(dc.assay_amount, '0.0')), '0.0') as assay_amount,sum(coalesce(dc.assay_quantity, 0)) as assay_quantity, coalesce(sum(coalesce(dc.drugs_amount, '0.0')), '0.0') as drugs_amount,sum(coalesce(dc.drugs_quantity, 0)) as drugs_quantity, coalesce(sum(coalesce(dc.regist_amount, '0.0')), '0.0') as regist_amount,sum(coalesce(dc.regist_quantity, 0)) as regist_quantity, coalesce(sum(coalesce(dc.others_amount, '0.0')), '0.0') as others_amount,sum(coalesce(dc.others_quantity, 0)) as others_quantity, coalesce(sum(surgery_amount + repair_amount + orthod_amount + radiation_amount + teethcln_amount + crop_amount + assay_amount + drugs_amount + regist_amount + others_amount), '0.0') as totalRowAmount, coalesce(sum(surgery_quantity + repair_quantity + orthod_quantity + radiation_quantity + teethcln_quantity + crop_quantity + assay_quantity + drugs_quantity + regist_quantity + others_quantity), 0) as totalRowQuantity from d_personal_report_view dcwhere 1=1 group by dc.doctor_name,dc.department_name,dc.hospital_name,dc.part_hospital_name,dc.part_hospital_id,dc.department_id,dc.doctor_id order by dc.part_hospital_id,dc.doctor_idlimit 10 offset 0
View the execution plan:
2. the cost in the execution plan is still a little high. When sum in postgresql is optimized again, the null value is automatically replaced with 0, and the null values such as ''or'' are not saved, therefore, you can remove the coalesce conversion function.
Optimized SQL:
select dc.doctor_name,dc.department_name,dc.hospital_name,dc.part_hospital_name,dc.part_hospital_id,dc.department_id,dc.doctor_id,sum(dc.surgery_amount) as surgery_amount,sum(dc.surgery_quantity) as surgery_quantity,sum(dc.repair_amount) as repair_amount,sum(dc.repair_quantity) as repair_quantity,sum(dc.orthod_amount) as orthod_amount,sum(dc.orthod_quantity) as orthod_quantity,sum(dc.radiation_amount) as radiation_amount,sum(dc.radiation_quantity) as radiation_quantity,sum(dc.teethcln_amount) as teethcln_amount,sum(dc.teethcln_quantity) as teethcln_quantity,sum(dc.crop_amount) as crop_amount,sum(dc.crop_quantity) as crop_quantity,sum(dc.assay_amount) as assay_amount,sum(dc.assay_quantity) as assay_quantity,sum(dc.drugs_amount) as drugs_amount,sum(dc.drugs_quantity) as drugs_quantity,sum(dc.regist_amount) as regist_amount,sum(dc.regist_quantity) as regist_quantity,sum(dc.others_amount) as others_amount,sum(dc.others_quantity) as others_quantity,sum(surgery_amount + repair_amount + orthod_amount + radiation_amount + teethcln_amount + crop_amount + assay_amount + drugs_amount + regist_amount + others_amount) as totalRowAmount,sum(surgery_quantity + repair_quantity + orthod_quantity + radiation_quantity + teethcln_quantity + crop_quantity + assay_quantity + drugs_quantity + regist_quantity + others_quantity) as totalRowQuantity from d_personal_report_view dcwhere 1=1 group by dc.doctor_name,dc.department_name,dc.hospital_name,dc.part_hospital_name,dc.part_hospital_id,dc.department_id,dc.doctor_id order by dc.part_hospital_id,dc.doctor_idlimit 10 offset 0
View the execution plan:
The results are the same. It also shows that PostgreSQL's Internal Conversion Function consumes a lot.
3. to narrow down the query base and improve the query performance, all the query statements before the full table scan the basic table and sort and group by rules, and finally obtain 10 results, in fact, a certain amount of data can be obtained from another perspective, without the need for a full table, but it can generally meet the different data results, and then take 10 more.
Optimized SQL
select dc.doctor_name,dc.department_name,dc.hospital_name,dc.part_hospital_name,dc.part_hospital_id,dc.department_id,dc.doctor_id,sum(dc.surgery_amount) as surgery_amount,sum(dc.surgery_quantity) as surgery_quantity,sum(dc.repair_amount) as repair_amount,sum(dc.repair_quantity) as repair_quantity,sum(dc.orthod_amount) as orthod_amount,sum(dc.orthod_quantity) as orthod_quantity,sum(dc.radiation_amount) as radiation_amount,sum(dc.radiation_quantity) as radiation_quantity,sum(dc.teethcln_amount) as teethcln_amount,sum(dc.teethcln_quantity) as teethcln_quantity,sum(dc.crop_amount) as crop_amount,sum(dc.crop_quantity) as crop_quantity,sum(dc.assay_amount) as assay_amount,sum(dc.assay_quantity) as assay_quantity,sum(dc.drugs_amount) as drugs_amount,sum(dc.drugs_quantity) as drugs_quantity,sum(dc.regist_amount) as regist_amount,sum(dc.regist_quantity) as regist_quantity,sum(dc.others_amount) as others_amount,sum(dc.others_quantity) as others_quantity,sum(surgery_amount + repair_amount + orthod_amount + radiation_amount + teethcln_amount + crop_amount + assay_amount + drugs_amount + regist_amount + others_amount) as totalRowAmount,sum(surgery_quantity + repair_quantity + orthod_quantity + radiation_quantity + teethcln_quantity + crop_quantity + assay_quantity + drugs_quantity + regist_quantity + others_quantity) as totalRowQuantity from (select * from d_personal_report_view limit 1000) as dcgroup by dc.doctor_name,dc.department_name,dc.hospital_name,dc.part_hospital_name,dc.part_hospital_id,dc.department_id,dc.doctor_id order by dc.part_hospital_id,dc.doctor_idlimit 10 offset 0
Now let's take a look at the execution plan: the effect has been significantly improved, especially when the basic table expands rapidly, the response time is much faster.
This method is also applicable to other databases, but must match the business results.