After the first two rounds of optimization, Saiku from the non-use, optimized to use, but in the analysis of a large number of log data, but also the feeling of the card! Continue observing the SQL behind the execution and decide to focus on the index!
The main use of the log is: fixed date dimension Data analysis, that is, where conditions must follow the date equals a day, then the tangle is: Each field is indexed, or the date to establish a federated index. It boils down to the index efficiency of a single field versus the efficiency of a federated index!
PostgreSQL Data Sheet: Saiku_search_detail
Table structure:
CREATE TABLETest.saiku_search_detail (rpt_date date, from_area_idbigint, from_value_idbigint, in_track_idbigint, GIDcharacter varying, current_city_idbigint, dist_city_idbigint, category_name_idbigint, page_idbigint, utmr_page_idbigint, Numbigint, idbigint, partnersmallint)
Number of articles: 8,510,490. About 8.51 million.
Test steps: First, bare table
To query for a date:
1.1 Single condition
Select Count (1) from Test.saiku_search_detail where = ' 2016-05-13 '
Results: 1110ms
"Aggregate (Cost=160934.85..160934.86Rows=1Width=0)"" -Seq Scan onSaiku_search_detail (Cost=0.00..160816.78Rows=47230Width=0) "" Filter: (rpt_date= '2016-05-13'::d ate) "
1.2 of two conditions
Select Count (1) from Test.saiku_search_detail where = ' 2016-05-13 ' and = 135
Results: 1782ms
"Aggregate (Cost=184432.32..184432.33Rows=1Width=0)"" -Seq Scan onSaiku_search_detail (Cost=0.00..184431.73Rows=236Width=0) "" Filter: ((rpt_date= '2016-05-13'::d ate) and(from_area_id= 135))"
No objection, 0 index!
Second, add an index to two fields, respectively:
-- Btree Index CREATE INDEX Saiku_search_detail_from_area_id_idx on Saiku_search_detail USING btree (from_area_id); -- Hash Index CREATE INDEX Saiku_search_detail_rpt_date_idx on Saiku_search_detail USING Hash (rpt_date);
2.1 Single condition
Select Count (1) from Saiku_search_detail where = ' 2016-05-13 '
Results: 83MS
"Aggregate (Cost=8.02..8.03Rows=1Width=0)"" - IndexScan using Saiku_search_detail_rpt_date_idx onSaiku_search_detail (Cost=0.00..8.02Rows=1Width=0)"" IndexCond: (rpt_date= '2016-05-13'::d ate) "
Using an index
2.2 of two conditions
Select Count (1) from Saiku_search_detail where = ' 2016-05-13 ' and = 135
Results: 149ms
"Aggregate (Cost=8.02..8.03Rows=1Width=0)"" - IndexScan using Saiku_search_detail_rpt_date_idx onSaiku_search_detail (Cost=0.00..8.02Rows=1Width=0)"" IndexCond: (rpt_date= '2016-05-13'::d ate) "" Filter: (from_area_id= 135)"
An index was used and the second index did not take effect. Try to modify the condition Order of sql:
Select Count (1) from Saiku_search_detail where = 135 and = ' 2016-05-13 '
The same result! This means that in PostgreSQL, two indexed fields are created, only one will work!
Iii. establishment of a joint index
-- composite index, two fields are added to an index CREATE INDEX Saiku_search_detail_rpt_date_from_area_idx on Test.saiku_search_detail USING btree (rpt_date, from_area_id);
3.1 Single criteria query & first field of index
Select Count (1) from Test.saiku_search_detail where = ' 2016-05-13 '
Results: 66ms
"Aggregate (Cost=47843.00..47843.01Rows=1Width=0)"" -Bitmap Heap Scan onSaiku_search_detail (Cost=2220.63..47362.94Rows=192025Width=0) "" Recheck Cond: (Rpt_date= '2016-05-13'::d ate) "" -BitmapIndexScan onSaiku_search_detail_rpt_date_from_area_idx (Cost=0.00..2172.62Rows=192025Width=0)"
Visible partial indexes are used
3.2 Two conditional queries
Select Count (1) from Test.saiku_search_detail where = ' 2016-05-13 ' and = 135
Results: 65ms
"Aggregate (Cost=46124.99..46125.00Rows=1Width=0)"" -Bitmap Heap Scan onSaiku_search_detail (Cost=1509.67..45857.37Rows=107047Width=0) "" Recheck Cond: ((rpt_date= '2016-05-13'::d ate) and(from_area_id= 135))"" -BitmapIndexScan onSaiku_search_detail_rpt_date_from_area_idx (Cost=0.00..1482.90Rows=107047Width=0)"
Using an index
Summarize
- Nonsense: If two fields are filtered, then the federated index is optimal.
- Benefit: During the log parsing process, the individual field indexes are not functional except for the date, and should be deleted
- Tangle: Just create a single index on a date, or build multiple composite indexes that contain dates? Make your own decisions based on the usage scenario.
Saiku speed Optimization (iii)