Saiku speed Optimization (iii)

Source: Internet
Author: User

After the first two rounds of optimization, Saiku from the non-use, optimized to use, but in the analysis of a large number of log data, but also the feeling of the card! Continue observing the SQL behind the execution and decide to focus on the index!

The main use of the log is: fixed date dimension Data analysis, that is, where conditions must follow the date equals a day, then the tangle is: Each field is indexed, or the date to establish a federated index. It boils down to the index efficiency of a single field versus the efficiency of a federated index!

PostgreSQL Data Sheet: Saiku_search_detail

Table structure:

CREATE TABLETest.saiku_search_detail (rpt_date date, from_area_idbigint, from_value_idbigint, in_track_idbigint, GIDcharacter varying, current_city_idbigint, dist_city_idbigint, category_name_idbigint, page_idbigint, utmr_page_idbigint, Numbigint, idbigint, partnersmallint)

Number of articles: 8,510,490. About 8.51 million.

Test steps: First, bare table

To query for a date:

1.1 Single condition

Select  Count (1)  from Test.saiku_search_detail where = ' 2016-05-13 '

Results: 1110ms

"Aggregate (Cost=160934.85..160934.86Rows=1Width=0)""   -Seq Scan onSaiku_search_detail (Cost=0.00..160816.78Rows=47230Width=0) "" Filter: (rpt_date= '2016-05-13'::d ate) "

1.2 of two conditions

Select  Count (1)  from Test.saiku_search_detail where = ' 2016-05-13 '  and = 135

Results: 1782ms

"Aggregate (Cost=184432.32..184432.33Rows=1Width=0)""   -Seq Scan onSaiku_search_detail (Cost=0.00..184431.73Rows=236Width=0) "" Filter: ((rpt_date= '2016-05-13'::d ate) and(from_area_id= 135))"

No objection, 0 index!

Second, add an index to two fields, respectively:
-- Btree Index CREATE INDEX Saiku_search_detail_from_area_id_idx    on Saiku_search_detail  USING btree  (from_area_id); -- Hash Index CREATE INDEX Saiku_search_detail_rpt_date_idx    on Saiku_search_detail  USING Hash  (rpt_date);

2.1 Single condition

Select  Count (1)  from Saiku_search_detail where = ' 2016-05-13 '

Results: 83MS

"Aggregate (Cost=8.02..8.03Rows=1Width=0)""   -  IndexScan using Saiku_search_detail_rpt_date_idx onSaiku_search_detail (Cost=0.00..8.02Rows=1Width=0)""        IndexCond: (rpt_date= '2016-05-13'::d ate) "

Using an index

2.2 of two conditions

Select  Count (1)  from Saiku_search_detail where = ' 2016-05-13 '  and = 135

Results: 149ms

"Aggregate (Cost=8.02..8.03Rows=1Width=0)""   -  IndexScan using Saiku_search_detail_rpt_date_idx onSaiku_search_detail (Cost=0.00..8.02Rows=1Width=0)""        IndexCond: (rpt_date= '2016-05-13'::d ate) "" Filter: (from_area_id= 135)"

An index was used and the second index did not take effect. Try to modify the condition Order of sql:

Select  Count (1)  from Saiku_search_detail where = 135  and = ' 2016-05-13 '

The same result! This means that in PostgreSQL, two indexed fields are created, only one will work!

Iii. establishment of a joint index
-- composite index, two fields are added to an index CREATE INDEX Saiku_search_detail_rpt_date_from_area_idx    on Test.saiku_search_detail  USING btree  (rpt_date, from_area_id);  

3.1 Single criteria query & first field of index

Select  Count (1)  from Test.saiku_search_detail where = ' 2016-05-13 '

Results: 66ms

"Aggregate (Cost=47843.00..47843.01Rows=1Width=0)""   -Bitmap Heap Scan onSaiku_search_detail (Cost=2220.63..47362.94Rows=192025Width=0) "" Recheck Cond: (Rpt_date= '2016-05-13'::d ate) "" -BitmapIndexScan onSaiku_search_detail_rpt_date_from_area_idx (Cost=0.00..2172.62Rows=192025Width=0)"

Visible partial indexes are used

3.2 Two conditional queries

Select  Count (1)  from Test.saiku_search_detail where = ' 2016-05-13 '  and = 135

Results: 65ms

"Aggregate (Cost=46124.99..46125.00Rows=1Width=0)""   -Bitmap Heap Scan onSaiku_search_detail (Cost=1509.67..45857.37Rows=107047Width=0) "" Recheck Cond: ((rpt_date= '2016-05-13'::d ate) and(from_area_id= 135))""         -BitmapIndexScan onSaiku_search_detail_rpt_date_from_area_idx (Cost=0.00..1482.90Rows=107047Width=0)"

Using an index

Summarize
    • Nonsense: If two fields are filtered, then the federated index is optimal.
    • Benefit: During the log parsing process, the individual field indexes are not functional except for the date, and should be deleted
    • Tangle: Just create a single index on a date, or build multiple composite indexes that contain dates? Make your own decisions based on the usage scenario.

Saiku speed Optimization (iii)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.