Using parallel multi-database queries to improve the performance of Runqian collection reports

Source: Internet
Author: User

When a report processes a large amount of data, its performance is often poor, and at that scale optimizing the SQL or the report itself often brings no obvious improvement. If the data is instead stored in separate databases, segmented by some rule (such as time), the report can access multiple databases at the same time, compute on each, and summarize the results in the report. This parallel multi-database approach improves report performance.

General reporting tools cannot fetch in parallel and summarize the results, so reading segmented data from multiple databases has to be done in a high-level language such as Java. However, such parallel programs are not easy to write in Java, and because Java lacks basic support for bulk data calculation, expression parameters and dynamic data structures are not supported. This makes it difficult for general reporting tools to use the parallel multi-database approach directly to improve report performance.

The Runqian collection report includes a built-in compute engine with parallel computing capabilities, which lets users read data from multiple databases simultaneously and summarize it on the report side, improving report performance.

For implementation details, refer to the collection report documentation. Here is a simple example of parallel multi-database usage, using MySQL.

A telecom company stores its user service usage data split by statistical region across 4 MySQL databases. The service usage statistics report needs to filter by the specified time period, brand, and other conditions, and then summarize the data. The steps for using a collection report for parallel multi-database queries are as follows:

1. Use the collection report's built-in calculation script tool to write a parallel script that fetches from multiple databases and then summarizes the results.

Parallel scripts:

[Image: parallel script, Report5_performance_paralleldatabase_1.jpg, http://s3.51cto.com/wyfs02/M02/58/49/wKioL1St-cODAcY4AAIyDxPsSZs680.jpg]

The script above starts 4 threads that fetch data from the 4 databases at the same time, and finally merges the results into a summary. The cells have the following meanings:

A1: Specifies the names of the 4 data sources; each parallel thread connects to a different database;

A2: Executes the code block in this cell using multiple threads; 4 child threads are started here;

B3: Each thread connects to its own data source;

B4: Sends the SQL to the specified data source, which summarizes within the database and returns the results. At this point the 4 databases each execute their own SQL statement;

B5: Closes the database connection;

B6: Returns the result of the child thread, and the child thread ends;

A7: Merges the results returned by the child threads;

A8: Summarizes the merged results again;

A9: Returns a result set to the report.
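The original script is only available as an image, so the flow of cells A1 to A9 is sketched here in Python as an illustration, not as the collection report's own script syntax. Everything in it is an assumption made for the example: the data source names, the service_usage table and its columns, the sample rows, and the use of in-memory SQLite databases as stand-ins for the 4 MySQL databases.

```python
import sqlite3
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the four data sources named in A1.
# Each in-memory SQLite database plays the role of one MySQL database.
SAMPLE_ROWS = {
    "db1": [("BrandA", "2014-01-05", 10), ("BrandB", "2014-01-06", 5)],
    "db2": [("BrandA", "2014-01-07", 7)],
    "db3": [("BrandB", "2014-01-08", 3), ("BrandC", "2014-02-01", 4)],
    "db4": [("BrandA", "2014-03-01", 100)],  # outside the queried period
}

# Assumed table and column names; the aggregation runs inside each database (B4).
SQL = """
    SELECT brand, SUM(usage_amount)
    FROM service_usage
    WHERE use_date BETWEEN ? AND ?
    GROUP BY brand
"""

def fetch_partial(source_name, start, end):
    """Runs in a child thread: connect, aggregate in the database, close, return (B3-B6)."""
    conn = sqlite3.connect(":memory:")  # a real setup would connect to MySQL here
    try:
        conn.execute(
            "CREATE TABLE service_usage (brand TEXT, use_date TEXT, usage_amount INTEGER)"
        )
        conn.executemany(
            "INSERT INTO service_usage VALUES (?, ?, ?)", SAMPLE_ROWS[source_name]
        )
        return conn.execute(SQL, (start, end)).fetchall()
    finally:
        conn.close()

def parallel_summary(sources, start, end):
    # A2: one worker thread per data source.
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        partials = list(pool.map(lambda s: fetch_partial(s, start, end), sources))
    # A7: merge the rows returned by the child threads.
    merged = [row for part in partials for row in part]
    # A8: the same brand can appear in several partial results, so sum again.
    totals = defaultdict(int)
    for brand, amount in merged:
        totals[brand] += amount
    # A9: return a sorted result set for the report.
    return sorted(totals.items())

result = parallel_summary(["db1", "db2", "db3", "db4"], "2014-01-01", "2014-02-28")
print(result)
```

Each worker opens and closes its own connection, so no connection is shared between threads, and only the small per-brand partial results travel back to the caller.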

There are two key points in parallel multi-database operation. The first is the ability to make multiple databases work in parallel (lines 2-6), which requires a simple parallel programming mechanism from the reporting engine. The second is the ability to summarize the results of the parallel computation (lines 7 and 8): because the results from the individual databases may contain duplicate data, they need to be summarized again, which requires the report engine to have strong bulk data recalculation capability.
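To illustrate the second point, here is a minimal Python sketch of re-summarizing partial results whose group keys repeat across databases. The row layout (brand, total, count) and the sample values are hypothetical. The sketch assumes each database returns a sum and a count per brand, so that an overall average can be derived after merging.

```python
from collections import defaultdict

def combine_partials(partials):
    """Re-aggregate per-database partial results keyed by brand.

    Each partial row is (brand, total, count); this layout is illustrative.
    """
    acc = defaultdict(lambda: [0, 0])
    for rows in partials:
        for brand, total, count in rows:
            acc[brand][0] += total
            acc[brand][1] += count
    # Sums and counts add up directly; the average is derived only at the end.
    return {
        brand: {"total": t, "count": c, "avg": t / c}
        for brand, (t, c) in acc.items()
    }

partials = [
    [("BrandA", 10, 2), ("BrandB", 5, 1)],  # rows from one database
    [("BrandA", 20, 1)],                    # rows from another database
]
summary = combine_partials(partials)
print(summary)
```

Averaging the two per-database averages for BrandA (5 and 20) would give 12.5, while the true average over all 3 records is 10, which is why the second summary works from sums and counts.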

2. Call the script above from the collection report, and edit the report expressions to complete the report.

The parallel multi-database approach suits cases where the source data volume is large but the amount of data fetched out is not, with the data stored across separate databases and then aggregated in the report. The example above opens one connection per database to fetch data; a parallel program can also open multiple connections and query them simultaneously, which improves report query efficiency and report performance.


This article is from the "High performance report data calculation" blog; please keep this source: http://report5.blog.51cto.com/8028595/1600587
