When a report works against a large volume of data, its performance is often poor, and at that scale optimizations on the SQL side or the report side often bring little visible gain. If the data is instead split by some rule (such as time) and stored in segments across several databases, the report can access those databases at the same time, let each one compute on its own data, and summarize the results on the report side. This parallel multi-database approach can improve report performance.
General reporting tools cannot fetch in parallel and then summarize, so reading segmented data from multiple databases has to be done in a high-level language such as Java. Writing such parallel programs in Java is not easy, however. Java also lacks basic support for bulk data calculation and does not support expression parameters or dynamic data structures. This makes it difficult for general reporting tools to use parallel multi-database access directly to improve report performance.
The Runqian collection report has a built-in computing engine with parallel computing capabilities. It lets users read data from multiple databases simultaneously and summarize it on the report side, improving report performance.
For implementation details, refer to the collection report documentation. Here is a simple example of parallel multi-database querying, using MySQL as the example database.
A telecommunications company stores its user service usage information by statistical area, in 4 MySQL databases. The service usage statistics report needs to filter by conditions such as a specified time period and brand, and then summarize the data. The steps for running parallel multi-database queries with a collection report are as follows:
1. Use the computing engine built into the collection report to write a parallel script that fetches from multiple databases and then summarizes the results.
Parallel scripts:
[Image: Report5_performance_paralleldatabase_1.jpg, screenshot of the parallel script (http://s3.51cto.com/wyfs02/M02/58/49/wKioL1St-cODAcY4AAIyDxPsSZs680.jpg)]
The script above starts 4 threads that fetch from the 4 databases at the same time, then merges the results and summarizes them. The meaning of each cell is as follows:
A1: Specifies the 4 data source names; each parallel thread connects to a different database;
A2: Executes the code block under this cell with multiple threads; here 4 child threads are started;
B3: Each thread connects to its own data source;
B4: Sends the SQL to the specified data source, so that the data is summarized inside the database and the result is retrieved. At this point the 4 databases each execute their own SQL statement;
B5: Closes the database connection;
B6: Returns the child thread's result, and the child thread ends;
A7: Merges the results returned by the child threads;
A8: Summarizes the merged results again;
A9: Returns a result set for the report.
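For readers who cannot see the screenshot, the same flow can be sketched in ordinary Python. This is only an illustration of the steps A1 to A9 described above, not the collection report's own script syntax. It assumes in-memory SQLite databases as stand-ins for the 4 MySQL databases, and the table name service_usage, its columns, the area names and the sample rows are all hypothetical.

```python
import sqlite3
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data per area: (brand, usage_date, minutes).
# In the article each area lives in its own MySQL database (A1).
AREA_ROWS = {
    "area1": [("BrandA", "2015-01-03", 120), ("BrandB", "2015-01-04", 30)],
    "area2": [("BrandA", "2015-01-05", 80)],
    "area3": [("BrandB", "2015-01-06", 45), ("BrandA", "2014-12-30", 999)],
    "area4": [("BrandC", "2015-01-07", 10)],
}

# Each database summarizes its own data before anything is transferred (B4).
SQL = """
    SELECT brand, SUM(minutes)
    FROM service_usage
    WHERE usage_date BETWEEN ? AND ?
    GROUP BY brand
"""


def open_source(name):
    """Stand-in for connecting to one data source (B3): builds a small SQLite db."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE service_usage (brand TEXT, usage_date TEXT, minutes INTEGER)"
    )
    conn.executemany("INSERT INTO service_usage VALUES (?, ?, ?)", AREA_ROWS[name])
    return conn


def fetch_partial(name, start, end):
    """Runs in a child thread: connect, query, close, return the partial result."""
    conn = open_source(name)                                # B3
    try:
        return conn.execute(SQL, (start, end)).fetchall()   # B4
    finally:
        conn.close()                                        # B5


def parallel_summary(sources, start, end):
    # A2: one child thread per data source; B6: each returns its partial result.
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        partials = list(pool.map(lambda s: fetch_partial(s, start, end), sources))
    merged = [row for part in partials for row in part]     # A7: merge
    totals = defaultdict(int)
    for brand, minutes in merged:                           # A8: summarize again
        totals[brand] += minutes
    return sorted(totals.items())                           # A9: result set


if __name__ == "__main__":
    print(parallel_summary(list(AREA_ROWS), "2015-01-01", "2015-01-31"))
```

Each connection is opened, used and closed inside the same worker thread, which keeps the sketch within SQLite's default threading rules and mirrors B3 to B5, where every child thread owns its own connection.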
There are two key points in a parallel multi-database operation. The first is making multiple databases work in parallel (rows 2 to 6), which requires a simple parallel programming mechanism from the reporting engine. The second is summarizing the results of the parallel computation (rows 7 and 8): the results from the individual databases may contain duplicate data, so they have to be summarized again, which requires the report engine to have strong bulk data recalculation capabilities.
2. Call the script above from the collection report, and edit the report expressions to finish building the report.
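To make the second point concrete, here is a small Python sketch of the re-summarization step on its own. It assumes, purely for illustration, that each database returns rows of (brand, total minutes, record count); these field names and numbers are hypothetical. Sums and counts from different databases can simply be added per brand, while a value such as an average should be derived from the merged sum and count rather than by averaging the per-database averages.

```python
from collections import defaultdict


def merge_partials(partials):
    """Re-summarize per-database results of the form (brand, total, count)."""
    acc = defaultdict(lambda: [0, 0])
    for part in partials:
        for brand, total, count in part:
            acc[brand][0] += total   # sums add up across databases
            acc[brand][1] += count   # so do counts
    # Derive the average only after merging; skip brands with no records.
    return {
        brand: (total, count, total / count)
        for brand, (total, count) in acc.items()
        if count
    }


if __name__ == "__main__":
    # The same brand appears in two databases (hypothetical numbers).
    partials = [
        [("BrandA", 300, 3)],                     # per-database average: 100
        [("BrandA", 200, 1), ("BrandB", 50, 2)],  # per-database average: 200
    ]
    print(merge_partials(partials))
```

With these numbers the merged average for BrandA is 500 / 4 = 125, whereas averaging the two per-database averages would give 150.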
Parallel multi-database querying suits cases where the source data volume is large and is not reduced, and where report statistics are computed after the data has been stored. The example above opens one connection per database to fetch data. A parallel program can also open multiple connections and query over them simultaneously, which likewise improves query efficiency and report performance.
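One possible reading of the multiple-connections variant is several connections to a single database, each querying a different time segment. The Python sketch below illustrates that reading under stated assumptions: a temporary SQLite file stands in for the real database, and the table, columns, dates and values are hypothetical. The segments are half-open and do not overlap, so no row is counted twice.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def build_demo_db(path):
    """Create a small demo database file (stand-in for the real database)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE service_usage (usage_date TEXT, minutes INTEGER)")
    conn.executemany(
        "INSERT INTO service_usage VALUES (?, ?)",
        [("2015-01-05", 10), ("2015-01-20", 20), ("2015-02-03", 30), ("2015-03-15", 40)],
    )
    conn.commit()
    conn.close()


def sum_segment(path, start, end):
    """Open a separate connection and total one half-open segment [start, end)."""
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT COALESCE(SUM(minutes), 0) FROM service_usage "
            "WHERE usage_date >= ? AND usage_date < ?",
            (start, end),
        ).fetchone()
        return row[0]
    finally:
        conn.close()


def parallel_total(path, segments):
    # One worker thread, and therefore one connection, per time segment.
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        return sum(pool.map(lambda seg: sum_segment(path, *seg), segments))


def demo():
    segments = [
        ("2015-01-01", "2015-02-01"),
        ("2015-02-01", "2015-03-01"),
        ("2015-03-01", "2015-04-01"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "usage.db")
        build_demo_db(path)
        return parallel_total(path, segments)


if __name__ == "__main__":
    print(demo())
```

Whether this helps in practice depends on how well the database server handles concurrent queries; the sketch only shows the shape of the approach.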
This article is from the "High performance report data calculation" blog; please keep this source link: http://report5.blog.51cto.com/8028595/1600587
Using parallel multi-database queries to improve the performance of Runqian collection reports