Using parallel multi-database queries to improve Runqian set report performance

Last Update: 2015-01-08 Source: Internet

Author: User


When a report draws on a large volume of data, its performance is often poor, and optimizing the SQL or the report itself rarely helps much, because the bottleneck is the size of the source data. If the data is split across several databases according to some rule (such as time), the report can query those databases at the same time, let each one compute its own part, and then combine the partial results on the report side. This parallel multi-database approach can improve report performance.

General-purpose reporting tools cannot fetch in parallel and then summarize the results, so reading segmented data from multiple databases has to be done in a high-level language such as Java. Writing such parallel programs in Java is not easy, however, and Java lacks basic support for batch data computation: expression parameters and dynamic data structures are not supported natively. This makes it difficult for general-purpose reporting tools to use parallel multi-database access directly to improve report performance.

The Runqian set report has a built-in compute engine with parallel computing capabilities. It lets users read data from multiple databases simultaneously and summarize it on the report side, improving report performance.

For the details of the implementation, refer to the set report documentation. What follows is a simple example of parallel multi-database access, using MySQL.

A telecommunications company stores its user service usage data by statistical area, in 4 MySQL databases. The service usage statistics report needs to filter by conditions such as a specified time period and brand, and then summarize the data. The steps for running a parallel multi-database query with the set report are as follows:

1. Use the set report's built-in scripting to write a parallel script that fetches from multiple databases and then summarizes the results.

Parallel scripts:

[Figure: the parallel script, Report5_performance_paralleldatabase_1.jpg (http://s3.51cto.com/wyfs02/M02/58/49/wKioL1St-cODAcY4AAIyDxPsSZs680.jpg)]

The script above starts 4 threads that fetch from the 4 databases at the same time, and finally merges and summarizes the results. The cells have the following meanings:

A1: Specifies the 4 data source names; each parallel thread connects to a different database;

A2: Runs the code block under this cell with multiple threads; here 4 child threads are started;

B3: Each thread connects to its own data source;

B4: Sends the SQL to the specified data source, so that the data is summarized inside the database, and retrieves the result. At this point the 4 databases each execute their own SQL statement;

B5: Closes the database connection;

B6: Returns the child thread's result, and the child thread ends;

A7: Merges the results returned by the child threads;

A8: Summarizes the merged results again;

A9: Returns the result set to the report.

There are two keys to parallel multi-database processing. The first is making multiple databases work in parallel (rows 2-6 of the script), which requires a simple parallel programming mechanism in the reporting engine. The second is summarizing the results of the parallel computation (rows 7 and 8): the results from the individual databases may contain duplicate data, so they need to be summarized again, which requires strong batch data recomputation capability in the report engine.
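As a rough illustration of the same pattern outside the set report, the sketch below mirrors the cell-by-cell steps in Python. It is not the set report script: SQLite files stand in for the 4 MySQL databases, and the table `service_usage`, its columns, and all function names are hypothetical.

```python
import sqlite3
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical query: the table and column names are illustrative only.
SQL = """
SELECT brand, SUM(usage) AS total_usage
FROM service_usage
WHERE stat_date BETWEEN ? AND ?
GROUP BY brand
"""


def make_sample_db(path, rows):
    """Create a small stand-in database (SQLite here, MySQL in the article)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE service_usage (brand TEXT, stat_date TEXT, usage INTEGER)"
    )
    conn.executemany("INSERT INTO service_usage VALUES (?, ?, ?)", rows)
    conn.commit()
    conn.close()


def fetch_partial(db_path, start, end):
    """Rows B3-B6: connect, summarize inside the database, close, return."""
    conn = sqlite3.connect(db_path)  # B3: each thread opens its own connection
    try:
        return conn.execute(SQL, (start, end)).fetchall()  # B4
    finally:
        conn.close()  # B5


def parallel_summary(db_paths, start, end):
    """Rows A1-A9: one worker thread per data source, then merge and re-summarize."""
    # A2: start one child thread per data source
    with ThreadPoolExecutor(max_workers=len(db_paths)) as pool:
        partials = list(pool.map(lambda p: fetch_partial(p, start, end), db_paths))
    # A7 and A8: merge the partial results and summarize again, because the
    # same brand can appear in the result of more than one database
    totals = defaultdict(int)
    for rows in partials:
        for brand, total in rows:
            totals[brand] += total
    return sorted(totals.items())  # A9: result set handed to the report
```

Calling `parallel_summary(paths, "2015-01-01", "2015-01-31")` with four database paths returns one `(brand, total)` pair per brand. The grouping is pushed into each database so that only small partial results travel back to the caller, which is why the second summarizing pass is cheap.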

2. Call the script above from the set report, and edit the report expressions to finish building the report.

Parallel multi-database access suits cases where the source data volume is large and cannot be reduced, so the data is split across several databases before the report statistics are computed. The example above opens one connection per database to fetch data; a parallel program can also open several connections and query through them simultaneously. Either way, the report query runs more efficiently and report performance improves.
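One possible way to use several connections at once is to split the report's time period into sub-ranges and give each connection one of them. The sketch below shows only the splitting step, in Python; the function name and the even-split rule are assumptions made for illustration, not something the set report prescribes.

```python
from datetime import date, timedelta


def split_date_range(start, end, parts):
    """Split the inclusive range [start, end] into contiguous inclusive sub-ranges.

    Returns at most `parts` (start, end) pairs; earlier pairs get the extra
    days when the range does not divide evenly.
    """
    total_days = (end - start).days + 1
    if total_days <= 0 or parts <= 0:
        raise ValueError("need start <= end and parts >= 1")
    parts = min(parts, total_days)  # never create an empty sub-range
    base, extra = divmod(total_days, parts)
    ranges = []
    cursor = start
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        seg_end = cursor + timedelta(days=length - 1)
        ranges.append((cursor, seg_end))
        cursor = seg_end + timedelta(days=1)
    return ranges
```

Each sub-range would then be queried on its own connection, and the partial results summarized again in the same way as rows 7 and 8 of the script.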

This article is from the "High performance report data calculation" blog; please keep this source link when reposting: http://report5.blog.51cto.com/8028595/1600587


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
