The parallel fetching of performance of run-dry set calculation report

Last Update:2014-12-30 Source: Internet

Author: User

Tags intel core i5

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

sometimes a report needs to fetch more data from the database when it is presented or exported, and the speed of JDBC is always slow, it is possible to exceed the time-consuming of other operations in the report, resulting in a very low efficiency of the entire report generation process. Below we see how to improve the performance of JDBC by using the parallel computer system of the run-dry aggregate report .

The so-called parallel fetch refers to the use of multi-threading technology in the Report tool to establish multiple connections with the database, while reading a copy of the source data, which need to fragment the source data, each thread (database connection) read one of the content, and finally the results of all the threads to merge the total target data process.

the built-in parallelism mechanism makes it easy to perform parallel fetch tasks to improve report performance, where Oracle illustrates the implementation process.

Report Description

The User Status table shows detail data, which must be read all at once, due to the need to export. The amount of data in single table is $ million, and the report style is:

650) this.width=650; "src=" http://s3.51cto.com/wyfs02/M01/57/AE/wKioL1SiVETxLPr0AAA5fJBk9Oo863.jpg "style=" float: none; "title=" report_improve_performance_parallel_1.jpg "alt=" Wkiol1sivetxlpr0aaa5fjbk9oo863.jpg "/>

Writing scripts

Use the Collector script (parallel.dfx) to implement parallel fetch logic. First determine the Segment field, where the numeric userid field is selected , the number of threads to set and the maximum userid value (specified as a large numeric constant) to calculate the range of the userid in each paragraph, and finally, based on the specified data range SQL fragment reads the data and completes the multi-threaded fetch process. The following is the implementation script:

650) this.width=650; "src=" http://s3.51cto.com/wyfs02/M01/57/B1/wKiom1SiU5ChxFXRAAGPz-ZUALQ335.jpg "style=" float: none; "title=" report_improve_performance_parallel_2.jpg "alt=" Wkiom1siu5chxfxraagpz-zualq335.jpg "/>

A5 -fork uses multi-threaded execution of the code block in the grid to achieve parallel fetching;

B6-b9 completes the parallel fetch number, each thread runs the result to return the A8 lattice;

A10 returns the merged result set for the report.

Preparing reports

After you create a new report template, the dataset selects the " collector ", which specifies the edited DFX filein the DataSet editing window to complete the dataset creation.

To set a report template expression:

650) this.width=650; "src=" http://s3.51cto.com/wyfs02/M02/57/AE/wKioL1SiVETj0bABAABtOdvOsKg398.jpg "style=" float: none; "title=" report_improve_performance_parallel_3.jpg "alt=" Wkiol1sivetj0babaabtodvoskg398.jpg "/>

A2: Use the Select function to take a list of user login information According to the data set DS1;

A2-G2: According to the A2 extension, the user ID, account, and online information are taken by the value expression .

Parallel Post Effects

after the report is run, the data set calculation time is recorded, and the results are compared before and after the parallel operation:

650) this.width=650; "src=" http://s3.51cto.com/wyfs02/M02/57/B1/wKiom1SiU5CjB6jgAABJ1JILVGU722.jpg "style=" float: none; "title=" report_improve_performance_parallel_4.jpg "alt=" Wkiom1siu5cjb6jgaabj1jilvgu722.jpg "/>

The above see the effect of parallel fetching, through the parallel mechanism of the collection report can accelerate the process of fetching, thereby improving the overall report performance, test results vary depending on the test environment, the native configuration is described in the appendix.

Using multi-threaded parallel fetching is a good way to make full use of database resources in situations where database resources are relatively idle, such as when the number of connections is not up to the line. If the database task is already saturated, this approach can further aggravate the database burden without increasing the speed.

In addition, the selection of segmented fields has a greater impact on performance. It is preferable to choose an indexed field that is inserted incrementally, such as a serial number as the primary key, so that the database can take full advantage of the index to reduce the traversal range, and to select fields of numeric type, which are more efficient and easy to split. In the actual operation to consider these two factors, in this case, because there is no index and primary key, so selected a numeric userid. In real business, it is often possible to find indexed numeric fields that are incrementally inserted, and the performance gains obtained by parallel fetching are also better.

"attached" test machine configuration

Test Models : Dell Inspiron 3420

CPU : Intel Core i5-3210m @2.50ghz

RAM :4G

HDD : West number WDC (500G 5400 ext ./ min )

Operating System : Win7 (X64) SP1

JDK :1.6

database:oracle11g R2

Collection Report Version :5.0

This article is from the High performance report data calculation blog, so be sure to keep this source http://report5.blog.51cto.com/8028595/1597798

The parallel fetching of performance of run-dry set calculation report

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More