Associated computing for improving performance using rundry computing reports

Last Update:2014-10-20 Source: Internet

Author: User

Tags intel core i5

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Data Association computing is often performed in reports during report development. To reduce the complexity of report preparation, you can place the association relationship in a visual report template, such as multiple data sources and heterogeneous data sources. Association in reports often results in low report efficiency, slow computing, and performance problems. A special data association method is provided to improve the report performance. Here we use a common multi-source associated sharded report instance to view the implementation process of the next set computing report:

Report description

The sales information table summarizes sales by time, region, sales personnel, products, and other dimensions. The report format is as follows:

Write computing scripts

First, write a script using the set calculator to complete data association and return the associated result set for the report:

A1: connect to the data source;

A2-A5: Execute SQL to take orders, products and other table data;

A6-A8: Use the switch to associate the multi-table data, and the association results are stored in the A2 lattice;

A9: Create a new sequence table based on the associated results. The result set is returned for the report through A10.

Prepare reports

After creating a report template in the Set Computing report designer, select "Set Computing" for the dataset. In the dataset editing window, specify the preceding DFX file to complete the dataset creation.

Set the report template expression:

Different from association in reports, in a report template, a set computing report directly creates a group report based on a result set returned by the Set Computing script, thus achieving higher performance, the following describes how to associate a report:

Association implementation in reports

Dataset

Ds1: Select customer. region, customer. city, order details. quantity, order details. discount, order details. unit price, order. employee ID, order. order Date, order details. product ID from order details, orders, customers where customers. customer ID = order. customer ID and order. order id = order details. order ID and order. the order date is not null.

DS2: Select category. Category ID, category. category name from category.

DS3: Select * from employee.

DS4: Select Product. Category ID, product. Product ID from product.

Report Template

Comparison results

In this example, the data volume of the source table is more than 0.4 million, and the same number of SQL statements are used, the following table compares the running time of the report Presentation by using the set computing report test 1 in the report Association 2 using the set computing script Association and passing the result to the report:

You can see the advantages of a set computing report in processing associated computing reports. Since a report can only be associated with a report, you can only use the traversal algorithm (search for associated sub-records for a single primary record ), therefore, the efficiency is not high. The set operator adopts a more efficient hash Association Scheme (all subrecords can be hash to the master record according to the corresponding code in advance, and the switch function in the Code uses the hash Association technology, the single-Computing Association time can be 5-10 times faster). Therefore, the performance is improved by more than doubled after the dataset is associated.

In addition, the cube is also very suitable for processing data associations between heterogeneous data sources, such as common multi-database, file, and database hybrid situations.

The following is the running log and test machine configuration.

[Appendix 1] Association in the running log report

[11:32:59]: [info]-start computing report, first take the number ......

[11:32:59]: [debug]-start the SQL statement below

[11:32:59]: [debug]-ds1 = select customer. region, customer. city, order details. quantity, order details. discount, order details. unit price, order. employee ID, order. order Date, order details. product ID from order details, orders, customers where customers. customer ID = order. customer ID and order. order id = order details. order ID and order. order date is not null

[11:33:35]: [debug]-start the SQL statement below

[11:33:35]: [debug]-ds2 = select category. Category ID, category. category name from category

[11:33:35]: [debug]-start the SQL statement below

[11:33:35]: [debug]-DS3 = select * from employee

[11:33:35]: [debug]-start the SQL statement below

[11:33:35]: [debug]-DS4 = select product. Category ID, product. Product ID from product

[11:33:35]: [info]-end of number fetch and start Operation

[11:34:58]: [info]-computing ended:

Association in DFX

[11:56:33]: [info]-start computing report, first take the number ......

[11:57:11]: [info]-end of number fetch and start Operation

[11:57:26]: [info]-computing ended:

[Appendix 2] test machine configuration

Test Model: Dell target 3420

CPU: Intel Core i5-3210M @ 2.50 GHz * 4

Ram: 4G

HDD: West digital WDC (500g 5400 rpm)

Operating System: Win7 (x64) SP1

JDK1.6

Database:Oracle11g r2

Report version5.0

Associated computing for improving performance using rundry computing reports

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More