Data Association computing is often performed in reports during report development. To reduce the complexity of report preparation, you can place the association relationship in a visual report template, such as multiple data sources and heterogeneous data sources. Association in reports often results in low report efficiency, slow computing, and performance problems. A special data association method is provided to improve the report performance. Here we use a common multi-source associated sharded report instance to view the implementation process of the next set computing report:
Report description
The sales information table summarizes sales by time, region, sales personnel, products, and other dimensions. The report format is as follows:
Write computing scripts
First, write a script using the set calculator to complete data association and return the associated result set for the report:
A1: connect to the data source;
A2-A5: Execute SQL to take orders, products and other table data;
A6-A8: Use the switch to associate the multi-table data, and the association results are stored in the A2 lattice;
A9: Create a new sequence table based on the associated results. The result set is returned for the report through A10.
Prepare reports
After creating a report template in the Set Computing report designer, select "Set Computing" for the dataset. In the dataset editing window, specify the preceding DFX file to complete the dataset creation.
Set the report template expression:
Different from association in reports, in a report template, a set computing report directly creates a group report based on a result set returned by the Set Computing script, thus achieving higher performance, the following describes how to associate a report:
Association implementation in reports
Dataset
Ds1: Select customer. region, customer. city, order details. quantity, order details. discount, order details. unit price, order. employee ID, order. order Date, order details. product ID from order details, orders, customers where customers. customer ID = order. customer ID and order. order id = order details. order ID and order. the order date is not null.
DS2: Select category. Category ID, category. category name from category.
DS3: Select * from employee.
DS4: Select Product. Category ID, product. Product ID from product.
Report Template
Comparison results
In this example, the data volume of the source table is more than 0.4 million, and the same number of SQL statements are used, the following table compares the running time of the report Presentation by using the set computing report test 1 in the report Association 2 using the set computing script Association and passing the result to the report:
You can see the advantages of a set computing report in processing associated computing reports. Since a report can only be associated with a report, you can only use the traversal algorithm (search for associated sub-records for a single primary record ), therefore, the efficiency is not high. The set operator adopts a more efficient hash Association Scheme (all subrecords can be hash to the master record according to the corresponding code in advance, and the switch function in the Code uses the hash Association technology, the single-Computing Association time can be 5-10 times faster). Therefore, the performance is improved by more than doubled after the dataset is associated.
In addition, the cube is also very suitable for processing data associations between heterogeneous data sources, such as common multi-database, file, and database hybrid situations.
The following is the running log and test machine configuration.
[Appendix 1] Association in the running log report
[11:32:59]: [info]-start computing report, first take the number ......
[11:32:59]: [debug]-start the SQL statement below
[11:32:59]: [debug]-ds1 = select customer. region, customer. city, order details. quantity, order details. discount, order details. unit price, order. employee ID, order. order Date, order details. product ID from order details, orders, customers where customers. customer ID = order. customer ID and order. order id = order details. order ID and order. order date is not null
[11:33:35]: [debug]-start the SQL statement below
[11:33:35]: [debug]-ds2 = select category. Category ID, category. category name from category
[11:33:35]: [debug]-start the SQL statement below
[11:33:35]: [debug]-DS3 = select * from employee
[11:33:35]: [debug]-start the SQL statement below
[11:33:35]: [debug]-DS4 = select product. Category ID, product. Product ID from product
[11:33:35]: [info]-end of number fetch and start Operation
[11:34:58]: [info]-computing ended:
Association in DFX
[11:56:33]: [info]-start computing report, first take the number ......
[11:57:11]: [info]-end of number fetch and start Operation
[11:57:26]: [info]-computing ended:
[Appendix 2] test machine configuration
Test Model: Dell target 3420
CPU: Intel Core i5-3210M @ 2.50 GHz * 4
Ram: 4G
HDD: West digital WDC (500g 5400 rpm)
Operating System: Win7 (x64) SP1
JDK1.6
Database:Oracle11g r2
Report version5.0
Associated computing for improving performance using rundry computing reports