Data Association computing is often performed in reports during report development. To reduce the complexity of report preparation, you can place the association relationship in a visual report template, such as multiple data sources and heterogeneous data sources. Association in reports often results in low report efficiency, slow computing, and performance problems. A special data association method is provided to improve the report performance. Here we use a common multi-source associated sharded report instance to view the implementation process of the next set computing report:
Report description
The sales information table summarizes sales by time, region, sales personnel, products, and other dimensions. The report format is as follows:
650) This. width = 650; "src =" http://s3.51cto.com/wyfs02/M00/49/E4/wKiom1QeRZCBC6iTAAFLB1kRvW8993.jpg "style =" float: none; "Title =" 2014-09-21_0000608.jpg "alt =" wkiom1qerzcbc6itaaflb1krvw8993.jpg "/>
The implementation process is as follows.
Write computing scripts
First, use the set calculator to write a script to complete data association and return the associated result set for the report.
650) This. width = 650; "src =" http://s3.51cto.com/wyfs02/M00/49/E4/wKiom1QeRZKxrLuCAALtnSzZ_Mw053.jpg "Title =" 2014-09-21_112618.jpg "style =" float: none; "alt =" wkiom1qerzkxrlucaaltnszz_mw053.jpg "/>
A1 connects to the data source;
The A2-A5 executes the SQL to take the order, the product and so on table data;
The A6-A8 uses the switch to associate the multi-table data, and the association results are stored in the A2 lattice;
A9 creates a new sequence table based on the associated results, and the result set is returned for the report through A10.
Prepare reports
After creating a report template in the Set Computing report designer, select "Set Computing" for the dataset. In the dataset editing window, specify the preceding DFX file to complete the dataset creation.
Set the report template expression:
650) This. width = 650; "src =" http://s3.51cto.com/wyfs02/M01/49/E6/wKioL1QeRbDwmDJOAADBEwGS6Sg686.jpg "Title =" 2014-09-21_112626.jpg "style =" float: none; "alt =" wkiol1qerbdwmdjoaadbewgs6sg686.jpg "/>
Different from association in reports, in a report template, a set computing report directly creates a group report based on a result set returned by the Set Computing script, thus achieving higher performance, the following describes how to associate a report:
Association implementation in reports
Dataset
Ds1: Select customer. region, customer. city, order details. quantity, order details. discount, order details. unit price, order. employee ID, order. order Date, order details. product ID from order details, orders, customers where customers. customer ID = order. customer ID and order. order id = order details. order ID and order. order date is not null
DS2: Select category. Category ID, category. category name from category
DS3: Select * from employee
DS4: Select Product. Category ID, product. Product ID from product
Report Template
650) This. width = 650; "src =" http://s3.51cto.com/wyfs02/M01/49/E4/wKiom1QeRZPQbYHMAACsEuZkpuo065.jpg "Title =" 2014-09-21_112632.jpg "style =" float: none; "alt =" wkiom1qerzpqbyhmaacseuzkpuo065.jpg "/>
Comparison results:In this example, the data volume of the source table is more than 0.4 million, and the same number of SQL statements are used, the following table compares the running time of the report Presentation by using the set computing report test 1 in the report Association 2 using the set computing script Association and passing the result to the report:
650) This. width = 650; "src =" http://s3.51cto.com/wyfs02/M02/49/E6/wKioL1QeRbHjRI9fAAA1VvsNpJI195.jpg "Title =" 2014-09-21_112654.jpg "style =" float: none; "alt =" wkiol1qerbhjri9faaa1vvsnpji195.jpg "/>
You can see the advantages of a set computing report in processing associated computing reports. Since a report can only be associated with a report, you can only use the traversal algorithm (search for associated sub-records for a single primary record ), therefore, the efficiency is not high. The set operator adopts a more efficient hash Association Scheme (all subrecords can be hash to the master record according to the corresponding code in advance, and the switch function in the Code uses the hash Association technology, the single-Computing Association time can be 5-10 times faster). Therefore, the performance is improved by more than doubled after the dataset is associated.
In addition, the cube is also very suitable for processing data associations between heterogeneous data sources, such as common multi-database, file, and database hybrid situations.
Run log and test machine configuration.
[Appendix 1] Operation Log
Association in reports
[11:32:59]: [info]-start computing report, first take the number ......
[11:32:59]: [debug]-start the SQL statement below
[11:32:59]: [debug]-ds1 = select customer. region, customer. city, order details. quantity, order details. discount, order details. unit price, order. employee ID, order. order Date, order details. product ID from order details, orders, customers where customers. customer ID = order. customer ID and order. order id = order details. order ID and order. order date is not null
[11:33:35]: [debug]-start the SQL statement below
[11:33:35]: [debug]-ds2 = select category. Category ID, category. category name from category
[11:33:35]: [debug]-start the SQL statement below
[11:33:35]: [debug]-DS3 = select * from employee
[11:33:35]: [debug]-start the SQL statement below
[11:33:35]: [debug]-DS4 = select product. Category ID, product. Product ID from product
[11:33:35]: [info]-end of number fetch and start Operation
[11:34:58]: [info]-computing ended:
Association in DFX
[11:56:33]: [info]-start computing report, first take the number ......
[11:57:11]: [info]-end of number fetch and start Operation
[11:57:26]: [info]-computing ended:
[Appendix 2] test machine configuration
Test Model: Dell target 3420
CPU: Intel Core i5-3210M @ 2.50 GHz * 4
Ram: 4 GB
HDD: WDC (500g 5400 rpm)
Operating System: win7 (x64) SP1
JDK: 1.6
Database: oracle11g r2
Computing report version: 5.0
Associated computing for improving performance using rundry computing reports