A Simple and Easy join method for complex multi-data source reports to be computed after join

Source: Internet
Author: User

A Simple and Easy join method for complex multi-data source reports to be computed after join

Complex data sources are common issues in report development. For example, join operations are performed on different database tables before filtering and grouping and sorting. JasperReport/Birt and other reporting tools include virtual data source or table join, which can be used to perform multi-data source join post-computation to a certain extent, but it is not easy to master.

The collection and computing appliance has a structured and strong computing engine that supports diverse data sources and is easy to integrate. It can help report tools easily achieve such requirements. The following example shows the computation process after joining multiple data sources.

Sales is a table in the mysql database. It stores multiple orders from multiple salespeople every day. The field SellerId is the salesperson ID. Emp is a table in the mssql database and stores the salesperson information. The field EId is the salesperson ID, the Name is the salesperson Name, And the Dept is the department Name. The order number, date, amount, salesperson name, and department name must be displayed in the report, provided that the order date is in the last N days (such as 30 days) or the order belongs to several concerned departments (such as Markeding and Finance ).

Because the order number, date, and amount come from the sales table, and the salesperson name and department name come from the emp table, join operations must be performed between different databases, and conditional filtering must be performed after the join operation. Some source data is as follows:

Table sales


Table emp

Code of the Set calculator:

A1 = myDB1.query ("select * fromsales ")

This code queries the records of the sales table from the data source myDB1, and myDB1 points to the mysql database. Function query is used to execute SQL queries and can receive external parameters. The calculation result of A1 is as follows:

A2 = myDB2.query ("select * fromemp ")

This code queries the records of the emp table from the data source myDB2. myDB2 points to the mssql database.

A3 = A1.switch (SellerId, A2: EId)

The above code replaces the SellerId field in A1 with the corresponding record in A2, and the associated field is EId. The calculation result of A3 is as follows (the Blue font indicates that the data item contains lower-level members ):

When no corresponding record is found in A2, function switch retains records in A1 by default. The corresponding SellerId is displayed as null, and the result is similar to the left join. To establish an internal connection, use the option @ I, for example, A1.switch @ I (SellerId, A2: EId)

A4 = A3.select (OrderDate> = after (date (now (), days *-1) | depts. array (). pos (SellerId. Dept ))

The code above can filter the association results. There are two conditions, and the 1st condition is: The order date is in the last N days (corresponding to the parameter days ), the expression is OrderDate> = after (date (now (), days *-1 ). The 2nd condition is that the order belongs to several concerned departments (corresponding to the depts parameter), and the expression is depts. array (). pos (SellerId. Dept ). Operator | indicates the logical relationship "or ".

The function now can take the current time. The function date converts the current time to the date. The function after can calculate the relative time. For example, after ("",-30) indicates that the time is 30 days after, 2015-01-01. With different options, the function after can also calculate relative time in the unit of year, quarter, month, and second.

The Function array can convert a string into a set by separator, for example, "Marketing, Finance". array () equals to ["Marketing", "Finance"]. The default delimiter of the Function array is a comma. You can also specify other separators. The pos function can locate the position of a Member in the set. For example, ["Marketing", "Finance"]. pos ("Finance") is equal to 2, and the logical relationship is equal to true. If the member is not in the Set, null is returned, and the logical relationship is equal to false.

It is worth noting that SellerId. Dept indicates the Dept field of the record corresponding to the SellerId field. We can see that after the field is replaced by a switch, the association relationships between tables can be accessed using objects. This method is intuitive and simple, and is more obvious when multiple tables are associated with multiple layers.

Both Days and depts come from the report parameters. If you enter 30, "Marketing, Finance", the A4 result is as follows:

A5 = A4.new (OrderID, OrderDate, Amount, SellerId. Name: Name, SellerId. Dept: Dept)

The above code retrieves the fields required for the report from A4. SellerId. Name and SellerId. Dept indicate the employee Name and department Name in the emp table respectively, and the operator ":" indicates renaming. The calculation result of A5 is as follows:

So far, all the data required by the report is calculated. Finally, you only need to use result A5 to return the reports table in A5 to the report tool. The assembler provides a JDBC interface. The report tool recognizes the assembler as a common database. For the integration solution, see related documents.

 

Next, take JasperReport as an example to design a simple report, as shown in the following table:

Two report parameters pdays and pdeps need to be defined, which correspond to two parameters in the Set calculator respectively. After previewing, you can see the report results:


The method of the report call set calculator is the same as that of the call stored procedure. For example, if you save this script as afterjoin1.dfx, you can use afterJoin1 $ P {pdays} in the SQL designer of JasperReport }, $ P {pdepts.

 

With the assistance of esProc, the report tool can also process more complex data sources after join computing. For example, find the three fastest growing days for each salesperson after a certain date to show the salesperson's name, date of the three days, sales volume, and growth rate.

Code of the Set calculator:

A1 = myDB1.query ("select * from sales whereOrderDate >=? ", BeginDate)

The code above is used to query orders after a date in the table sales. beginDate is a parameter sent from the report. If the value is "", the calculation result of A1 is as follows:

A2 = myDB2.query ("select * from emp ")

The above code queries the emp table and the results are as follows:

A3 = A1.switch (SellerId, A2: EId)

The above code replaces the SellerId field in A1 with the corresponding record in A2. The result is as follows:

A4 = A3.group (SellerId)

The above code groups orders by SellerId. The left side is the A4 calculation result, and the right side is the order of two sellerids.

A5 = A4 .(~. Groups (OrderDate, SellerId; sum (Amount): subtotal ))

The above code groups each SellerId order by OrderDate and SellerId, and summarizes the order amount of each group, that is, the daily sales of each salesperson. For example:

In the above Code, "A4. ()" indicates that every A4 member is calculated cyclically, and "~" in parentheses Is a member variable, that is, the order record corresponding to an SellerId. "~. "Groups ()" indicates that the groups function is applied to each member. The function groups can group and summarize data. The function group can only group and not summarize data.

 

A6 = A5 .(~. Derive (subtotal-subtotal [-1])/subtotal [-1]: rate ))

The code above calculates the daily growth rate of each Salesperson's sales. The calculation result is as follows:

In the code above, the derive function is used to add a new field named rate, and the algorithm is "(sales of the current day-Sales of the previous day)/sales of the previous day ". As you can see, the cube operator uses subtotal [-1] to represent the sales volume of the previous day, which allows you to easily calculate relative positions.

It is worth noting that because the first record does not have the corresponding "sales of the previous day", the sales growth rate is Null.

 

A7 = A6 .(~. Select (#! = 1 ))

This Code removes the first record of each group of data based on A6 (because the growth rate of the first record is meaningless Null ).

The select function in the code can be queried. "#" indicates the cyclic sequence number, and "#! = 1 ", that is, the serial number is not equal to 1. The same function can also be implemented using the function delete, but the performance is slightly lower, because the function select only returns the reference, and delete needs to change the actual data.

 

A8 = A7 .(~. Top (-rate; 3 ))

The above code can calculate the three-day corresponding record with the maximum sales growth rate for each salesperson. Function Top can retrieve the first N records based on a field (or expression of several fields. The calculation result is as follows:


A9 = A8.union ()

The code above combines the data of each group in A8 to form a new lifecycle table A9, as shown below:


A10 = A9.new (SellerId. Name: Name, OrderDate, subtotal, rate)

The above code extracts the required fields from A9, that is, the final calculation result of this case.


Result A10

The preceding Code returns the "tables" table in A10 to the report tool. For more information about the design of reports, see the previous case.


Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.