A simple method for calculating complex multi-data source report after join

Source: Internet
Author: User

Complex data sources are common problems in report development, such as the join operation of different database tables, and the subsequent filtering grouping sorting. Jasperreport/birt and other reporting tools, such as virtual data source or table join, can be used to achieve a certain degree of multi-data source join calculation, but it is not easy to grasp.

With a structured and strong computing engine that supports a variety of data sources, the integration is simple and can assist reporting tools to easily implement such requirements, an example is given to illustrate the process of calculating a multi-data source join.

Sales is a table in the MySQL database that stores multiple orders per day for several salespeople, where field Sellerid is the salesperson number. EMP is a table in the MSSQL database that stores the salesperson information, where the field Eid is the salesperson number, name is the salesperson's name, and dept is the department name. Now it needs to be shown in the report: Order number, date, amount, salesperson's name, department name, if the order date is in the last n days (say 30 days) or the order belongs to a number of interested departments (such as Markeding and finance).

Because the order number, date, and amount are from the table sales, and the salesperson's name and department name are from the table EMP, the join operation between the different databases is also conditional filtering after the join. Some of the source data are as follows:

Table Sales


Table emp

The Collector code:

A1=mydb1.query ("SELECT * fromsales")

This code queries the record of the sales table from the data source myDB1, myDB1 points to the MySQL database. The function query is used to execute an SQL query that can receive external parameters. The A1 calculation results are as follows:

A2=mydb2.query ("SELECT * fromemp")

This code from the data source myDB2 query out the EMP table records, MyDB2 point to the MSSQL database.

A3=a1.switch (Sellerid,a2:eid)

The code above replaces the Sellerid field in A1 with the corresponding record in A2, which is associated with an Eid. A3 evaluates to the following (the blue font indicates that the data item contains subordinate members):

When the corresponding record is not found in the A2, the function switch retains the record in A1 by default, and the corresponding Sellerid appears empty, and the effect is similar to the left connection. If you want to make an internal connection, you should use the option @i, such as:[email protected] (Sellerid,a2:eid)

A4=a3.select (Orderdate>=after (now ()), days*-1) | | Depts.array (). POS (sellerid.dept))

The above code can filter the results of the association, there are 2 conditions, the 1th condition is: The order date in the last n days (corresponding to the parameter day), the expression is orderdate>=after (date (now ()), days*-1). The 2nd condition is that the order belongs to a number of concerned departments (corresponding to parameter depts), and the expression is Depts.array (). POS (sellerid.dept). operator | | Represents a logical relationship "or".

The function now can take the current time, the function date converts the current time to a date, and the function after can calculate the relative time, such as after ("2015-01-30", -30) means the time back 30 days, that is, 2015-01-01. With different options, the function after can also calculate relative time in years, seasons, months, and seconds.

The function array can change the string into a collection by delimiter, such as "Marketing,finance". Array () equals ["Marketing", "Finance"]. The default delimiter for the function array is a comma, or you can specify a different delimiter. Function Pos can find out where members are in the collection, such as ["Marketing", "Finance"].pos ("Finance") equals 2, and is equal to true in a logical relationship. Returns NULL if the member is not in the collection, equal to False in the logical relationship.

It is worth noting that this usage of sellerid.dept, which represents the Dept field of the Sellerid field corresponding to the record. As you can see, when you replace a field with a switch, the relationship between the tables can be accessed in the same way as the object, which is straightforward and more noticeable when multiple-table multi-level associations are used.

Days and depts are from the parameters of the report, if you enter 30, "Marketing,finance", then the results of A4 are as follows:

A5=a4.new (orderid,orderdate,amount,sellerid.name:name,sellerid.dept:dept)

The code above obtains the fields required by the report from A4, where Sellerid.name and sellerid.dept respectively represent the employee name and department name in the EMP table, and the operator ":" represents the rename. The A5 calculation results are as follows:

So far, the data needed for the report is all calculated. Finally, just use result A5 to return the two-dimensional table in A5 to the report tool. The collector provides a JDBC interface, and the report tool recognizes the collector as a normal database, and the integration scheme is referenced in the relevant documentation.

Next, take Jasperreport as an example to design a simple report that looks like this:

You need to define two report parameters Pdays, Pdeps, respectively, corresponding to the two parameters in the collector. You can see the results of the report after previewing:


The report calls the collector the same way you call a stored procedure, such as saving this script as AFTERJOIN1.DFX, you can use afterJoin1 $P {pdays} in the Jasperreport SQL designer, $P {pdepts} To invoke.

With the help of Esproc, the report tool can also handle more complex multi-data sources after the join calculation. For example: Find the sales of each salesperson after a certain date the fastest increase of three days, showing the salesperson name, the three days of the date, sales, growth rate.

The Collector code:

a1=mydb1.query ("select * from Sales whereorderdate>=?", Begindate)

The above code is used to query the order after a date in the table sales, where begindate is the parameter passed from the report, assuming the value is "2015-01-01", then the A1 evaluates as follows:

a2=mydb2.query ("SELECT * from emp")

The above code queries the EMP table with the following results:

A3=a1.switch (Sellerid,a2:eid)

The code above replaces the Sellerid field in A1 with the corresponding record in A2, with the following results:

A4=a3.group (Sellerid)

The above code groups orders by Sellerid. The left side is the result of the A4 calculation, and the right side is the order of two Sellerid.

a5=a4. (~.groups (Orderdate,sellerid;sum (Amount): Subtotal))

The above code groups each Sellerid order in OrderDate and Sellerid, and summarizes the order amounts for each group, which is the sales per day for each salesperson. Such as:

In the code above, "A4. () "means that each member of the A4 is cycled, and the" ~ "in parentheses is a member variable, which is the order record for a sellerid. "~.groups ()" means that the function groups is applied to each member. Function groups can group data and make a simple summary, the function group can only group not summarize.

A6=a5. (~.derive ((subtotal-subtotal[-1))/subtotal[-1]:rate))

The above code calculates the daily growth rate of sales for each salesperson, and calculates the following results:

In the code above, the function derive is used to add a new field, with the field named rate, with the algorithm "(current sales-sales on the previous day)/sales for the previous date". As you can see, the Collector Subtotal[-1] represents the sales for the previous day and can be easily calculated relative to the position.

It is worth noting that the sales growth rate is null because the first record does not have a corresponding "Sales for the previous day".

A7=a6. (~.select (#!=1))

This code removes the first record of each group of data based on A6 (because the growth rate of the first record is meaningless null).

The function in the code select can be queried, "#" for The Loop ordinal, "#!=1" that is, the ordinal is not equal to 1. The same functionality can be implemented with function delete, but the performance is slightly lower because the function select only returns the reference, and delete needs to change the actual data.

A8=a7. (~.top (-rate;3))

The code above calculates the record for each salesperson's three-day maximum sales growth. The top of the function can take the top N records based on a field (or an expression of a few fields). The calculation results are as follows:


a9=a8.union ()

The above code merges the data from the A8 to form a new two-dimensional table A9, as follows:


a10=a9.new (sellerid.name:name,orderdate,subtotal,rate)

The code above removes the required fields from A9, which is the result of the final calculation of this case.


Result A10

The code above returns the two-dimensional table in A10 to the report tool. The design of the report can refer to the previous case, which is not mentioned here.


An easy way to calculate a complex multi-data source report after join

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.