Real-Time Report (T + 0) Solution Using the rundry Set Computing Report

Source: Internet
Author: User

In reporting projects, customers pay increasing attention to how current the source data is and want reports to show the latest data. The traditional combination of a report tool, a data warehouse, and ETL struggles to deliver this. Such reports usually show only the state as of yesterday, last week, or even last month, that is, T + 1, T + 7, or T + 30, collectively called T + n reports. T + 0 reports, which reflect real-time information, are hard to implement.

The reasons are as follows:

1. If both the historical data and the latest data are read from the customer's production system, the report can be T + 0, but the queries put pressure on the production database and affect the customer's business.

2. If a data warehouse is used, ETL needs a long "window" to extract data from the production database, generally after the customer's working hours and before the next morning. The latest data the customer can see is therefore T + 1.

3. In theory, a real-time report could read from both the historical database and the production database at the same time, but common report tools cannot perform cross-database computation, and other cross-database computing solutions are complex and difficult to implement.

Consider the T + 0 report solution provided by the computing report, which uses its hybrid data source capability to implement low-cost real-time reports. The approach is to store the large volume of historical data that no longer changes in data files, and to read only the small amount of new data from the production database. This keeps reports real-time while reducing the cost of storing historical data and the load that the report system places on the production database. The structures of the traditional T + n solution and the computing report T + 0 solution compare as follows:


In the rundry computing report structure, "Export (non-real-time)" refers to synchronizing new data from the production database to the historical data files during non-working hours (such as the evening). This is implemented with the command-line execution method provided by the rundry set calculator, together with the scheduled tasks of the operating system. For details, see the set calculator tutorial.
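The nightly export described above can be illustrated with a small Python sketch. This is not the set calculator's command-line method; it is a stand-in that shows the same idea under stated assumptions: a hypothetical `sales` table with `state`, `amount`, and `orderdate` columns, an in-memory SQLite database in place of db2, and a CSV history file in place of the set calculator's binary format.

```python
import csv
import io
import sqlite3


def export_day(conn, out, day):
    """Append all sales rows dated `day` to the history file; return the row count."""
    # Table and column names are hypothetical, chosen for illustration only.
    cur = conn.execute(
        "SELECT state, amount FROM sales WHERE orderdate = ? ORDER BY rowid",
        (day,),
    )
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    for row in cur:
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    # In-memory SQLite stands in for the production database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (state TEXT, amount REAL, orderdate TEXT)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("CA", 100.0, "2024-05-01"), ("TX", 7.0, "2024-05-02")],
    )
    history = io.StringIO()  # stands in for a history file opened in append mode
    exported = export_day(conn, history, "2024-05-01")
    print(exported, repr(history.getvalue()))
    conn.close()
```

In a real deployment, a script of this kind would be started after working hours by the operating system's scheduler (for example cron or Windows Task Scheduler), with the history file opened in append mode.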

Here, the "State sales statistical table" is used to illustrate how a T + 0 report is built with the computing report. The report is as follows:

The volume of historical sales data in the report is large and comes from a data file. To keep the report real-time, the small amount of data for the current day is read directly from the production database (db2).

The specific implementation steps are as follows:

Step 1: Write the script sales-state.dfx in the set calculator.

A1: Connect to the pre-configured production database (db2).

A2: Create a database cursor that uses a simple SQL statement to read sales data and salesperson data. In the WHERE clause, the condition days(current date) = days(orderdate) restricts the query to the new sales data of the current day.

A3: Create a file cursor for the pre-exported data file D:/files/sales.b. A file cursor reads a large data file in batches, which avoids memory overflow. The @b option reads the file in the binary format provided by the set calculator.

A4: Concatenate the database cursor (new data) and the file cursor (historical data) vertically.

A5: Use the groups function to group and aggregate the merged cursor.

A6: Sort by total sales in descending order.

A7: Close the db2 database connection.
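The logic of cells A1 to A7 can be approximated in Python as a rough sketch. It is not the sales-state.dfx script itself, whose contents are not shown here. The assumptions are: an in-memory SQLite database stands in for db2, a CSV stream stands in for the binary history file, the `sales` table and its `state`, `amount`, and `orderdate` columns are hypothetical, and the salesperson data mentioned in A2 is left out for brevity.

```python
import csv
import io
import itertools
import sqlite3
from collections import defaultdict
from datetime import date


def todays_rows(conn, today):
    """A2: stream only the current day's sales rows from the database."""
    cur = conn.execute(
        "SELECT state, amount FROM sales WHERE orderdate = ?", (today,)
    )
    for state, amount in cur:
        yield state, float(amount)


def history_rows(fileobj, batch_size=1000):
    """A3: read the exported history file in batches to bound memory use."""
    reader = csv.reader(fileobj)
    while True:
        batch = list(itertools.islice(reader, batch_size))
        if not batch:
            break
        for state, amount in batch:
            yield state, float(amount)


def sales_by_state(conn, history_file, today):
    # A4: concatenate the two streams (new rows, then historical rows).
    merged = itertools.chain(todays_rows(conn, today), history_rows(history_file))
    # A5: group by state and sum the amounts.
    totals = defaultdict(float)
    for state, amount in merged:
        totals[state] += amount
    # A6: sort by total sales, descending.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # A1: stand-in for the production connection (SQLite, not db2).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (state TEXT, amount REAL, orderdate TEXT)")
    today = date.today().isoformat()
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("CA", 100.0, today), ("NY", 50.0, today)],
    )
    history = io.StringIO("CA,1000\nNY,2000\nTX,500\n")
    try:
        print(sales_by_state(conn, history, today))
    finally:
        conn.close()  # A7: close the database connection.
```

Both sources are consumed as streams, so only the per-state totals are held in memory, which mirrors the reason the steps above use cursors rather than loading the history file whole.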

Step 2: Create a db2 data source and a dataset in the computing report:

Step 3: Design the report as follows:

For details about how to create a statistical chart in a report, see the computing report tutorial.

Note that the computing report also supports storing data in other ways, such as MongoDB, HDFS, or a traditional data warehouse. Newly added data in the production database can be exported using the set calculator or other ETL tools.

