Improve report system performance by moving data out of the database and into the file system

Source: Internet
Author: User

In report applications, historical data queries account for a large share of all reports. These reports have two characteristics: first, the data changes little, since the historical data being queried almost never changes; second, the data volume is large and grows as the time span increases. If the data always stays in the database, report performance drops sharply when a lot of data is involved or concurrency is high, because the JDBC performance of most databases is very low (fetching over JDBC requires converting the data into objects, which is about an order of magnitude slower than reading the same data from a file). If you move rarely changing historical data out of the database and store it in the file system, you will probably get much higher I/O performance than the database offers, which improves the overall performance of the report.

However, reports are not rendered directly from the raw data; further computation is required. Files themselves have no computing capability, and because the data volume in this scenario is usually large, the computation cannot be left to the report presentation side.

With the help of its built-in set-up engine, the collection report can compute directly on files stored outside the database. Supported file types include text, Excel, and JSON files, as well as a more efficient binary format. Moving large volumes of historical data out of the database does more than meet the performance requirements of historical query reports. With support for mixed data sources (file + database), it also enables real-time queries over large data volumes: the large body of historical data is read from the file system, the small amount of current-period real-time data is read from the database, and the two are computed together. This approach avoids the I/O bottleneck of the database, improves report performance, and widens the range of data that can be queried. It also optimizes the database itself: once the historical data is moved out, the database can focus on ensuring the consistency of the business system's data rather than spending resources on a large number of historical query tasks.

Specifically, once the data has been saved to the file system, it can be queried and computed through the collection report, whose built-in set-up engine computes on files (and databases). The following steps can serve as an example:

1. Export historical data from the database to a file

Users can export historical data to files by whatever method suits them; the collector can also be used for this. The data can be exported as text, or, for higher performance, in a more efficient binary file format that the collection report also supports (2-5 times faster than text). You can convert a text file to the binary format by running code like the following in the collector (a free version is available):

file("e:/Order details.b").export@b(file("e:/Order details.txt").cursor())
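The idea behind the conversion can also be illustrated outside the collector. The following Python sketch does not reproduce the collector's binary format; it is only an illustration under stated assumptions: a tab-delimited text file with a header row and the hypothetical columns `customer_id`, `quantity`, and `amount`, rewritten as fixed-width binary records so that later reads skip text parsing. The function names and the record layout are invented for this example.

```python
import csv
import struct

# Hypothetical record layout (an assumption for this sketch):
# 10-byte customer ID, 32-bit integer quantity, 64-bit float amount.
RECORD = struct.Struct("<10sid")

def text_to_binary(txt_path, bin_path):
    """Stream a tab-delimited text file into fixed-width binary records."""
    count = 0
    with open(txt_path, newline="", encoding="utf-8") as src, \
         open(bin_path, "wb") as dst:
        # DictReader yields one row at a time, so memory use stays constant.
        for row in csv.DictReader(src, delimiter="\t"):
            dst.write(RECORD.pack(row["customer_id"].encode("utf-8"),
                                  int(row["quantity"]),
                                  float(row["amount"])))
            count += 1
    return count

def read_binary(bin_path):
    """Yield (customer_id, quantity, amount) tuples, one record per read."""
    with open(bin_path, "rb") as src:
        while True:
            chunk = src.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break
            cid, qty, amount = RECORD.unpack(chunk)
            # Short IDs are padded with NUL bytes by struct; strip them.
            yield cid.rstrip(b"\x00").decode("utf-8"), qty, amount
```

Because every record has the same size, the reader needs no delimiter scanning or number parsing, which is the general reason binary formats tend to read faster than text.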

2. Read the data files using the set-up engine built into the collection report

Once the data is stored externally, building a report with the collection report works just like building one on a file data source. For example, to compute the order quantity and order amount per customer from the order detail data, the file is read in streaming fashion (with a file cursor), because the raw order data is very large.

The parameters used in the script and their meanings are as follows:


Script:

A1: Read the large source text file in streaming mode using a file cursor;

A2: Filter the data by the specified dimensions; the result is still a cursor;

A3: Group the filtered results by customer ID and total the order quantity and order amount;

A4: Return the result set to the report.
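The logic of steps A1 to A4 can be sketched in Python as follows. This is not the collection report's script, only an analogous illustration. It assumes a tab-delimited file with a header row and the hypothetical columns `customer_id`, `order_date` (in YYYY-MM-DD form), `quantity`, and `amount`, and it assumes the filter parameters are a start and end date, since the original parameter list is not available here.

```python
import csv
from collections import defaultdict

def summarize_orders(path, start_date, end_date):
    """Stream an order-detail file and total quantity and amount per customer."""
    totals = defaultdict(lambda: [0, 0.0])  # customer_id -> [quantity, amount]
    with open(path, newline="", encoding="utf-8") as f:
        # A1: the reader yields one row at a time; the file is never fully loaded.
        rows = csv.DictReader(f, delimiter="\t")
        # A2: a lazy filter, so the result is still a stream of rows.
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        selected = (r for r in rows
                    if start_date <= r["order_date"] <= end_date)
        # A3: group by customer ID and accumulate both measures.
        for r in selected:
            entry = totals[r["customer_id"]]
            entry[0] += int(r["quantity"])
            entry[1] += float(r["amount"])
    # A4: return a small result set that a report could consume.
    return [(cid, qty, amt) for cid, (qty, amt) in sorted(totals.items())]
```

Only the per-customer totals are held in memory, so the memory footprint depends on the number of customers rather than on the size of the order file.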

As mentioned earlier, the collection report can run query calculations on files alone (historical data), and it can also perform mixed file + database computation for real-time queries over large data volumes.

A1-A3: Same as in the previous script; summarize the historical data;

A5: Execute SQL with the specified parameters to summarize the current data;

A6: Merge the two sets of aggregated data (vertical concatenation);

A7: Aggregate again over the merged historical and current-period summaries to obtain the order quantity and order amount of each customer.
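Steps A5 to A7 can be illustrated with a similar Python sketch. It is again an analogy rather than the collection report's script. It assumes the historical summary is already available as a list of `(customer_id, quantity, amount)` tuples, and that the current data sits in a hypothetical `orders` table with the columns `customer_id`, `order_date`, `quantity`, and `amount`; SQLite stands in for the business database.

```python
import sqlite3
from collections import defaultdict

def summarize_current(conn, since):
    """A5: let the database summarize the small current-period data set."""
    sql = ("SELECT customer_id, SUM(quantity), SUM(amount) "
           "FROM orders WHERE order_date >= ? GROUP BY customer_id")
    return conn.execute(sql, (since,)).fetchall()

def merge_summaries(historical, current):
    """A6 + A7: concatenate two per-customer summaries, then aggregate again."""
    totals = defaultdict(lambda: [0, 0.0])
    # A6: vertical concatenation of the two result sets.
    for cid, qty, amount in list(historical) + list(current):
        # A7: a customer may appear in both sets, so the totals are re-summed.
        totals[cid][0] += qty
        totals[cid][1] += amount
    return [(cid, qty, amt) for cid, (qty, amt) in sorted(totals.items())]
```

The second aggregation works because sums are additive: summing the per-customer subtotals of each source gives the same result as summing all detail rows at once. Measures such as averages would need their counts carried along as well.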

3. Call the calculation script from the collection report and edit the report expressions to complete the report

As the process above shows, the collection report offers a good solution to the low performance that queries over historical (+ current) data often suffer from: storing the data externally improves the performance of the report.

