Store data outside the database, in the file system, to improve report system performance

Source: Internet
Author: User


In report applications, a large proportion of reports query historical data. Such reports have two characteristics. First, the data rarely changes: once written, historical data is almost never updated. Second, the data volume is large and grows with the time span. If this data always stays in the database, reports are limited by JDBC, whose performance is low for most databases (rows must be converted into objects during the JDBC fetch, which can be an order of magnitude slower than reading the same data from a file). When the data volume or the concurrency is high, report performance drops sharply. If the rarely changing historical data is moved out of the database and stored in the file system, I/O performance can be much higher than that of the database, which improves overall report performance.

However, reports rarely present raw data directly; further computation is required. Files themselves have no computing capability, and because the data volume is usually large, the computation cannot be left to the report presentation layer.

With its built-in set computing engine, the Set Computing report can compute directly on files outside the database. Supported file types include text, Excel, and JSON files, as well as more efficient binary files. Moving large amounts of historical data out of the database satisfies the performance requirements of historical query reports. The Set Computing report also supports hybrid data sources (files + database) for real-time queries over large data volumes: historical data is read from the file system, real-time data is read from the database, and the two are computed together. This approach avoids the database I/O bottleneck, improves report performance, and widens the range of data that can be queried. It also optimizes the database: with historical data removed, the database can focus on ensuring data consistency for the business system instead of spending resources on a large number of historical queries.

Once data is saved to the file system, the Set Computing report can query and compute it, because its built-in set computing engine can compute on files (and databases). The following steps walk through an example:

1. Export historical data from the database to a file

You can choose any suitable method to export historical data to files. The set calculator can also do this by exporting the data to text. If you want higher performance, the Set Computing report also supports a more efficient binary file format (2-5 times faster than text). Run code similar to the following in the set calculator (the free version will do) to convert a text file to the binary format:

file("E:/order details.b").export@b(file("E:/order details.txt").cursor())
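For readers who do not have the set calculator at hand, the idea behind this conversion can be sketched in Python. This is only an illustration of streaming a text file into a compact binary form. The engine's actual binary format is not described here, so the fixed-width record layout, the column names (`order_id`, `customer_id`, `amount`), and the function names are all hypothetical.

```python
import csv
import os
import struct
import tempfile

# Hypothetical fixed-width record: two 32-bit ints and one 64-bit float (16 bytes).
RECORD = struct.Struct("<iid")


def text_to_binary(text_path, binary_path):
    """Stream rows from a tab-separated text file into fixed-width binary records."""
    count = 0
    with open(text_path, newline="", encoding="utf-8") as src, \
            open(binary_path, "wb") as dst:
        for row in csv.DictReader(src, delimiter="\t"):
            dst.write(RECORD.pack(int(row["order_id"]),
                                  int(row["customer_id"]),
                                  float(row["amount"])))
            count += 1
    return count


def read_binary(binary_path):
    """Yield (order_id, customer_id, amount) tuples one record at a time."""
    with open(binary_path, "rb") as src:
        while True:
            chunk = src.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break
            yield RECORD.unpack(chunk)


def demo():
    """Write a tiny sample text file, convert it, and read the records back."""
    rows = ("order_id\tcustomer_id\tamount\n"
            "1\t101\t250.0\n2\t102\t99.5\n3\t101\t10.25\n")
    with tempfile.TemporaryDirectory() as tmp:
        text_path = os.path.join(tmp, "orders.txt")
        binary_path = os.path.join(tmp, "orders.bin")
        with open(text_path, "w", encoding="utf-8", newline="") as f:
            f.write(rows)
        written = text_to_binary(text_path, binary_path)
        return written, list(read_binary(binary_path))
```

Reading the binary file skips text parsing and type conversion, which is the general reason a binary format tends to be faster than text; the 2-5x figure above is the article's claim for the product's own format, not a measurement of this sketch.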

 

2. Read data files using the built-in set computing engine of the Set Computing report

Once the data is external, using it in the Set Computing report is like using a file data source for the report. For example, the order details are summarized by customer to calculate the order quantity and order amount. Because the raw order data is very large, the file is read gradually as a stream (file cursor).

The parameters used in the script and their meanings are as follows:


Script:

A1: Reads the large source text file as a stream through a file cursor;

A2: Filters the data on the specified dimensions; the result is still a cursor;

A3: Summarizes the order quantity and order amount by customer ID from the filtered results;

A4: Returns the result set to the report.
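The A1-A4 flow described above can be sketched in Python with generators standing in for the file cursor. This is not the report engine's script: the tab-separated layout, the column names, and the date-range filter are assumptions chosen for illustration, since the actual script parameters are not shown.

```python
import csv
import io
from collections import defaultdict


def read_orders(stream):
    """Yield one order at a time (the role of the file cursor in A1)."""
    for row in csv.DictReader(stream, delimiter="\t"):
        yield {
            "customer_id": row["customer_id"],
            "order_date": row["order_date"],
            "amount": float(row["amount"]),
        }


def filter_orders(orders, start_date, end_date):
    """Lazily keep orders inside a date range (the role of A2).

    The date range is a hypothetical filter; ISO dates compare correctly as strings.
    """
    for order in orders:
        if start_date <= order["order_date"] <= end_date:
            yield order


def summarize_by_customer(orders):
    """Count orders and sum amounts per customer ID (the role of A3)."""
    totals = defaultdict(lambda: [0, 0.0])
    for order in orders:
        totals[order["customer_id"]][0] += 1
        totals[order["customer_id"]][1] += order["amount"]
    # The returned list plays the role of the result set in A4.
    return [(cid, n, amt) for cid, (n, amt) in sorted(totals.items())]


# Small in-memory sample standing in for a large order details file.
SAMPLE = (
    "customer_id\torder_date\tamount\n"
    "C1\t2013-01-05\t100.0\n"
    "C2\t2013-02-10\t50.0\n"
    "C1\t2013-03-01\t25.5\n"
    "C1\t2014-01-01\t999.0\n"
)

summary = summarize_by_customer(
    filter_orders(read_orders(io.StringIO(SAMPLE)), "2013-01-01", "2013-12-31")
)
```

Only the per-customer totals are held in memory; the rows themselves pass through one at a time, which is the point of reading through a cursor instead of loading the whole file.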

As mentioned above, the Set Computing report can query and compute files alone (historical data), and it can also perform hybrid file + database computation to query large data volumes in real time.

A1-A3: Summarize the historical data as in the previous script;

A5: Executes the SQL statement with the specified parameters to summarize the current data;

A6: Merges the two summaries (concatenated vertically);

A7: Summarizes the order quantity and order amount of each customer from the merged historical and current summaries.
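The hybrid steps A5-A7 can be sketched in Python as follows. An in-memory SQLite database stands in for the production database, and the table name, column names, cutoff date, and the hard-coded file-side summary are all assumptions made for illustration; the actual script and SQL are not shown here.

```python
import sqlite3
from collections import defaultdict


def summarize_current(conn, since):
    """Let the database aggregate the current rows (the role of A5)."""
    sql = ("SELECT customer_id, COUNT(*), SUM(amount) FROM orders "
           "WHERE order_date >= ? GROUP BY customer_id")
    return [tuple(row) for row in conn.execute(sql, (since,))]


def merge_summaries(history, current):
    """Concatenate both partial summaries (A6) and re-aggregate per customer (A7)."""
    totals = defaultdict(lambda: [0, 0.0])
    for cid, n, amt in list(history) + list(current):
        totals[cid][0] += n      # partial counts are summed, not recounted
        totals[cid][1] += amt
    return [(cid, n, amt) for cid, (n, amt) in sorted(totals.items())]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer_id TEXT, order_date TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
        ("C1", "2014-01-03", 10.0),
        ("C3", "2014-01-04", 40.0),
        ("C1", "2013-12-30", 5.0),   # older than the cutoff, so excluded
    ])
    # Hypothetical summary already computed from the historical file.
    history = [("C1", 2, 125.5), ("C2", 1, 50.0)]
    current = summarize_current(conn, "2014-01-01")
    conn.close()
    return merge_summaries(history, current)
```

Because each side is aggregated before the merge, the final step only combines two small summaries, and the database is queried for current data alone.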

 

3. Call the Set Computing script in the Set Computing report and edit the report expression to complete report creation.

 

The process above shows that the Set Computing report can solve the problem of poor performance when querying historical (+ current) data, improving report performance by storing data externally.

