Text Files of diverse data sources

Source: Internet
Author: User


Diverse data sources are becoming more and more common in report development. Set Computing reports provide effective support for diverse data sources, which makes developing such reports much simpler. Currently, in addition to traditional relational databases, the supported data source types include TXT text, Excel, JSON, HTTP, Hadoop, and MongoDB.

Here we use two examples to introduce the use of the Set Computing report.

Text data source

When a report is created, a Set Computing report can process small text files and large text files in different ways.

Small text file

Report description

Stock transaction records are stored in one file per month, named stock_record_yyyymm.txt (for example, stock_record_200901.txt). Each record contains the stock code, trading date, and closing price. The report queries the closing prices of one or more stocks on a specified date so that the price trend can be analyzed. The text content is as follows:

Code    tradingDate          price
120089  2009-01-01 00:00:00  50.24
120123  2009-01-01 00:00:00  10.35
120136  2009-01-01 00:00:00  43.37
120141  2009-01-01 00:00:00  41.86
120170  2009-01-01 00:00:00  194.63

The report style is as follows:

[Figure: screenshot of the report layout]

Because the price records for any one day exist in only one file (files are stored by month) and a single file is not large, the file can be loaded into memory in one pass to complete the query. Files that can be read into memory in one pass for computing are called small text files here. The specific implementation is as follows:

Write computing scripts

Use the set computing editor to write a script (p1.dfx) that reads the file, filters the data, and returns the result set to the report. To receive the parameters passed in from the report, set the script parameters first.


Edit the script content (the result of running each cell is shown on the right):

A1: Import the one file selected by the date parameter. f.import() is used here to read the text data into memory in one pass, so that all computing is done in memory. This is the usual way to process small files;

A2: Query the transaction records for the specified date and stock codes;

A3: Return the result set to the report.
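The three steps above can be sketched outside the tool as well. The following Python sketch is only an analogy for the small-file approach, not the dfx script itself: it reads one whole monthly file into memory and then filters it. The function names, the `data_dir` argument, and the assumption that columns are tab-separated with a single header row are all hypothetical and not taken from the original article.

```python
import csv
from datetime import datetime
from pathlib import Path


def load_month_file(data_dir, day):
    """Read the entire monthly file that covers `day` into a list of rows."""
    path = Path(data_dir) / f"stock_record_{day:%Y%m}.txt"
    with open(path, newline="") as fh:
        reader = csv.reader(fh, delimiter="\t")  # assumption: tab-separated columns
        next(reader, None)  # skip the header row
        return [row for row in reader if row]  # ignore blank lines


def query_closing_prices(data_dir, day, codes):
    """Return (code, trading_date, price) tuples for the given day and stock codes."""
    wanted = set(codes)
    result = []
    for code, traded, price in load_month_file(data_dir, day):
        traded_on = datetime.strptime(traded, "%Y-%m-%d %H:%M:%S").date()
        if traded_on == day and code in wanted:
            result.append((code, traded_on, float(price)))
    return result
```

A call such as `query_closing_prices("data", date(2009, 1, 1), ["120089", "120123"])` would load stock_record_200901.txt from a hypothetical `data` directory and return the matching rows. Because the whole file is held in a list, this only suits files that fit comfortably in memory.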

Edit a Report Template

Use the report designer to create a report template and set the parameters:

Set the dataset: choose the set computing dataset type and call the script file edited above (p1.dfx).

The dfx file path can be either absolute or relative; a relative path is resolved against the dfx home directory configured in the options.

Edit the report expressions, directly using the result set returned by the set computing script. The report does not filter the result set any further, and report creation is complete.

As the implementation above shows, a set computing script makes it easy to read and compute on text files. The external script also has a visual editing and debugging environment, and an edited script can be reused (called by other reports or programs). However, if the script has already been debugged and does not need to be reused, keeping the two files (the set computing script and the report template) consistent is troublesome. In that case, it is easier to use the script dataset of the Set Computing report directly.

In the script dataset, you can write a script step by step to complete the computing task, using the same syntax as the set computing editor. You can also directly use the data sources (not used in this example) and parameters defined in the report. The script dataset therefore replaces the set computing dataset described in the previous section; the report parameters, expressions, and other parts are exactly the same as before and are not repeated here:

Directly use the date and code parameters defined in the report.

Large text file

Besides small text files, a Set Computing report can also use large text files (files that cannot be read into memory in one pass) as a report data source. Unlike small files, large files are handled with the external storage computing method. Again, an example is used to illustrate.

Modify the preceding report requirement: query the transaction records of certain stocks over a specified period of time. Because the period can be long or short, many files may need to be read, and they cannot all be loaded into memory at once. The external storage computing method for large text is needed instead. The specific implementation is as follows:

Write computing scripts

Set script parameters.

Edit the script content (the result of running each cell is shown on the right).

A1: Calculate the months to be queried based on the date range, to determine which files are used;

A2: Loop over all the months, use f.cursor() to create a file cursor for each, and use cs.conj@x() to combine the cursors into one. Unlike f.import(), which reads everything into memory in one pass, a file cursor is only a reference to the external storage file and does not actually read any data;

A3: Filter based on the parameters; the result is still a cursor;

A4: Use cs.fetch() to retrieve the result from the cursor and return it to the report.
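The cursor-based steps above have a close analogy in lazy iteration. The Python sketch below is an illustration of the idea only, not the dfx script: generators stand in for file cursors, `itertools.chain` stands in for combining cursors, and nothing is read from disk until rows are fetched. All function names, the tab-separated layout with a header row, and the assumption that every monthly file in the range exists are hypothetical.

```python
import csv
from datetime import datetime
from itertools import chain, islice
from pathlib import Path


def months_between(start, end):
    """Yield (year, month) for every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def file_cursor(path):
    """Lazily yield (code, trading_date, price); the file is opened only on iteration."""
    with open(path, newline="") as fh:
        reader = csv.reader(fh, delimiter="\t")  # assumption: tab-separated columns
        next(reader, None)  # skip the header row
        for row in reader:
            if not row:
                continue  # ignore blank lines
            code, traded, price = row
            yield code, datetime.strptime(traded, "%Y-%m-%d %H:%M:%S").date(), float(price)


def query_range(data_dir, start, end, codes):
    """Chain one lazy cursor per month and filter by date range and stock code."""
    wanted = set(codes)
    cursors = (
        file_cursor(Path(data_dir) / f"stock_record_{y}{m:02d}.txt")
        for y, m in months_between(start, end)
    )
    merged = chain.from_iterable(cursors)
    return (row for row in merged if start <= row[1] <= end and row[0] in wanted)


def fetch(cursor, n=None):
    """Pull all rows, or at most n rows, from a cursor into a list."""
    return list(cursor) if n is None else list(islice(cursor, n))
```

Here `query_range` returns immediately without touching any file; rows are read one at a time only when `fetch` consumes the cursor, so memory use depends on the size of the filtered result rather than on the total size of the files.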

Edit a Report Template

Set Report parameters.

Set a dataset.

Edit a report expression.

The preceding steps read and compute on large files. By providing separate in-memory and external storage processing methods, Set Computing reports meet different report requirements and simplify report development on text file data sources.

