Hadoop as a report data source

Source: Internet
Author: User

In addition to traditional relational databases, the collection report supports data source types such as txt text files, Excel, JSON, HTTP, Hadoop, MongoDB, and so on.

For Hadoop, the collection report can access Hive directly, and can also read data from HDFS to perform calculations and build reports. Accessing Hive works the same way as accessing an ordinary database over JDBC, so it is not covered here. The following example shows the process of accessing HDFS directly.

Report Description

Stock transaction records are stored as text files in HDFS by month, named Stock_record_yyyymm.txt (for example, Stock_record_200901.txt). Each record includes the stock code, the trading date, and the closing price. The report queries the file for a specified month and computes the average closing price of each stock for trend analysis. The text looks like this:

Code    tradingdate           Price
120089  2009-01-01 00:00:00   50.24
120123  2009-01-01 00:00:00   10.35
120136  2009-01-01 00:00:00   43.37
120141  2009-01-01 00:00:00   41.86
120170  2009-01-01 00:00:00   194.63
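For reference, the monthly file name and a record of this format can be handled in plain Python as follows (an illustrative sketch only, not part of the product; the function names are assumed):

```python
def month_file_name(yyyymm):
    # Build the monthly file name, e.g. "200901" -> "Stock_record_200901.txt"
    return "Stock_record_%s.txt" % yyyymm

def parse_record(line):
    # A record is "code  trading-date  closing-price", whitespace-separated;
    # the first field is the stock code and the last field is the price.
    parts = line.split()
    return {
        "code": parts[0],
        "date": " ".join(parts[1:-1]),
        "price": float(parts[-1]),
    }
```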

Unlike typical reporting tools, the collection report can read and compute over HDFS data directly. The implementation process is as follows.

Copy the required jar packages

You need to load the Hadoop core and configuration packages when accessing HDFS from the collection report, for example: commons-configuration-1.6.jar, commons-lang-2.4.jar, and hadoop-core-1.0.4.jar (for Hadoop 1.0.4). Copy these jars into [collection report installation directory]\report\lib, and into [collector installation directory]\esproc if you need to edit and debug the script in the collector editor.

Write the calculation script

Using the collector editor, write a script (stockFromHdfsTxt.dfx) that reads the HDFS file, filters the data, and returns the result set to the report. Because the script must receive parameters passed from the report, first define the script parameters.


Edit the script:

A1: Use the hdfsfile function to create an HDFS file cursor from the file path and the specified parameters.

A2: Aggregate the closing-price total and the record count by stock code.

A3: Compute the average closing price for each stock; A4 returns the result set to the report.
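The collector's own grid syntax is not reproduced here, but the logic of those steps can be sketched in plain Python, assuming the month's file content has already been fetched from HDFS (hdfs_lines stands in for the hdfsfile cursor; all names are illustrative):

```python
from collections import defaultdict

def average_close_by_code(hdfs_lines):
    # A1: iterate over the file like a cursor, one record at a time.
    # A2: accumulate closing-price totals and record counts per stock code.
    totals, counts = defaultdict(float), defaultdict(int)
    for line in hdfs_lines:
        parts = line.split()
        if not parts:
            continue
        try:
            price = float(parts[-1])
        except ValueError:
            continue  # skip the header line ("Code tradingdate Price")
        totals[parts[0]] += price
        counts[parts[0]] += 1
    # A3/A4: divide totals by counts and return the result set.
    return {code: totals[code] / counts[code] for code in totals}
```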

Edit the report template

Create a new report template in the report designer and set the parameters:

Set up the dataset: use the collector dataset type and invoke the edited script file (stockFromHdfsTxt.dfx).

The dfx file path can be either absolute or relative; a relative path is resolved against the dfx home directory configured in the options.

Edit the report expressions to build the report directly from the result set returned by the collector script.

Note that to preview the report in the report designer, you must also copy the Hadoop-related jars into [collection report installation directory]\report\lib.

In addition to reading plain text files from HDFS, the collection report can read compressed files in HDFS. The hdfsfile function is still used; the file extension determines the decompression method. For example, to access a gzip file you can write:

=hdfsfile("hdfs://192.168.1.210:9000/usr/local/hadoop/data/stock_record_"+d_date+".gz","GBK")

Simply include the extension in the URL.
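The extension-driven decompression that hdfsfile performs is roughly analogous to what Python's standard gzip module does for a local file (a loose analogy only; in the product the file lives on HDFS):

```python
import gzip

def read_gz_lines(path, encoding="GBK"):
    # Open a gzip-compressed text file and yield decoded lines,
    # mirroring hdfsfile's extension-driven decompression.
    with gzip.open(path, "rt", encoding=encoding) as f:
        for line in f:
            yield line.rstrip("\n")
```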

As the above shows, a collector script makes it easy to read and compute over HDFS files. The external script also has a visual editing and debugging environment, and a finished script can be reused (called by other reports or programs). However, if the script is already debugged and does not need to be reused, keeping the two files (the script and the report template) consistent is cumbersome; in that case it is easier to use a script dataset directly in the collection report.

In a script dataset you can write the calculation step by step; the syntax is the same as the collector's, and you can directly use report-defined data sources (not covered in this example) and parameters. To use a script dataset:

1. Click the "Add" button in the dataset settings window; in the dataset type dialog that pops up, select "Script Data Set".

2. Write the script in the script dataset editing window, using the report-defined parameter arg1 directly.

3. Report parameter settings and report expressions are the same as with the collector dataset, so they are not repeated here.

When deploying the report, you also need to put the relevant Hadoop jars on the application's classpath, for example in the web application's WEB-INF\lib.


Collection Report Download: http://www.raqsoft.com.cn/?p=208.

