Reporting tools for diverse data sources

Source: Internet
Author: User

In the big data era, data is not only massive but also diverse in form. A report tool must retrieve, compute on, and display data from a variety of data sources. However, most reporting tools are not well adapted to diverse data sources.

Common report data sources include:

1. Various relational databases: Oracle, DB2, SQL Server, Informix, MySQL, etc.;

2. MongoDB, Cassandra, and other NoSQL databases;

3. JSON and XML data sources;

4. HTTP data sources;

5. Files: common text formats and Excel files;

6. Hadoop HDFS;

7. Structured and semi-structured data.

Report tools must adapt to diverse data sources, and to do so, powerful and general computing capabilities are critical.


Computing capability refers to the ability of a data source not only to store data but also to provide computing services to the outside, such as filtering, sorting, grouping, and joining. Among the data sources listed above, only relational databases have strong computing capabilities: you can write computing scripts as SQL statements or stored procedures. The other data sources have weak computing capabilities or none at all. NoSQL databases are much weaker than relational databases in this respect. MongoDB, for example, has traditionally not supported joins or subqueries (later versions added a limited form of join through the $lookup aggregation stage), and its query results cannot be too large. Text files, Excel files, HDFS, XML files, JSON files, and HTTP data sources have no computing capability at all.
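As a rough illustration of what "computing capability" means for a relational source, the sketch below uses Python's standard sqlite3 module with an in-memory database. The table names, columns, and values are hypothetical and chosen only for the example; the point is that the join, filter, grouping, and sorting all run inside the database, and the caller receives only the finished result.

```python
import sqlite3

# In-memory database with hypothetical tables, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (3, 'East');
    INSERT INTO orders VALUES
        (1, 1, 120.0), (2, 1, 30.0), (3, 2, 200.0), (4, 3, 75.0), (5, 2, 10.0);
""")

# Join, filter, group, and sort are all executed by the data source itself.
rows = conn.execute("""
    SELECT c.region, SUM(o.amount) AS total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.amount >= 50
    GROUP BY c.region
    ORDER BY total DESC
""").fetchall()
conn.close()

print(rows)
```

A text file or an HTTP endpoint offers nothing comparable, so the same work has to be done by whatever reads the data.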

Of course, report tools generally provide custom data source interfaces, so you can write the data source computing program yourself in a high-level language such as Java. In theory this gives you strong computing capabilities. In practice, Java's class libraries for processing structured and semi-structured data are weak, so writing such a program is quite troublesome.

 

As a new-generation report tool, rundry computing has a built-in computing engine and provides script datasets, which give it powerful data computing capabilities of its own. It therefore adapts well to a data source regardless of that source's computing capabilities.

A set calculator is a professional computing tool for structured and semi-structured data. It supports set operations, object references, ordered sets, and other features, and it provides rich function libraries for multi-step, complex computing. It also integrates an IDE (integrated development environment) that supports single-step and breakpoint debugging with inspection of the computing results, which effectively improves development efficiency.

A script dataset is a special kind of dataset in a set computing report. You write a computing script that complies with the set computing syntax, and it processes the data after the data is retrieved. Script datasets are suitable for simple data computing that takes only a few steps.

In terms of data connection, the set calculator built into the Set Computing report can call the JDBC drivers of relational and NoSQL databases, and it also provides function libraries for accessing the other kinds of data sources.

In terms of data computing, a set computing report can use a script dataset to supply computing capabilities for data sources that have none. It can also supplement data sources with weak computing capabilities, as long as the computing involved is relatively simple.
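The idea of adding computing capability on top of a source that has none can be sketched as follows. This is plain Python, not the set computing script syntax, and the file content, column names, and the summarize helper are all hypothetical. The text data is filtered, grouped, and sorted after it has been read, because the source itself cannot do any of this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical content of a text file; a real report would read it from disk.
SALES_TEXT = """region,product,amount
East,A,120
West,B,200
East,B,75
West,A,10
East,A,30
"""

def summarize(text, min_amount):
    """Filter rows by amount, total them per region, sort by total descending."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text)):
        amount = float(row["amount"])
        if amount >= min_amount:             # filtering
            totals[row["region"]] += amount  # grouping and aggregation
    # sorting
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

result = summarize(SALES_TEXT, min_amount=50)
print(result)
```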

Furthermore, the integrated set computing tool can perform unified computing on a variety of heterogeneous data, which addresses the problem of insufficient data source computing capability (especially hybrid computing capability across sources). After getting the data from a data source, the set calculator converts it into a unified data object and then uses its function libraries for multi-step, complex computing. Finally, the resulting dataset is submitted to the report presentation engine.
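The flow described above (read from each source, convert to one common data object, compute in several steps, hand over a dataset) might look roughly like this. It is a minimal sketch in plain Python rather than the set calculator's own syntax; the three sources, their field names, and the choice of a list of dictionaries as the unified object are assumptions made for the example.

```python
import csv
import io
import json
import sqlite3

# Source 1: relational database (hypothetical orders table).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 120.0), (2, 200.0), (1, 30.0), (3, 75.0);
""")
orders = [dict(r) for r in conn.execute("SELECT customer_id, amount FROM orders")]
conn.close()

# Source 2: JSON text, such as the body of an HTTP response (hypothetical).
customers = json.loads(
    '[{"id": 1, "name": "Acme"}, {"id": 2, "name": "Bolt"}, {"id": 3, "name": "Core"}]'
)

# Source 3: CSV text, such as a local file (hypothetical).
regions = list(csv.DictReader(io.StringIO("customer_id,region\n1,East\n2,West\n3,East\n")))

# All three sources are now lists of dictionaries: the unified data object.
# Step 1: build lookup tables (CSV values arrive as strings, so convert the key).
name_by_id = {c["id"]: c["name"] for c in customers}
region_by_id = {int(r["customer_id"]): r["region"] for r in regions}

# Step 2: join the orders with data from the other two sources.
joined = [
    {
        "name": name_by_id[o["customer_id"]],
        "region": region_by_id[o["customer_id"]],
        "amount": o["amount"],
    }
    for o in orders
]

# Step 3: aggregate per region; this is the dataset handed to the presentation layer.
totals = {}
for rec in joined:
    totals[rec["region"]] = totals.get(rec["region"], 0.0) + rec["amount"]
dataset = sorted(totals.items())

print(dataset)
```

None of the three sources could have produced this result alone, which is the kind of hybrid computing the paragraph refers to.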

Because it supports diverse data sources, you can build a report system on multiple data sources, drawing on the strengths of each while avoiding its shortcomings. A typical data source structure of such a report system is as follows:

[Figure: typical data source structure of the report system]

The SQL database in the figure is relatively expensive to build and operate. It can be used to store a small amount of real-time data, such as the data of the current day or the current month. This data is small in volume, and it has high consistency requirements while it is being changed.

Local files can be used to store recent historical data, such as the data of the current year. This data is large in volume, no longer changes, and is used frequently.

HDFS files can be used to store long-term historical data, such as the data of previous years. This data is huge in volume and is used infrequently.

NoSQL databases, JSON data, and HTTP data sources may be data interfaces provided by external systems, through which the report system reads the external data it cares about.
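The time-based split among the SQL database, local files, and HDFS described above could be expressed as a small routing rule. This is only a sketch: the pick_source function, the tier names, and the exact boundaries (current month, current year, earlier) are assumptions drawn from the examples in the text, and the reference date is arbitrary.

```python
from datetime import date

def pick_source(record_date, today):
    """Return the storage tier assumed to hold data for record_date."""
    if (record_date.year, record_date.month) == (today.year, today.month):
        return "sql_db"       # current month: small, still changing
    if record_date.year == today.year:
        return "local_files"  # earlier in the current year: large, read-only, used often
    return "hdfs"             # other years: huge, rarely used

today = date(2014, 9, 19)     # arbitrary reference date for the example
for d in (date(2014, 9, 1), date(2014, 3, 5), date(2012, 12, 31)):
    print(d, pick_source(d, today))
```

A report that spans several periods would query each tier for its part of the range and combine the partial results before presentation.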

After the built-in set calculator has read and computed the data from all data sources, the Set Computing report passes the result to the presentation engine as a dataset, and content-rich reports are generated and presented to the end user.

Such a report system has the following advantages: it keeps real-time data timely while still reflecting the patterns in historical data; it can store big data without excessive investment; and it can display the application's own computing results alongside data from related application systems.

