"Reprint" Four kinds of BI open source tools introduction-spagobi,openi,jaspersoft,pentaho

Source: Internet
Author: User
Tags jboss

1 Brief introduction to BI systems

From a technical point of view, BI comprises ETL, DW (data warehouse), OLAP, DM, and other stages. Put simply, data that the transaction systems have already recorded is extracted by an ETL tool into a subject-oriented data warehouse, processed by OLAP into cubes or reports, and presented to users through a portal. Users then rely on this categorized, aggregated, descriptive, and visual data to support business decisions.
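
The flow described above (extract from the transaction system, load into a subject-oriented warehouse table, aggregate into a cube) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the `sales_fact` table, and all function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3

# Hypothetical rows from a transaction system:
# (order_id, date, region, product, amount)
SOURCE_ROWS = [
    (1, "2007-01-15", "north", "widget", 120.0),
    (2, "2007-01-20", "south", "widget", 80.0),
    (3, "2007-02-03", "north", "gadget", 200.0),
    (4, "2007-02-10", "north", "widget", 50.0),
]


def extract():
    """Extract: read the rows the transaction system has already recorded."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: derive the month and normalize the region name."""
    return [
        (date[:7], region.upper(), product, amount)
        for _order_id, date, region, product, amount in rows
    ]


def load(conn, rows):
    """Load: write the cleaned rows into a subject-oriented fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(month TEXT, region TEXT, product TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def build_cube(conn):
    """Aggregate the fact table by (month, region), as an OLAP layer would."""
    cursor = conn.execute(
        "SELECT month, region, SUM(amount) FROM sales_fact "
        "GROUP BY month, region ORDER BY month, region"
    )
    return {(month, region): total for month, region, total in cursor}


def run_pipeline():
    """Run extract, transform, load, and aggregation end to end."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    cube = build_cube(conn)
    conn.close()
    return cube


if __name__ == "__main__":
    for (month, region), total in run_pipeline().items():
        print(f"{month} {region:<6} {total:8.2f}")
```

In a real deployment each stage would be handled by a dedicated tool; the sketch only shows how the stages hand data to one another.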

Open source BI projects are numerous. By scale and by the level of support they offer for a BI system, they can be broadly divided into three types: frameworks, stand-alone tools, and BI suites.

    • Framework

Open source frameworks, which have no counterpart among commercial BI systems. We can use them to build our own BI tools, or to enhance and extend an existing BI solution.

    • Stand-alone tools

Stand-alone BI tools make up the largest category of open source projects. Many of them focus on only one aspect of a BI system, such as ETL, reporting, OLAP, or the database.

    • BI suite

A collection of tools that provides the functions of several parts of a BI system within a unified architecture. As it stands, no single suite, commercial or open source, offers a complete end-to-end BI solution. Open source BI suites are assembled by connecting multiple other components and tools, and because a BI system involves so many tools, integrating them into a complete BI solution is difficult.

2 Tools in a BI solution

A complete BI solution uses a variety of tools to carry out the work of each stage of the BI system.

2.1 ETL Tools

Data extraction, transformation, and loading tools. A good ETL tool should have the following features:

    1. Workflow management, job execution, and a scheduling manager. Processes are easy to define, and ETL tasks run automatically;
    2. Centralized metadata repository and management. Industry-standard metadata is stored and managed centrally;
    3. Data profiling and validation. The quality of the data can be checked;
    4. High performance. Performance remains good when executing tasks under heavy load;
    5. Scalable and platform independent. The tool scales well, supports a variety of operating systems and database systems, and can work with heterogeneous data sources;
    6. Open architecture and API. The tool has an open architecture and an easy-to-use interface for custom development.
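
To make the "data profiling and validation" feature in the list above more concrete, here is a minimal sketch in plain Python. It does not reproduce the API of any of the tools named in this article; the sample rows, the rules, and the function names are all hypothetical.

```python
# Hypothetical rows as they might arrive from a source system.
SAMPLE_ROWS = [
    {"customer_id": 1, "country": "DE", "amount": 10.0},
    {"customer_id": 2, "country": "", "amount": -5.0},
    {"customer_id": 2, "country": "FR", "amount": None},
]

# Hypothetical validation rules: column name -> predicate on the value.
RULES = {
    "country": lambda value: bool(value),
    "amount": lambda value: value is not None and value >= 0,
}


def profile(rows, columns):
    """Return per-column counts of rows, missing values, and distinct values."""
    report = {}
    for column in columns:
        values = [row.get(column) for row in rows]
        present = [v for v in values if v not in (None, "")]
        report[column] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return report


def validate(rows, rules):
    """Apply each rule to each row; return (row_index, column) per failure."""
    failures = []
    for index, row in enumerate(rows):
        for column, rule in rules.items():
            if not rule(row.get(column)):
                failures.append((index, column))
    return failures


if __name__ == "__main__":
    print(profile(SAMPLE_ROWS, ["customer_id", "country", "amount"]))
    print(validate(SAMPLE_ROWS, RULES))
```

Profiling summarizes what the data looks like, while validation flags individual rows that break a rule; an ETL tool would typically run both before loading.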

Well-known open source ETL tools currently include:

    1. KETL, developed by Kinetic Networks, a company with IBM and KPMG backgrounds. It has more than three years of production history and has been used successfully in a number of products, performing well in clickstream analytics applications. KETL uses a plug-in architecture and is developed in Java;
    2. Kettle, a metadata-driven ETL tool. It has joined Pentaho;
    3. Clover ETL, a Java-based ETL framework that can be used to develop your own ETL applications;
    4. Enhydra Octopus, a Java-based ETL tool that uses JDBC to connect to a variety of data sources and is easy to use and deploy. It has been used in a telecommunications network resource analysis system.
2.2 Reporting tools

Excellent reporting tools typically have the following characteristics:

    1. Support for a variety of data sources;
    2. An intuitive visual designer and easy-to-use report customization;
    3. Convenient data access and formatting, with rich ways of presenting data;
    4. Conformance to common data presentation standards, and easy integration with applications;
    5. Easy to scale and deploy;

Well-known open source reporting tools currently include:

    1. JasperReports, a good Java reporting tool. The project started in 2001, and Jaspersoft now continues to develop and support it. Similar to the commercial product Crystal Reports, it supports PDF, HTML, XLS, CSV, and XML output formats, and is currently the reporting tool most commonly used by Java developers.
    2. OpenReports, a flexible web-based reporting solution that generates dynamic PDF, XLS, HTML, CSV, and chart reports through a browser. It is developed in Java and uses JasperReports as its reporting engine. The open source technologies it uses include Hibernate, Velocity, and WebWork;
    3. JFreeReport, now part of Pentaho, an excellent Java class library for generating reports. It gives Java applications flexible printing capabilities and supports output to printers and to PDF, Excel, HTML, XHTML, plain text, XML, and CSV files;
    4. Eclipse BIRT (Business Intelligence and Reporting Tools), an Eclipse project that provides core reporting capabilities for Java EE web applications and can produce attractive reports in PDF or HTML format.
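
The reporting tools above all render one dataset into several output formats. The sketch below illustrates that idea with the Python standard library only; it is not how JasperReports, BIRT, or the other tools are implemented, and the function names and the `RENDERERS` table are hypothetical. Only CSV and HTML are covered.

```python
import csv
import html
import io


def render_csv(columns, rows):
    """Render the dataset as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


def render_html(columns, rows):
    """Render the dataset as a bare HTML table, escaping every cell."""
    head = "".join(f"<th>{html.escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# Hypothetical format registry: format name -> renderer function.
RENDERERS = {"csv": render_csv, "html": render_html}


def render(fmt, columns, rows):
    """Dispatch to the renderer registered for the requested format."""
    try:
        renderer = RENDERERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return renderer(columns, rows)


if __name__ == "__main__":
    data = [("north", 370), ("R&D", 80)]
    print(render("csv", ["region", "total"], data))
    print(render("html", ["region", "total"], data))
```

Keeping the dataset separate from the renderers is what lets a reporting engine add a new output format without touching the query side.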
2.3 OLAP Tools

Online analytical processing (OLAP) tools. Open source OLAP tools are likewise divided into MOLAP (multidimensional), ROLAP (relational), and HOLAP (hybrid). Excellent OLAP tools usually have the following features:

    1. Good execution performance, so that analysis and processing complete quickly;
    2. Good adaptability and scalability;
    3. Open interfaces and a rich API;

Well-known open source OLAP tools currently include:

    1. Mondrian, part of Pentaho, an OLAP server written in Java. It implements the MDX language and the XML for Analysis (XMLA) and JOLAP specifications, and can analyze large datasets stored in SQL databases without the user writing SQL. It wraps a JDBC data source and presents the data in multidimensional form;
    2. JPivot, a JSP custom tag library that renders an OLAP table and charts. Users can perform typical OLAP navigation such as drill-down, slice, and dice. It uses Mondrian as its OLAP server, and uses WCF (Web Component Framework) to render web UI components based on XML/XSLT. JPivot's simplistic, monolithic initialization of its metadata cache limits it to small cubes.
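
The OLAP navigation operations mentioned above (drill-down, slice, dice) can be illustrated on a tiny in-memory fact list. This is a conceptual sketch only: it does not use Mondrian, JPivot, or MDX, and the fact rows and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: dimension values plus one measure ("amount").
FACTS = [
    {"year": 2006, "quarter": "Q1", "region": "north", "product": "widget", "amount": 100},
    {"year": 2006, "quarter": "Q2", "region": "north", "product": "gadget", "amount": 150},
    {"year": 2006, "quarter": "Q1", "region": "south", "product": "widget", "amount": 70},
    {"year": 2007, "quarter": "Q1", "region": "north", "product": "widget", "amount": 120},
]


def aggregate(facts, dimensions):
    """Sum the measure grouped by the given dimensions.

    Fewer dimensions gives a rolled-up view; adding a dimension drills down.
    """
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["amount"]
    return dict(totals)


def slice_cube(facts, dimension, value):
    """Slice: fix one dimension to a single value."""
    return [f for f in facts if f[dimension] == value]


def dice_cube(facts, **allowed):
    """Dice: keep facts whose values fall in the allowed set per dimension."""
    return [f for f in facts if all(f[d] in vals for d, vals in allowed.items())]


if __name__ == "__main__":
    print(aggregate(FACTS, ["year"]))                 # rolled up by year
    print(aggregate(FACTS, ["year", "quarter"]))      # drill down to quarter
    print(aggregate(slice_cube(FACTS, "region", "north"), ["product"]))
    print(aggregate(dice_cube(FACTS, year={2006}, region={"north", "south"}), ["region"]))
```

A ROLAP server answers the same kinds of questions by generating SQL against the relational store instead of scanning rows in memory.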
2.4 Database

There are many open source databases. Most are general-purpose relational databases, and only a few are optimized specifically for the data warehouse environment. Bizgres, which is based on PostgreSQL, is optimized for the data warehouse environment and improves the performance of analytical queries.

3 Open source BI suites

The following open source BI suites are relatively mature and complete, and are useful for reference.

OpenI

OpenI is a web application developed in Java that provides analysis and reporting on top of OLAP servers, relational databases, and data mining servers. It is easy to use and deploy, has a friendly interface, and plans to support data mining and ETL in the future. OpenI mainly includes:

    1. OLAP display: JPivot
    2. Reporting tools: JFreeChart
    3. Connectors for analytical data sources

OpenI architecture:

RDL stands for Report Definition Language.
OpenI has most of the features that a BI suite should have:
Report: JasperReports, JFreeChart
OLAP: Mondrian + JPivot
Data mining: Weka
Its layers are very tightly integrated. It appears to use Eigenbase for data management, although this part is not very clear. OpenI has no scheduler for its data mining work. Its portlet interface mainly relies on JPivot, and JPivot is used throughout OpenI. Because OpenI does not develop proprietary tools of its own, the barrier to entry is relatively low.

Jaspersoft

The Jaspersoft Business Intelligence Suite is built from modules, so it can be deployed incrementally, which makes it easy to demonstrate its value step by step. Jaspersoft mainly includes:

    1. JasperServer: an interactive server for business users that handles ad hoc and scheduled queries and reports
    2. JasperAnalysis: interactive OLAP data analysis for business users
    3. JasperETL: high-performance graphical data integration for developers and database administrators
    4. JasperReports: a Java reporting library for developers

Jaspersoft's most important component is its reporting. It supports many output formats and many ways of managing reports, and it also uses Eigenbase for data management.

It has fairly complete permission control, built on Acegi, and supports any data source for which a JDBC driver is available. Its products form a full product line, of which the best known is of course JasperReports.

To better manage its reports and data, Jaspersoft has its own dedicated platform, JasperServer. This platform was created on 06/26/2006 and was an important step for Jaspersoft toward a full BI offering. Jaspersoft has no data mining component.

It has a task scheduler, which uses Quartz;
It has its own ETL tool: JasperETL;
It has its own OLAP server: JasperAnalysis;
The presentation layer uses AJAX and applets, and there is also a dashboard;
Queries can be written in SQL, Hibernate (HQL), XPath (XML), EJBQL, and MDX (the multidimensional query language specific to OLAP, as used by SQL Server with XMLA).

SpagoBI

SpagoBI integrates Mondrian and JPivot, and generates real-time reports through OpenLaszlo. SpagoBI is developed in Java, does not depend on a specific operating system, and is highly extensible. It mainly includes:

    1. Reporting tools: JasperReports / Eclipse BIRT / iReport
    2. OLAP server: Mondrian
    3. OLAP display: JPivot
    4. Data mining component: Weka
    5. Map engine: GEO
    6. ETL: bie
    7. Search engine: Lucene
    8. Dashboard: OpenLaszlo
    9. Portal server: JBoss / Tomcat / JOnAS

As its roadmap shows, SpagoBI will incorporate more BI features, and even features outside of BI.

SpagoBI Architecture:

The SpagoBI platform is powerful and feature-rich. Its components are well modularized and loaded as plugins. Its components are:
Report: BirtReportDriver, BirtReportEngine, JasperReportDriver, JasperReportEngine;
GEO: GeoDriver, GeoEngine (displays and queries data on a map);
OLAP: JPivotDriver, JPivotEngine;
QbE: QbeDriver, QbeEngine;
Data mining: WekaDriver, WekaEngine;
Security: ExoPortalSecurityProvider;
Booklet: BookletsComponent, a component for booklet generation. It mainly covers file upload, workflow, and OpenOffice support;
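
The driver/engine pairs in the component list above suggest a plugin design in which each document type is served by its own driver and engine. The sketch below shows one way such a registry could look. It is an assumption made for illustration, not SpagoBI's actual API: every class, method, and string here is hypothetical.

```python
class Driver:
    """Platform side: turns a document name and user into an engine request."""

    def build_request(self, document, user):
        return {"document": document, "user": user}


class Engine:
    """Execution side: runs a request and returns some output."""

    def execute(self, request):
        raise NotImplementedError


class ReportEngine(Engine):
    """Stand-in for a reporting engine."""

    def execute(self, request):
        return f"report '{request['document']}' rendered for {request['user']}"


class OlapEngine(Engine):
    """Stand-in for an OLAP engine."""

    def execute(self, request):
        return f"cube '{request['document']}' opened for {request['user']}"


class PluginRegistry:
    """Maps a document type to the (driver, engine) pair that handles it."""

    def __init__(self):
        self._plugins = {}

    def register(self, doc_type, driver, engine):
        self._plugins[doc_type] = (driver, engine)

    def run(self, doc_type, document, user):
        if doc_type not in self._plugins:
            raise KeyError(f"no plugin registered for {doc_type!r}")
        driver, engine = self._plugins[doc_type]
        return engine.execute(driver.build_request(document, user))


if __name__ == "__main__":
    registry = PluginRegistry()
    registry.register("report", Driver(), ReportEngine())
    registry.register("olap", Driver(), OlapEngine())
    print(registry.run("report", "sales", "alice"))
    print(registry.run("olap", "sales_cube", "bob"))
```

With this kind of separation, a new capability is added by registering another pair rather than by changing the platform core.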
It also has document management, built on Apache Jackrabbit, and a search function, built on Lucene. The project grew out of CMS, portlet, and workflow work, so its technical foundation is very strong.
SpagoBI uses a fairly large number of tools:
Report: BIRT, JasperReports;
ETL: Octopus and Talend;
OLAP: Mondrian and JPivot;
Data mining: Weka;
Portal: ExoPortal;

Its presentation layer also uses AJAX, and its dashboard uses OpenLaszlo, a Java-based framework for generating Flash (home page: http://www.openlaszlo.org/). The new 4.0 version appears to support generating DHTML as well. As a result, SpagoBI's dashboard interface is very user-friendly.

SpagoBI's ETL is very strong. As you can see, the data processing layer beneath it is kept as a separate layer.

Pentaho

Pentaho is a workflow-centric BI suite that emphasizes solutions rather than individual tool components. It consolidates multiple open source projects with the goal of competing with commercial BI products. It includes:

    1. Workflow engine: Shark and JaWE
    2. Database: Firebird RDBMS
    3. Integrated management and development environment: Eclipse
    4. Reporting tools: Eclipse BIRT
    5. ETL tool: Enhydra / Kettle
    6. OLAP server: Mondrian
    7. OLAP display: JPivot
    8. Data mining component: Weka
    9. Application server and portal server: JBoss
    10. Single sign-on service and LDAP authentication: JOSSO
    11. Custom scripting support: Mozilla Rhino JavaScript engine

Pentaho is a well-developed BI solution. It leans toward BI solutions that integrate with business processes, and targets medium to large enterprise applications.

Pentaho Architecture:

Pentaho's architecture is very similar to SpagoBI's, but Pentaho likes to call what it builds a "solution". The following is quoted from a Pentaho whitepaper:

The Pentaho BI platform differs from traditional BI products. It is a process-centric, solution-oriented framework with business intelligence (BI) components that enables companies to develop complete solutions to business intelligence problems.

Pentaho seems to treat the data processing layer as very important, and it offers a variety of presentation methods, even RSS output.

Pentaho is made up of a variety of open source components:

ETL: Kettle (shown in the interface as Pentaho Data Integration, formerly Kettle)
Report: Pentaho Report (it also supports integrating BIRT and JasperReports, with dedicated documentation for this)
OLAP: Mondrian and JPivot (Mondrian has joined Pentaho)
Platform: Pentaho Platform
Data mining: Weka (Weka has also joined Pentaho)

Official sites
    • OpenI http://openi.sourceforge.net
    • Jaspersoft http://www.jaspersoft.com/
    • SpagoBI http://spago.eng.it
    • Pentaho http://www.pentaho.com/