Mondrian is an open-source project. An OLAP (Online Analytical Processing) engine written in Java. It implements queries using the MDX language and reads data from relational databases (RDBMS. The results are displayed in multiple dimensions through Java APIs.
We all know that Online Analytical Processing (OLAP) needs to analyze a large amount of data in real time. The term "online" means that the system must respond quickly enough even if the related data volume is huge, in the unit of GB.
OLAP uses multidimensional analysis technology. Although all data stored in a relational database exists in the form of rows and columns, A Multidimensional Dataset can still be composed of an axes and a cell.
Figure 1
In the preceding example, time is a dimension, and the hierarchies below it is divided into six months and quarterly levels ).
Ii. perceptual knowledge-run a small instance
For the moment, we will not discuss some of the technologies used by Mondrian. We can start with a small example. We don't have to worry about every detail, but hope you will have a perceptual knowledge first. In the process of implementing this example, readers may be able to see some clues.
First, go to http://sourceforge.net/projects/mondrian/to download the latest mondrian. This is a zip package, including the Lib we will use and an example.
Of course, readers can fully run Mondrian's own instances, but this example is complicated. Besides Mondrian itself, there are other technologies in it, which are not easy to understand and are not suitable for beginners to learn. Therefore, the author designs a simple and clean minimal instance here.
The instance environment described in this article is Windows2000 + Tomcat + oracle. The reader thinks that the correct development environment has been set for JDK and tomcat.
2.1 preparations
In % Tomcat % "webapp", create mywebapp, mywebapp "WEB-INF, mywebapp" WEB-INF "Lib
Mondrian. war "WEB-INF" lib "Mondrian. jar, javacup. jar, xalan. jar, JUnit. jar and other related jar packages are copied to % atat_home % "webapps" mywebapp "lib.
2.2 database structure
In this tiny system, the database has three tables tb_employee (employee table), tb_time (time table), and tb_salary (salary table ). The table structure is as follows:
Oracle
Drop table tb_employee;
Create Table tb_employee (employee_id number, -- employee ID
Employee_name varchar2 (10) -- employee name );
Drop table tb_time;
Create Table tb_time (time_id number, -- time ID
The_year char (4), -- year
The_month char (2) -- month );
Drop table tb_salary;
Create Table tb_salary (employee_id number, -- employee ID
Time_id number, -- time ID
Salary number (1)
);
Ms SQL 2000
Create Table tb_employee (employee_id bigint, -- employee ID
Employee_name varchar (10) -- employee name
)
Create Table tb_time (time_id bigint, -- time ID
The_year varchar (4), -- year
The_month varchar (2) -- month
)
Create Table tb_salary (employee_id bigint, -- employee ID
Time_id int, -- Time
Idsalary smallint
)
Of course, to make the system run, you also need to insert some data into the database table.
2.3 write the schema file based on the structure of the database table
<? XML version = "1.0"?>
<Schema name = "Mondrian">
<Cube name = "cubetest">
<Table name = "tb_salary"/>
<Dimension name = "employee" foreignkey = "employee_id">
<Hierarchy hasall = "true" primarykey = "employee_id">
<Table name = "tb_employee"/>
<Level name = "employeeid" column = "employee_id" uniquemembers = "true">
<Property name = "employeename" column = "employee_name"/>
</Level>
</Hierarchy>
</Dimension>
<Dimension name = "time" foreignkey = "time_id">
<Hierarchy hasall = "false" primarykey = "time_id">
<Table name = "tb_time"/>
<Level name = "year" column = "the_year" uniquemembers = "false"/>
<Level name = "month" column = "the_month" uniquemembers = "false"/>
</Hierarchy>
</Dimension>
<Measure name = "salary" column = "salary" aggregator = "sum"/>
</Cube>
</Schema>
The file path is mywebapp "WEB-INF" mondriantest. xml
2.4 query using mdx
Mywebapp "mondriantest. jsp
1 <% @ page import = "Mondrian. OLAP. *" %>
<%
2 connection =
Drivermanager. getconnection
("Provider = Mondrian;
JDBC = JDBC: oracle: thin: @ XXX. XXX: 1521: dbname;
Jdbcuser = dbuser;
Jdbcpassword = dbpasswd;
Catalog = file: // C:/tomcat4.1/webapps/mywebapp/WEB-INF/mondriantest. xml;
Jdbcdrivers = oracle. JDBC. Driver. oracledriver; ", null, false );
3 string querystr =
"Select {[measures]. [salary]} on columns,
{[Employee]. [employee ID]. Members} on rows
From cubetest ";
4 query = connection. parsequery (querystr );
5 result = connection.exe cute (query );
Out. println ("Get result ");
%>
Start Tomcat and enter
Http: // localhost: 8080/mywebapp/mondriantest. jsp.
Iii. In-depth discussion
3.1 API
Mondrian provides an API interface for the client application for query. These APIs may seem familiar to anyone who has used JDBC. The main difference is that the query language is different: Mondrian uses MDX ('multi-dimen1_expression'), while JDBC uses SQL.
Like JDBC, you also need to establish a connection to form a query statement, execute the query to obtain the result set, and perform other steps.
Let's take a look at the mondriantest. JSP code.
Row 3: Import Mondrian. OLAP .*
This is the class we need to introduce. The drivermanager, connection, query, and result used below are all in this package. This package is generally located in Mondrian. jar.
Row 2nd: Connection connection =
Drivermanager. getconnection
("Provider = Mondrian;
JDBC = JDBC: oracle: thin: @ XXX. XXX: 1521: dbname;
Jdbcuser = dbuser;
Jdbcpassword = dbpasswd;
Catalog = file: // C:/tomcat4.1/webapps/mywebapp/WEB-INF/mondriantest. xml;
Jdbcdrivers = oracle. JDBC. Driver. oracledriver; ", null, false );
Use drivermanager to create a connection instance and establish a database connection.
JDBC = JDBC: oracle: thin: @ XXX. XXX: 1521: dbname; specifies the database IP address and database name. Jdbcuser = dbuser; set the database user. Jdbcpassword = dbpasswd; set the user password. While
Catalog = file: // C:/tomcat4.1/webapps/mywebapp/WEB-INF/mondriantest. xml; is to set the path of the schema file to which the MDX statement queries
Row 3: String querystr =
"Select {[measures]. [salary]} on columns,
{[Employee]. [employee ID]. Members} on rows
From cubetest ";
Create an MDX query statement. The form of the MDX statement is closely related to the setting of the schema file. Of course, the formation of the schema file is also determined by the database structure.
Row 4th: Query query = connection. parsequery (querystr); analyzes and processes the MDX statement to determine whether it complies with the schema file definition, database structure, and database data.
Row 5th: Result result = connection.exe cute (query); run the query to obtain the result set.
We found that query is similar to JDBC Statement, while result is similar to resultset.
3.2 Schema
3.2.1 what is Schema
Schema defines a multi-dimensional database. A logical model is provided to write query statements in the MDX language. This logical model actually provides the following concepts: cubes, dimensions, hierarchies, levels, and members ).
The schema file is an XML file for editing the schema. Form the logical model and the physical model of the database in this file.
3.2.2 logical structure of Schema
3.2.2.1 cube
A cube is a collection of dimensions and measurements. In the cube, where dimension and measure are located
To write query statements in the MDX language. This logical model actually provides the following concepts: cubes, dimensions, hierarchies, levels, and members ). The schema file is an XML file for editing the schema. Form the logical model and the physical model of the database in this file.
3.2.2 logical structure of Schema
3.2.2.1 a cube is a collection area of dimensions and measurements. In cube, dimension and measure share a fact table. Example: <cube name = "cubetest"> <Table name = "tb_salary"/> ....... </Cube> <Table> label determines the name of the fact table used.
3.2.2.2 a measurement, simply put, is the value to be calculated. For example, the <measure name = "salary" column = "salary" aggregator = "sum"/> <measure> tag has three required attribute names (metric names ), column (field name in fact table), aggregation (Method Used for aggregation ).
3.2.2.3 dimenesion while dimensions generally have corresponding dimension tables. Example: <dimension name = "time" foreignkey = "time_id">
3.3 MDX is used to query multi-dimensional data, while SQL is used to query relational databases. Mondrian's MDX concepts, MDX syntax, and system-defined MDX functions are very similar to Microsoft's MDX, with minor differences. You can refer to Microsoft's help documentation for learning. I will not repeat it here. The MDX language has basically become the standard for multi-dimensional database query languages. Iv. Conclusion Mondrian is a Java-based OLAP engine and an open-source project, A feasible solution is provided for projects that are based on Java and have to be separated from the Microsoft architecture but have to analyze a large amount of data. It is hoped that it will also play a certain role in the development of abstract art, just as that of Montessori.