Data in a multidimensional space is typically distributed sparsely and unevenly. Where events actually occur, the data clusters and its density is high. Developers of OLAP systems therefore have to deal with both sparsity and clustering in the multidimensional data space. There are several ways to structure multidimensional data.
1. Hypercube structure
A hypercube structure describes an object using three or more dimensions, all of which are mutually orthogonal. The measure values of the data sit at the intersections of the dimensions, and every part of the data space has the same dimension attributes.
This structure can be used in OLAP systems built on either multidimensional or relational databases. Its main advantage is that it simplifies operations for end users.
The hypercube structure has a variant, the contracted hypercube. This variant has higher data density and fewer dimensions, and additional analytic dimensions can be added to it.
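As a minimal sketch of the hypercube idea, the following Python stores one measure at every intersection of three dimensions, so that every cell shares the same dimension attributes. The dimension names (product, region, month), their members, and the figures are hypothetical, chosen only for illustration.

```python
import itertools

# Hypothetical dimensions; every cell of the cube shares these three.
DIMENSIONS = {
    "product": ["tea", "coffee"],
    "region": ["north", "south"],
    "month": ["jan", "feb", "mar"],
}


def make_cube(dimensions, fill=0):
    """Build a dense cube: one cell per combination of dimension members."""
    return {coords: fill for coords in itertools.product(*dimensions.values())}


def set_measure(cube, coords, value):
    """Store a measure at the intersection given by one member per dimension."""
    if coords not in cube:
        raise KeyError(f"not a valid intersection: {coords}")
    cube[coords] = value


def slice_total(cube, dimensions, **fixed):
    """Sum the measure over all cells that match the fixed dimension members."""
    names = list(dimensions)
    return sum(
        value
        for coords, value in cube.items()
        if all(coords[names.index(dim)] == member for dim, member in fixed.items())
    )


cube = make_cube(DIMENSIONS)
set_measure(cube, ("tea", "north", "jan"), 120)
set_measure(cube, ("tea", "south", "jan"), 80)
set_measure(cube, ("coffee", "north", "feb"), 200)

print(len(cube))                                    # 2 * 2 * 3 = 12 cells
print(slice_total(cube, DIMENSIONS, product="tea"))
print(slice_total(cube, DIMENSIONS, month="jan", region="north"))
```

Note that all 12 cells are allocated even though only 3 hold data, which is the sparsity problem mentioned at the start of this section.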
2. Multicube structure
In the multicube structure, a large data structure is divided into several multidimensional structures. Each of these covers a subset of the dimensions of the large data set and is partitioned for a particular application, so the hypercube becomes a set of sub-cubes. This layout is flexible and improves the efficiency of analysis, especially for sparse data.
In general, the multicube structure is more flexible, while the hypercube structure is easier to understand. End users tend to prefer the hypercube, which offers high-level reporting and multidimensional views. MIS specialists with multidimensional analysis experience, however, tend to prefer the multicube for its view rotation and flexibility. The multicube is also a more efficient way to store sparse matrices and can reduce the computational load. Complex systems and prebuilt general-purpose applications therefore tend to use a multicube structure, so that the data structure can be tuned to the needs of common applications.
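To illustrate why splitting a sparse hypercube into sub-cubes can save space, the sketch below counts the cells a dense layout would allocate in each case. The two applications (sales and staffing), their dimensions, and the member counts are hypothetical assumptions, not figures from any product.

```python
from math import prod

# Hypothetical member counts for each dimension.
MEMBERS = {"product": 50, "region": 10, "month": 12, "employee": 200}

# Hypothetical multicube layout: each sub-cube keeps only the
# dimensions that its application needs.
SUB_CUBES = {
    "sales": ["product", "region", "month"],
    "staffing": ["employee", "region", "month"],
}


def cell_count(dims):
    """Number of cells in a dense cube over the given dimensions."""
    return prod(MEMBERS[d] for d in dims)


# One hypercube over all four dimensions versus the sum of the sub-cubes.
hypercube_cells = cell_count(MEMBERS)
multicube_cells = sum(cell_count(dims) for dims in SUB_CUBES.values())

print(hypercube_cells)
print(multicube_cells)
print(hypercube_cells // multicube_cells)
```

Under these assumptions the single hypercube needs 40 times as many cells, most of them empty, because sales figures have no meaningful employee coordinate and staffing figures have no product coordinate.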
Many products combine the two structures: the physical data structure is a multicube, but calculations are carried out on a hypercube. This combines the simplicity of the hypercube with the rotation and storage characteristics of the multicube.
3. Storage of active data
The data that a user extracts from an application is called active data, and it is stored in the following three ways:
(1) Relational database
If the data originates from a relational database, the active data is stored in that relational database. In most cases it is organized as either a star schema or a snowflake schema.
(2) Multidimensional database
In this case, the active data is stored in a multidimensional database on the server, and includes data from relational databases and from end users. The database typically resides on hard disk. Some values are calculated in advance, and the results are stored as arrays.
(3) Client-based files
In this case, a relatively small amount of data is extracted into files on the client. These files can be built in advance, for example as Web files. As with a multidimensional database on the server, the active data can be held on disk or in RAM.
These three storage options differ in performance: processing in the relational database is much slower than in the other two.
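As a sketch of the relational option, the following uses Python's built-in sqlite3 module to build a tiny star schema: one fact table whose keys point at dimension tables. The table names, column names, and rows are hypothetical and serve only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables (hypothetical names and columns).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);

    -- Fact table: one row per (product, region) intersection that has data.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        region_id  INTEGER REFERENCES dim_region(region_id),
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'tea'), (2, 'coffee');
    INSERT INTO dim_region  VALUES (1, 'north'), (2, 'south');
    INSERT INTO fact_sales  VALUES (1, 1, 120.0), (1, 2, 80.0), (2, 1, 200.0);
    """
)

# Join the fact table to a dimension table and aggregate the measure.
rows = conn.execute(
    """
    SELECT p.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
    """
).fetchall()

print(rows)
conn.close()
```

Only intersections that actually hold data appear as rows in the fact table, so empty cells cost nothing in this layout; the price is the join and aggregation work at query time.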
4. How OLAP data is processed
OLAP data can be processed in three places. Multidimensional calculations do not have to be performed where the data is stored.
(1) Relational database
Even if the active OLAP data is stored in a relational database, performing complex multidimensional computations in that database is not a good choice. Because a single SQL statement is poorly suited to multidimensional computation, several statements are needed for even the most common multidimensional calculations. In many cases, OLAP tools perform some calculations in SQL and then pass the results to a multidimensional engine as input. The multidimensional engine does most of the computing on the client or on a middle-tier server, where it can hold the data in RAM and improve response time.
(2) Multidimensional service engine
Most OLAP applications perform multidimensional computations in the multidimensional service engine, which performs well. With this approach the engine and the database can be optimized together, and the ample memory on the server allows large numbers of arrays to be computed efficiently.
(3) Client
Computing on the client requires the user to have a sufficiently powerful PC to complete some or most of the multidimensional calculations. With the growing number of thin clients, OLAP products are moving client-based processing to Web application servers.
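The division of labour described under (1) can be sketched as follows: SQL performs a first aggregation, and a small in-memory routine, standing in for the multidimensional engine, derives per-dimension totals and a grand total from that result. The `sales` table, its columns, and the `pivot_with_totals` helper are hypothetical, and the sketch uses Python's built-in sqlite3 module rather than any particular OLAP product.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (product TEXT, region TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('tea', 'north', 120.0), ('tea', 'south', 80.0),
        ('coffee', 'north', 200.0), ('tea', 'north', 30.0);
    """
)

# Step 1: let the relational database do the base aggregation.
base = conn.execute(
    "SELECT product, region, SUM(amount) FROM sales GROUP BY product, region"
).fetchall()
conn.close()


def pivot_with_totals(rows):
    """Stand-in for a multidimensional engine that works on data held in RAM."""
    cells = {}
    by_product = defaultdict(float)
    by_region = defaultdict(float)
    grand_total = 0.0
    for product, region, amount in rows:
        cells[(product, region)] = amount
        by_product[product] += amount
        by_region[region] += amount
        grand_total += amount
    return cells, dict(by_product), dict(by_region), grand_total


# Step 2: derive every total from the in-memory result in a single pass.
cells, by_product, by_region, grand_total = pivot_with_totals(base)
print(cells[("tea", "north")])
print(by_product)
print(by_region)
print(grand_total)
```

The database is queried once; the totals along each dimension are then computed without further SQL, which is the kind of work the text assigns to the engine on the client or middle tier.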