For many analysis models that do not need real-time data but involve massive data volumes or demand a high degree of flexibility, SSAS has clear advantages over traditional SQL in both performance and user customization. The performance advantage over SQL aggregate functions becomes more pronounced as the data volume grows, and MDX statements allow more customization. As a result, building a display tool that lets different users freely analyze statistical data costs far less than doing so with SQL statements.
MDX (Multidimensional Expressions) is a query language designed for data analysis.
If your project matches the characteristics above, SSAS is well worth considering for OLAP. Next, we use an example to show a simple application of SSAS, which may help readers who have never worked with it.
Today I saw a requirement on a forum: analyze web logs, for example page views (PV), and compare them by date. The source data for this kind of analysis is usually very large. With SQL, you would combine CASE WHEN with aggregate functions or write complicated row-to-column conversions and PIVOT statements, which are hard to run at this scale; an SSAS model can solve the problem easily.
Let's create a test database environment.
stime is the access start time, sleavetime is the last access time, and scount is the number of pages accessed by this IP address.
-- Create a test environment
Create Database testssas
Go
Use testssas
Go
-- Fact table
Create Table logs (SID varchar (20), swebsiteid varchar (20), stime datetime, sleavetime datetime, SIP varchar (20), scount INT)
Insert into logs select '1', '123', '2017-11-18 09:18:35.000', '2017-11-18 14:51:29.000', '61.183.248.218', '87'
Insert into logs select '2', '123', '2017-11-18 09:38:36.000', '2017-11-18 17:04:23.000', '61.144.207.115', '123'
Insert into logs select '3', '123', '2017-11-18 09:42:35.000', '2017-11-18 10:36:46.000', '61.183.248.218', '5'
Insert into logs select '4', '123', '2017-11-18 16:45:19.000', '2017-11-18 16:45:21.000', '61.144.207.115', '4'
Insert into logs select '5', '123', '2017-11-18 16:45:54.000', '2017-11-18 16:45:55.000', '61.144.207.115', '5'
Insert into logs select '7', '123', '2017-11-18 16:46:58.000', '2017-11-18 16:46:59.000', '61.144.207.115', '3'
Insert into logs select '8', '123', '2017-11-18 16:47:15.000', '2017-11-18 16:47:16.000', '61.144.207.115', '4'
Insert into logs select '5', '123', '2017-11-19 16:45:54.000', '2017-11-19 16:45:55.000', '61.144.207.115', '15'
Insert into logs select '7', '123', '2017-11-19 16:46:58.000', '2017-11-19 16:46:59.000', '61.144.207.115', '13'
Insert into logs select '8', '123', '2017-11-19 16:47:15.000', '2017-11-19 16:47:16.000', '61.144.207.115', '14'
Go
-- View of fact table
Create view v_fac_logs as
Select Sid, swebsiteid, convert (varchar (10), stime, 120) as date, sip, scount from logs
Go
-- Dimension table
Create Table dim_datetime (date varchar (10))
Insert dim_datetime
Select '2017-11-15' Union
Select '2017-11-16' Union
Select '2017-11-17' Union
Select '2017-11-18' Union
Select '2017-11-19'
Go
-- Fact table extraction dimension, which is implemented by view
Create view dim_ip as
Select distinct sip from logs
The logs table above holds the test data. The facts and dimensions are prepared for the SSAS model: just as in an ordinary SQL environment, each field you would group by is extracted and stored as a separate dimension table. The dimensions have a primary key/foreign key relationship with the fact table (here, the v_fac_logs view).
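For comparison, the hand-written SQL approach mentioned earlier (CASE WHEN combined with an aggregate function) might look roughly like the sketch below against the test data. The output column names pv_2017_11_18 and pv_2017_11_19 are made up for illustration, and the query assumes the v_fac_logs view has been created successfully.

```sql
-- Row-to-column conversion by hand: one output column per date.
-- Each new date requires editing the query, which is the maintenance
-- cost the multi-dimensional model is meant to avoid.
SELECT sip,
       SUM(CASE WHEN [date] = '2017-11-18' THEN scount ELSE 0 END) AS pv_2017_11_18,
       SUM(CASE WHEN [date] = '2017-11-19' THEN scount ELSE 0 END) AS pv_2017_11_19
FROM v_fac_logs
GROUP BY sip;
```

With only two dates this is manageable, but the column list grows with every date you want to compare.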
Then we generate an SSAS multi-dimensional dataset (cube) with a few steps done entirely in the UI.
1. Open SQL Server Business Intelligence Development Studio, which ships with SQL Server 2005, or use Visual Studio 2005 if it is installed on your machine.
2. Create a project and select Analysis Services Project from the Business Intelligence templates.
3. Establish a database connection. This is very simple: just connect to the test database.
However, pay attention to one detail. The connection configuration window includes an Impersonation Information page, where you need to change the logon method to "service account".
4. Create a data source view and select the fact table and the dimension tables. Note that v_fac_logs, not logs, is selected as the fact table.
The logical relationship between a dimension and a fact is a primary key/foreign key relationship, and a fact table must have a logical primary key. These keys do not need to exist in the actual SQL Server database; it is enough to define them here in the data source view.
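Because the logical keys exist only in the data source view, SQL Server does not enforce them. Before defining a column as a logical primary key, a quick uniqueness check along the following lines can help. This is only a sketch against the test tables above; the alias occurrences is illustrative.

```sql
-- Any row returned here means the value appears more than once,
-- so the column cannot serve as a logical primary key as-is.
SELECT [date], COUNT(*) AS occurrences
FROM dim_datetime
GROUP BY [date]
HAVING COUNT(*) > 1;
```

The same pattern applies to any other column you intend to mark as a key, for example sip in dim_ip.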
5. Create a multi-dimensional dataset by following the wizard with its default settings.
The intermediate dimension and fact structures are handled automatically by the system. You can ignore them if you are not yet clear about their specific purpose. The final result is as follows:
6. Configure a role, which will be used later for authentication when logging on to the SSAS server. The system administrator is used here.
7. Once the simple model is created, deploy it and process the data. By default, the model is deployed to the local host.
8. During processing, you can change the settings so that errors are ignored. SSAS processing may fail because of logical or data errors. If errors are ignored, only the offending record is skipped and processing continues; otherwise, the entire process stops. Then click Run.
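One common kind of data error is a fact row whose dimension value has no matching row in the dimension table. Rather than only ignoring such errors, you could look for them in the source first. The query below is a sketch for the date dimension, assuming the tables and view from the test script; the aliases f and d are illustrative.

```sql
-- Fact rows whose date is missing from dim_datetime.
-- An empty result means every fact row can be matched to a date member.
SELECT f.Sid, f.[date], f.sip
FROM v_fac_logs AS f
LEFT JOIN dim_datetime AS d
       ON d.[date] = f.[date]
WHERE d.[date] IS NULL;
```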
Ignore Error Methods
The result of successful processing.
9. Now a usable multi-dimensional dataset is ready, and we can browse the data with the modeling tool. Choose Cube - Browser:
You only need to drag the dim-related dimensions to the corresponding dimension areas (both the row and column areas hold dimensions), and then drag the measure under Measures to the data area to view the data.
A simple multi-dimensional dataset has now been built. The goal here is only to give an intuitive feel for it. Attentive readers may notice that row-to-column conversion is very convenient: you only need to swap the positions of the IP and datetime dimensions.
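To cross-check what the browser displays, the same totals can be computed directly in SQL. This is a rough equivalent rather than what SSAS runs internally, and the alias pv is illustrative.

```sql
-- Total pages viewed per date and IP address, one row per combination.
-- The cube browser shows the same numbers laid out as a rows-by-columns grid.
SELECT [date], sip, SUM(scount) AS pv
FROM v_fac_logs
GROUP BY [date], sip
ORDER BY [date], sip;
```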
We call this model simple for several reasons. The test data is small and no complicated ETL process is required. There are only two usable dimensions and only one measure, so the example does not really show off any advantage. However, once you use SSAS in a real project, you will find it powerful.
This article, a simple model, is only the first step in our transition to BI projects with SSAS. Next, we will discuss how to deploy the model to a web server for access by other clients, such as SQL Server Management Studio and .NET clients, and how to use MDX statements for analysis and statistics. We will also look at how to integrate resources and clean data with ETL in a large project that has complex data sources to analyze.