After a painful stretch of learning the basics of data warehousing and OLAP, and with my boss urging me on, I carefully built a data warehouse and designed a general-purpose OLAP analysis interface.
I was quietly pleased with myself ("haha, you can do it"), because everything had been implemented according to the functional items listed in the requirements specification. But when I took the product to the client for a demo, the customer said "no" several times (how could that be, not even a little face-saving?). After all, I had spent a lot of time going from data warehouse illiterate to building this product. The customer then raised several requirements for intelligent analysis of the data; the specifics are probably much like what you have experienced or are experiencing. Well then, let me try intelligent analysis.
First up was association analysis. I remembered that data mining textbooks introduce association analysis, but after a period of study and experiment, the market basket example still had not given me much inspiration. I was using AS (Analysis Services), which offers only two data mining algorithms: decision trees and clustering. I could not see how these two algorithms related to association analysis. For the next few days I spent my time finding and reading material on data mining and association analysis (a bit of a waste, but the boss did not want to pay for a veteran to do the project, so there was no choice but to cross the river by feeling for the stones).
Then one day I finally put down the data mining books and reopened the AS Help documentation, reading the MDX section while experimenting in my homemade MDX query analyzer. Suddenly I had a flash of inspiration: why not try implementing the analysis in MDX? No sooner said than done, and after some testing, hey, it really worked. Here are the steps that worked for me:
Let's start with the structure of the data warehouse cube:
Samplecube
--Dim1
----Dim1hier1
------Dim1lev1
------Dim1lev2
------...
----Dim1hier2
--Dim2
----Dim2hier1
------Dim2lev1
------Dim2lev2
----Dim2hier2
...
--Measures
----Sum1
Next, we define the support and confidence measures used in association analysis, implementing them with the MDX WITH clause.
To analyze the association between Dim1.Dim1lev1 and Dim2.Dim2lev1, the definitions are as follows:
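WITH
-- Support of the current Dim1 member: its share of the grand total of Sum1
MEMBER [Measures].[Dim1lev1sup] AS '([Dim1].[Dim1hier1].CurrentMember, [Dim2].[Dim2hier1].[All Dim2], [Measures].[Sum1]) / ([Dim1].[Dim1hier1].[All Dim1], [Dim2].[Dim2hier1].[All Dim2], [Measures].[Sum1])'
-- Support of the current Dim2 member, defined symmetrically (the query references it)
MEMBER [Measures].[Dim2lev1sup] AS '([Dim1].[Dim1hier1].[All Dim1], [Dim2].[Dim2hier1].CurrentMember, [Measures].[Sum1]) / ([Dim1].[Dim1hier1].[All Dim1], [Dim2].[Dim2hier1].[All Dim2], [Measures].[Sum1])'
-- Share of the grand total held by the current (Dim1, Dim2) pair
MEMBER [Measures].[Confidence] AS '([Dim1].[Dim1hier1].CurrentMember, [Dim2].[Dim2hier1].CurrentMember, [Measures].[Sum1]) / ([Dim1].[Dim1hier1].[All Dim1], [Dim2].[Dim2hier1].[All Dim2], [Measures].[Sum1])'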
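To make the arithmetic behind these calculated members concrete, here is a minimal Python sketch (outside MDX) that computes the same three ratios from a small cross-tab. The member names A1, A2, B1, B2 and all figures are made up for illustration, and the function names are hypothetical; only the ratios themselves follow the definitions above.

```python
# Hypothetical cross-tab of Sum1 by (Dim1lev1 member, Dim2lev1 member).
# All names and figures here are invented for illustration.
sum1 = {
    ("A1", "B1"): 40.0,
    ("A1", "B2"): 10.0,
    ("A2", "B1"): 20.0,
    ("A2", "B2"): 30.0,
}

# Equivalent of the tuple (All Dim1, All Dim2, Sum1): the grand total.
grand_total = sum(sum1.values())


def dim1_support(a):
    """Share of the grand total held by Dim1 member a (Dim1lev1sup)."""
    return sum(v for (x, _), v in sum1.items() if x == a) / grand_total


def dim2_support(b):
    """Share of the grand total held by Dim2 member b (Dim2lev1sup)."""
    return sum(v for (_, y), v in sum1.items() if y == b) / grand_total


def joint_share(a, b):
    """Share held by the (a, b) pair: the measure named Confidence above."""
    return sum1.get((a, b), 0.0) / grand_total


print(dim1_support("A1"), dim2_support("B1"), joint_share("A1", "B1"))
```

Each function divides a slice of Sum1 by the grand total, just as each calculated member divides one tuple by the (All Dim1, All Dim2) tuple.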
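The SELECT statement then performs the analysis. It keeps only member pairs whose Dim1lev1sup exceeds 5% and whose Dim2lev1sup exceeds 1% (the minimum support thresholds), and reports a strong association where [Confidence] divided by the product of the two supports is greater than 1, meaning the pair occurs together more often than it would if the two members were independent.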
SELECT {[Measures].[Sum1], [Measures].[Confidence], [Measures].[Dim1lev1sup], [Measures].[Dim2lev1sup]} ON COLUMNS,
Order(Filter({[Dim1].[Dim1hier1].[Dim1lev1].Members * [Dim2].[Dim2hier1].[Dim2lev1].Members}, [Measures].[Dim1lev1sup] > 0.05
and [Measures].[Dim2lev1sup] > 0.01 and ([Measures].[Confidence] / ([Measures].[Dim1lev1sup] * [Measures].[Dim2lev1sup])) > 1), [Measures].[Sum1], BDESC)
ON ROWS FROM [Samplecube]
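The filter-then-order logic of this query can be sketched in Python as follows. This is only an illustration under assumptions: the cross-tab, the member names, and the function strong_pairs are hypothetical, and the ratio of the joint share to the product of the two supports is the quantity usually called lift in the data mining literature.

```python
# Hypothetical cross-tab of Sum1 by (Dim1lev1 member, Dim2lev1 member).
sum1 = {
    ("A1", "B1"): 40.0,
    ("A1", "B2"): 10.0,
    ("A2", "B1"): 20.0,
    ("A2", "B2"): 30.0,
}


def strong_pairs(cells, min_sup1=0.05, min_sup2=0.01):
    """Return (dim1, dim2, sum1) for pairs passing both support thresholds
    and having joint share / (sup1 * sup2) > 1, sorted by sum1 descending.
    Assumes the grand total is non-zero and the thresholds are >= 0."""
    total = sum(cells.values())
    row_tot, col_tot = {}, {}
    for (a, b), v in cells.items():
        row_tot[a] = row_tot.get(a, 0.0) + v
        col_tot[b] = col_tot.get(b, 0.0) + v

    kept = []
    for (a, b), v in cells.items():
        sup1 = row_tot[a] / total   # counterpart of Dim1lev1sup
        sup2 = col_tot[b] / total   # counterpart of Dim2lev1sup
        joint = v / total           # counterpart of the Confidence measure
        # Same three conditions as the Filter() call; the support checks
        # come first, so the division never sees a zero support.
        if sup1 > min_sup1 and sup2 > min_sup2 and joint / (sup1 * sup2) > 1:
            kept.append((a, b, v))

    # Counterpart of Order(..., Sum1, BDESC): largest Sum1 first.
    kept.sort(key=lambda t: t[2], reverse=True)
    return kept


print(strong_pairs(sum1))
```

With these made-up figures, only (A1, B1) and (A2, B2) survive: their joint shares (0.4 and 0.3) exceed the products of their supports (0.3 and 0.2), while the other two pairs fall below.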
And with that, strong association analysis is implemented.
The above is just my personal experience; if you know a better way to do this, please reply and let me know.