Commodity recommendation using association rules of SQL Server Analysis Services data mining (5)

Source: Internet
Author: User

If you have a shopping website, how do you recommend products to your customers? This function is available on many e-commerce websites. You can easily build similar functions through the data mining feature of SQL Server Analysis Services.

 

 

This article mainly demonstrates how to organize data according to the requirements of tools, and then perform mining, prediction, and analysis in Excel.

 

 

In the previous article, we demonstrated how to use the sample data in Data Mining for add-ins provided by Microsoft to implement the function of mining, prediction, and analysis, this article uses adventure works to demonstrate how to organize data for mining and prediction in Excel.

The examples of adventure works provided by Microsoft are divided into OLTP and OLAP databases. The former records business data and the business data in a system is usually operated at this layer, the business data layer enters the Data Warehouse through ETL cleaning, loading, and conversion for the analysis system to perform statistical analysis.

For how to obtain and install these sample libraries, refer to my article.

In the adventure works example provided by Microsoft, the Data Warehouse Library has been created for us. In version 2012, the default name is adventureworksdw2012.

In the first demonstration, we used a view: vassocseqlineitems.

This view records the order number, product name, and other information. Through this view, another view is nested, And we generate an Association Rule Mining Model for Prediction and Analysis.

In the previous article, the mining tool also needs to provide product price information. Obviously, this view cannot meet the requirements of the Tool. Therefore, we need to redefine the sample data according to the structure of the data warehouse.

In the settings interface we mentioned in the previous article, three pieces of information need to be specified,Order No.,Product NameAndProduct Price. The tool generates a temporary Mining Model Based on the information in these three columns and then generates a prediction report.

In the data warehouse, select two tables, the fact table factinternetsales and the dimension table dimproduct. In the fact table, we take the order number and connect to the dimension table to obtain the product name and price information. Create the following query:

Select

F. [salesordernumber] as [ordernumber]

, Coalesce (P. [modelname], p. [englishproductname]) as [model]

, F. extendedamount as amount

From [DBO]. [factinternetsales] F

Inner join [DBO]. [dimproduct] P on F. [productkey] = P. [productkey]

This query connects the fact table and dimension table to generate data in the format required by the tool, that is, the order number, product name, and price.

Next, you need to import the query results of the preceding SQL statement in Excel. The following operations demonstrate how to import the results of an SQL statement to an Excel file. Of course, there are many import methods in Excel. Here is only one of them:

Open excel, click data under the ribbon menu, from other sources-> from SQL Server.

In the new data connection tool, specify the server where the analysis server is located.

Click Next.

Pay attention to connect to specific table. Because we use a query instead of an existing table to live a view, we need to check it out. Select the adventureworksdw2012 Data Warehouse and click Next.

Confirm the connection file and click Finish.

In the following interface, you can select a table in the specified database. Click OK.

On the import data page that appears, select Properties to display the connection property page:

On the connection properties page, click definition, change command type to SQL, enter the SQL query you just created in command text, and click OK.

As shown in the preceding figure, sample data is generated. Next, perform the shopping basket Prediction and Analysis Based on the method described in the previous article.

For more information about the specific method, see the previous article. Note that for the transation ID, select ordernumber, item select model, and item value select amount.

On the two tabs created by the tool, you can see the mining report generated based on the organization's data.

In addition, you can click Advanced to set some mining parameters during the mining and analysis process:

On the displayed page, you can specify the minimum support and minimum rule probability.

The setting of these parameters affects the prediction result set. Through the settings here, you can filter out some events with low support and low probability of occurrence, and dynamically adjust these values according to the needs of different business scenarios to achieve the mining results that meet our requirements.

Any mining tool is just a tool. It will let us set some algorithm-related parameters. There are no technical suggestions for setting these parameters, certain industry experience is required for how to set it. That's why data analysts with industry experience are well treated. At the same time, it indirectly explains the fact that an industry has been circulating for a long time, that is, why top tools, such as Excel, make more money than simply writing SQL statements.

 

 

Summary:

This demonstration is based on the previous one. It organizes data from the sample data DW provided by Microsoft, and then imports the data to excel to implement mining, prediction, and analysis using the data mining plug-in.

So far, the introduction to association rule mining and analysis in Excel has come to an end. Compared with the analysis service methods in the previous three articles, the operations in Excel are more direct. The former is more suitable for use in the IT system and suitable for more data, such as shopping basket recommendations on websites, the latter is more suitable for small-scale internal mining, prediction, and analysis.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.