Bi notes-a solution for cube incremental Processing

Source: Internet
Author: User

Incremental processing Overview:

In general, the cube will grow with the accumulation of time, so that if all data is processed each time, the processing time will gradually become longer. Therefore, for large-scale cubes, incremental processing solutions are adopted in pursuit of processing time and performance.

Scenario:

According to the data structure of the data warehouse and the meaning of the service, the data organization may be different. Here we discuss one of the scenarios of incremental processing, where:

A fact table uses the date key as the primary key (data generation date) and is created as a month partition based on the date.

In addition, the fact data does not change and only increases with the growth of the date. The processing cycle of the cube is day, and the partition of the cube is month-based.

This scenario usually occurs when analyzing status data, such as IIS day analysis.

Problem:

For the general process of incremental processing, refer to my note.

There is a cube Process Task package in SSIs. In this case, when specifying the incremental processing, you can only write the target partition to death, rather than dynamically determining the partition, for example:

In this way, you need to use the script code.

Solution:

As shown above, you can use the script code to implement it. There are many methods to implement the script module on the Internet, all of which are in the vb version. Here, powershell is used for simplicity and clarity. If you want to write it as a C # version, you can directly refer to the powershell mode.

Powershell allows you to directly use. NET Framework objects, while operations on cubes require a series of AMO objects. We recommend that you first refer to AMO for oak. The idea of writing code can be clearer only after you have an understanding of the oak tree.

Steps:

1. Name all partitions, such as [measuregroupname] _ [year]-[month].

2. In the code, first find the partition corresponding to yesterday (based on the partition naming Rules)

3. Determine the SQL query corresponding to yesterday. This query will be used as the incremental query statement for the partition just found in this cube. The incremental query statement is determined by the partition query statement.

4. Processing

Code:

# Location of the DLL required for the AMO object. This is the same as adding a reference in vs. The 64-bit system address may change.

[Void] [reflection. Assembly]: LoadFile ("C: \ Program Files \ Microsoft SQL Server \ 100 \ SDK \ assemblies \ microsoft. analysisservices. dll ")

# Assume that none of the partitions in adventure works are named by the [metric value group name] _ [year]-[month. Of course, in practice, each measurement value group of adventure works has only one partition, and this is not the case.

$ Servername = "localhost"

$ Databasename = "Adventure works DW 2008r2"

$ Cubename = "Adventure works"

# Create a Server Object. C # is written as server _ S = new server ();

$ Server = new-object Microsoft. analysisservices. Server

$ Server. Connect ($ servername)

$ Db = $ server. Databases. getbyname ($ databasename)

$ Cu = $ db. cubes. getbyname ($ cubename)

$ T = [system. datetime]: Now. tostring ("yyyy-mm-dd hh: mm ")

Write-output "[$ T] [$ cubename] processing dimensions ..."

######################################## ####

# Process Dimension first

######################################## ####

$ Server. capturexml = $ true

Foreach ($ DM in $ cu. dimensions)

{

# Note the usage of enumeration types in powershell

$ DM. dimension. Process ([microsoft. analysisservices. processtype]: processupdate)

}

$ Server. capturexml = $ false

$ R = $ server. executecapturelog ($ true, $ true)

# The processing result is displayed here. In fact, it is enumerated only when an error occurs.

Foreach ($ XR in $ R)

{

Foreach ($ XM in $ XR. Messages)

{

Write-output $ XM. Description

}

}

$ T = [system. datetime]: Now. tostring ("yyyy-mm-dd hh: mm ")

Write-output "[$ T] [$ cubename] dimension process finished! "

# $ Server. capturelog. Clear ()

######################################## ####

# Processing cube

######################################## ####

# Write-output "[$ cubename] processing cube ..."

$ Server. capturexml = $ true

Foreach ($ mg in $ cu. measuregroups)

{

$ Server. capturelog. Clear ()

$ Server. capturexml = $ true

#1. Locate the current partition. Here, we use yesterday as the scenario convention.

$ Nowpartition = ""

$ Nowpartition = $ mg. Name + "" + [system. datetime]: Now. adddays (-1). tostring ("yyyy-mm ")

$ T = [system. datetime]: Now. tostring ("yyyy-mm-dd hh: mm ")

Write-output "[$ T] [$ cubename] process partition for update $ processdate: $ nowpartition"

$ Pt = $ mg. partitions. getbyname ($ nowpartition)

#2. generate an incremental Query

# $ Pt_date_start = [system. datetime]: Now. adddays (-1 ). year. tostring () + "-" + [system. datetime]: Now. adddays (-1 ). month. tostring ("D2") + "-0"

# $ TDT = [system. datetime]: parse ($ pt_date_start)

$ Pt_date_end = [system. datetime]: Now. adddays (-1). tostring ("yyyy-mm-dd ")

$ SQL = $ pt. Source. querydefinition

$ SQL = $ SQL. substring (0, $ SQL. lastindexof ("between") + 8) + "'" + $ pt_date_end + "00:00:00. 000 'and' "+ $ pt_date_end +" 23:00:00. 000 '"

$ QB = new-object Microsoft. analysisservices. querybinding

$ QB. performanceid = "noas"

$ QB. querydefinition = $ SQL

$ Pt. Process ([microsoft. analysisservices. processtype]: processadd, $ QB)

$ Server. capturexml = $ false

$ R = $ server. executecapturelog ($ true, $ true)

Foreach ($ XR in $ R)

{

Foreach ($ XM in $ XR. Messages)

{

Write-output $ XM. Description

}

}

}

$ T = [system. datetime]: Now. tostring ("yyyy-mm-dd hh: mm ")

Write-output "[$ T] [$ cubename] process cubes OK! "

Note:

Pay attention to the partition naming rules, because it is used when the code finds the partition where the incremental data is needed.

Because it is a task completed by using script code, Bi personnel need to know a little bit about code writing, especially understanding of. NET Framework.

For how to use the AMO object to process an incremental update, refer to the sample code above.

Summary:

Here, we briefly describe a scenario of incremental processing and update. In actual projects, the incremental processing scheme of cube is very complicated, so we need to analyze the specific situation. However, this method is generally implemented using powershell in the code, and powershell is also flexible in use. Therefore, we recommend that you use it as a friend of Bi.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.