Several Methods for Handling Slowly Changing Dimensions

Source: Internet
Author: User

A slowly changing dimension (SCD) is a dimension whose attributes may change over time. For example, the dimcustomer dimension contains the user's address. That address may change, and how the change is handled affects the accuracy of business statistics, so dimcustomer is a slowly changing dimension. An SCD is usually handled in one of the following ways:

    • Type 1: historical changes are not recorded at all. When ETL loads data into the SCD, it overwrites the changed attribute values directly. For example, each time a customer's address changes, the new value replaces the old one in dimcustomer's address field. The SCD always holds the latest information but contains no history.
    • Type 2: a new record is added to the SCD for each change. Each record has two fields (such as inclutive_start and inclutive_end) that give the record's validity period. In addition, you can set an active flag field: true means the record is the current state, and false means it is a historical record. To find the record that was valid at a given time, query on the inclutive_start and inclutive_end fields.
    • Type 3: for each field that may change, a corresponding history field is added, which records only the most recent change rather than all changes. For example, dimcustomer has two fields, address and address_old. The first holds the user's current address and the second holds the previous address. Anything earlier than the previous address cannot be traced.
    • Type 4: in addition to the dimension that records the current information, a separate history dimension is created. The history dimension must contain validity period fields (such as inclutive_start and inclutive_end).
    • Type 6 = 1 + 2 + 3: Types 1, 2, and 3 each describe how to handle a single changing attribute in an SCD. A complex SCD with several such attributes may need to combine all three. For example, in dimcustomer, if the contact attribute email is not important to the business, it can be handled as Type 1, keeping only the latest value and overwriting the old one. If you need to analyze the region where the user is located, that attribute may need Type 2 so that every change of region is recorded. For the address attribute, if long-term history is not needed, adding an address_old field to hold the previous address is enough.
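
The difference between Types 1, 2, and 3 may be easier to see in code. The following is a minimal in-memory sketch in Python, not a production ETL implementation. The CustomerRow class and the apply_type1, apply_type2, apply_type3, and address_on functions are hypothetical names introduced for illustration. The field names dimcustomer uses in the text (address, address_old, inclutive_start, inclutive_end, active) are reused, and the sketch assumes that both ends of the validity period are inclusive dates, so a closed record ends the day before the new one starts.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class CustomerRow:
    """One row of a hypothetical dimcustomer table."""
    customer_id: int                        # business key from the source system
    address: str
    address_old: Optional[str] = None       # used by Type 3 only
    inclutive_start: Optional[date] = None  # used by Type 2; None = open start
    inclutive_end: Optional[date] = None    # used by Type 2; None = still valid
    active: bool = True


def apply_type1(row: CustomerRow, new_address: str) -> CustomerRow:
    """Type 1: overwrite the attribute; no history is kept."""
    return replace(row, address=new_address)


def apply_type3(row: CustomerRow, new_address: str) -> CustomerRow:
    """Type 3: keep only the immediately preceding value in address_old."""
    return replace(row, address_old=row.address, address=new_address)


def apply_type2(rows: List[CustomerRow], customer_id: int,
                new_address: str, change_date: date) -> List[CustomerRow]:
    """Type 2: close the active row and append a new active row."""
    result = []
    for row in rows:
        if row.customer_id == customer_id and row.active:
            # Assumption: inclusive end, so the old row ends the day before.
            row = replace(row,
                          inclutive_end=change_date - timedelta(days=1),
                          active=False)
        result.append(row)
    result.append(CustomerRow(customer_id, new_address,
                              inclutive_start=change_date))
    return result


def address_on(rows: List[CustomerRow], customer_id: int,
               day: date) -> Optional[str]:
    """Return the address valid on `day` in a Type 2 dimension, if any."""
    for row in rows:
        if row.customer_id != customer_id:
            continue
        started = row.inclutive_start is None or row.inclutive_start <= day
        not_ended = row.inclutive_end is None or day <= row.inclutive_end
        if started and not_ended:
            return row.address
    return None
```

With Type 2, a customer who moves once ends up with two rows, and address_on can answer "where did this customer live on a given date", which neither the Type 1 nor the Type 3 version can do beyond the latest one or two values. A real warehouse would typically also give each Type 2 row its own surrogate key, which this sketch leaves out.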

