User Behavior Analysis Framework

Source: Internet
Author: User

Preface

The title is a bit grand. In fact, I just want to describe a design for collecting and analyzing user behavior.

Analyzing user behavior requires data, which can come from databases or from user operation logs. Here I introduce a behavior analysis method based on user operation logs. It is really a design that consists of three parts: the collection of user behavior data, the summary of user behavior data, and the analysis of user behavior data. The overall structure is roughly as follows:

The figure shows how the whole framework runs. The three parts are explained in turn below.

Part 1: Collection of user behavior data

Assume that three business servers, A, B, and C, record their operation sequences while providing daily services. An entry in the operation sequence looks like this:

 2010-07-22 17:11:05.801        AdoService(ExecuteNonQuery1)    1    56   sessiontable154A    insertindex    [@sessionid,String,Input,6341541562942187505580426882972741][@objid,String,Input,112693246][@ip,String,Input,10.53.139.134]

The fields are: time, executing method, execution time, database configuration, command executed, and execution parameters. We obtain the user data in a distributed, push-based manner. We deploy a large number of clients and configure each one with the list of files and file types to collect. Each client periodically reads the matching files in its folder and sends them to the message queue. Two points need explanation:

1: Deciding which files to read is a problem. The newest file cannot be read, because the business server may still be writing operation sequences to it, and files that have already been read must not be read again. How can we tell whether a file has been read? Here we use the file creation time to tell them apart.

2: Why send messages to the message queue instead of sending them directly to the data warehouse? There are two reasons. First, the data warehouse could become a hot spot. Second, writing into the data warehouse should not be the client's responsibility; after all, the fields in each file still have to be mapped to columns.
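The file-selection rule in point 1 could be sketched as follows. This is a minimal illustration, not the original client: the names LogFile, LogFileSelector, SelectReadable, and Scan are hypothetical, and the sketch assumes that the client remembers the creation time of the last file it sent and that the newest file in the folder is the one still being written.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical holder: a file path together with its creation time (UTC).
public sealed class LogFile
{
    public string Path { get; }
    public DateTime CreatedUtc { get; }

    public LogFile(string path, DateTime createdUtc)
    {
        Path = path;
        CreatedUtc = createdUtc;
    }
}

// Hypothetical helper; not part of the original design's code.
public static class LogFileSelector
{
    // Returns the files that are safe to send, oldest first:
    //  - the newest file is skipped, because the business server may still be writing to it;
    //  - files created at or before the watermark are skipped, because they were already sent.
    public static List<LogFile> SelectReadable(IEnumerable<LogFile> files, DateTime lastSentCreatedUtc)
    {
        List<LogFile> ordered = files.OrderBy(f => f.CreatedUtc).ToList();
        if (ordered.Count == 0)
        {
            return ordered;
        }

        ordered.RemoveAt(ordered.Count - 1); // drop the file still being written
        return ordered.Where(f => f.CreatedUtc > lastSentCreatedUtc).ToList();
    }

    // Lists a folder on disk and pairs each matching file with its creation time.
    public static IEnumerable<LogFile> Scan(string folder, string searchPattern)
    {
        return new DirectoryInfo(folder)
            .GetFiles(searchPattern)
            .Select(f => new LogFile(f.FullName, f.CreationTimeUtc));
    }
}
```

After a batch has been sent, the client would store the creation time of the last file in the returned list as the new watermark. Two files with exactly the same creation time would need an extra tie-breaker, such as the file name, which this sketch leaves out.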

Part 2: User Data Summary

Put simply, this part periodically reads the data in the message queue and flushes it into the data warehouse. What needs explaining is how the information in MQ is mapped to the columns in the data warehouse.

1: The mapping information.

using System;

// Describes how one value in an MQ message maps to a data warehouse column.
[AttributeUsage(AttributeTargets.Field)]
public class UserActionAttribute : Attribute
{
    /// <summary>
    /// Description
    /// </summary>
    public string Describe { get; set; }

    /// <summary>
    /// Position (index) of the value in the MQ message
    /// </summary>
    public int InMQIndex { get; set; }

    /// <summary>
    /// Column name
    /// </summary>
    public string Column { get; set; }

    /// <summary>
    /// Field type
    /// </summary>
    public string DBType { get; set; }

    public UserActionAttribute(int inMQIndex, string column, string dbType, string describe)
    {
        Column = column;
        Describe = describe;
        InMQIndex = inMQIndex;
        DBType = dbType;
    }
}

In other words, the attribute holds the position of the value in the message queue message, the column name, and the type, plus a description.

2: How to map and flush the data. Users can define their own entity, for example:

public class UserActionEntity
{
    [UserActionAttribute(0, "ServerName", "System.String", "server name")]
    public string ServerName;

    [UserActionAttribute(1, "DateTime", "System.String", "Information occurrence time")]
    public string DateTime;

    [UserActionAttribute(2, "MethodName", "System.String", "Name of the method to be executed")]
    public string MethodName;

    [UserActionAttribute(3, "IsSuccess", "System.String", "whether execution is successful")]
    public string IsSuccess;

    [UserActionAttribute(4, "ConsumeSec", "System.String", "execution time consumed")]
    public string ConsumeSec;

    [UserActionAttribute(5, "DBName", "System.String", "Database")]
    public string DBName;

    [UserActionAttribute(6, "StoredProcedureName", "System.String", "method name")]
    public string StoredProcedureName;

    [UserActionAttribute(7, "Params", "System.String", "executed parameter")]
    public string Params;
}

Here we can use SqlBulkCopy to flush the data into the database in bulk. Before that, the column mapping is read from the entity by reflection, for example:

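// Requires: using System; using System.Collections.Generic; using System.Reflection;
// EntityDllPath and EntityDllFullName are defined elsewhere (the entity DLL path and the entity's full type name).
Assembly classSampleAssembly = Assembly.LoadFrom(EntityDllPath);
Type classSampleType = classSampleAssembly.GetType(EntityDllFullName);

// Column name -> mapping information. DBSchema is a small holder class that is not shown here;
// the dictionary is declared here only so that the fragment is complete.
var list = new Dictionary<string, DBSchema>();
foreach (FieldInfo field in classSampleType.GetFields())
{
    // Read the UserActionAttribute placed on each public field of the entity.
    object[] attribs = field.GetCustomAttributes(typeof(UserActionAttribute), false);
    foreach (UserActionAttribute fieldAttrib in attribs)
    {
        DBSchema schema = new DBSchema(fieldAttrib.InMQIndex, fieldAttrib.Column, fieldAttrib.DBType);
        list.Add(fieldAttrib.Column, schema);
    }
}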
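Once the mapping has been collected, each message can be turned into a row of a DataTable. The sketch below is one possible way to do that, under assumptions that the original does not state: messages are tab-delimited, ColumnMapping is a hypothetical stand-in for the DBSchema class (whose definition is not shown), and MessageTableBuilder is a made-up helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical stand-in for the article's DBSchema class.
public sealed class ColumnMapping
{
    public int InMQIndex { get; }
    public string Column { get; }
    public string DBType { get; }

    public ColumnMapping(int inMQIndex, string column, string dbType)
    {
        InMQIndex = inMQIndex;
        Column = column;
        DBType = dbType;
    }
}

// Hypothetical helper; not part of the original design's code.
public static class MessageTableBuilder
{
    // Creates an empty table with one column per mapping, ordered by InMQIndex.
    public static DataTable CreateTable(IEnumerable<ColumnMapping> mappings)
    {
        var table = new DataTable();
        foreach (ColumnMapping mapping in mappings.OrderBy(c => c.InMQIndex))
        {
            // Type.GetType resolves names such as "System.String"; unknown names fall back to string.
            Type columnType = Type.GetType(mapping.DBType) ?? typeof(string);
            table.Columns.Add(mapping.Column, columnType);
        }
        return table;
    }

    // Splits one message on the assumed tab delimiter and appends it as a row.
    // Returns false, and adds nothing, when the message has too few fields.
    public static bool AddMessage(DataTable table, IEnumerable<ColumnMapping> mappings, string message)
    {
        string[] parts = message.Split('\t');
        DataRow row = table.NewRow();
        foreach (ColumnMapping mapping in mappings)
        {
            if (mapping.InMQIndex >= parts.Length)
            {
                return false;
            }
            row[mapping.Column] = parts[mapping.InMQIndex].Trim();
        }
        table.Rows.Add(row);
        return true;
    }
}
```

A table filled this way is the kind of input that SqlBulkCopy.WriteToServer(DataTable) accepts. The connection string and the destination table name would come from configuration and are not shown here.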

Part 3: User behavior analysis

The rough plan is to build a multidimensional data model, analyze it with BI methods, and finally produce result reports. This part is still under development, and I will add it later.
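Since this part is not written yet, the following is only an illustration of what a single report over the collected columns might look like; it is not the planned implementation. It treats DBName and StoredProcedureName as dimensions and the call count, failure count, and average execution time as measures. The types ActionRecord, ProcedureSummary, and BehaviorReport are hypothetical, and the sketch assumes that the string values of the entity have already been converted to typed values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical typed view of one row; the article's entity stores these values as strings.
public sealed class ActionRecord
{
    public string DBName { get; }
    public string StoredProcedureName { get; }
    public bool IsSuccess { get; }
    public double Consumed { get; } // execution time, in whatever unit the log uses

    public ActionRecord(string dbName, string storedProcedureName, bool isSuccess, double consumed)
    {
        DBName = dbName;
        StoredProcedureName = storedProcedureName;
        IsSuccess = isSuccess;
        Consumed = consumed;
    }
}

// One line of the report: two dimensions followed by three measures.
public sealed class ProcedureSummary
{
    public string DBName { get; }
    public string StoredProcedureName { get; }
    public int Calls { get; }
    public int Failures { get; }
    public double AverageConsumed { get; }

    public ProcedureSummary(string dbName, string storedProcedureName, int calls, int failures, double averageConsumed)
    {
        DBName = dbName;
        StoredProcedureName = storedProcedureName;
        Calls = calls;
        Failures = failures;
        AverageConsumed = averageConsumed;
    }
}

public static class BehaviorReport
{
    // Groups records by (DBName, StoredProcedureName) and computes the measures,
    // with the most frequently called procedures first.
    public static List<ProcedureSummary> Summarize(IEnumerable<ActionRecord> records)
    {
        return records
            .GroupBy(r => new { r.DBName, r.StoredProcedureName })
            .Select(g => new ProcedureSummary(
                g.Key.DBName,
                g.Key.StoredProcedureName,
                g.Count(),
                g.Count(r => !r.IsSuccess),
                g.Average(r => r.Consumed)))
            .OrderByDescending(s => s.Calls)
            .ThenBy(s => s.DBName, StringComparer.Ordinal)
            .ThenBy(s => s.StoredProcedureName, StringComparer.Ordinal)
            .ToList();
    }
}
```

In a real data warehouse this grouping would more likely be done by the database or a BI tool than in application code; the sketch only shows the shape of the result.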

Postscript

I hope this idea will be useful to everyone.
