Preface
The title is rather grand. In fact, I only want to describe a design for collecting and analyzing user behavior.
Analyzing user behavior always requires data. The data can come from databases or from user operation logs. Here I introduce a behavior analysis method based on user operation logs. The method can also be seen as a design, and it consists of three parts: the first is the collection of user behavior data, the second is the summarization of that data into a data warehouse, and the last is the analysis of the data. The overall structure is roughly as follows:
The figure shows the running process of the entire framework. Next, let me explain it in three parts.
Part 1: Collection of user behavior data
Assume that three business servers, A, B, and C, record an operation sequence while providing their daily services. A record in the operation sequence looks like this:
2010-07-22 17:11:05.801 AdoService(ExecuteNonQuery1) 1 56 sessiontable154A insertindex [@sessionid,String,Input,6341541562942187505580426882972741][@objid,String,Input,112693246][@ip,String,Input,10.53.139.134]
[Time, execution method, execution time, database configuration, executed command, execution parameters]. We collect user data in a distributed, push-based manner. We deploy a number of clients and configure each one with the folder and file types it should collect; the client periodically reads the files in that folder and sends them to the message queue. Two points need explaining:
1: Deciding which files to read is a problem. First, the newest file must not be read, because the business server may still be writing operation sequences to it. Second, files that have already been read must not be read again. How can we tell whether a file has been read? Here we use the file creation time to differentiate.
2: Why send messages to the message queue instead of sending them directly to the data warehouse? There are two reasons. One is that the data warehouse could become a hot spot. The other is that flushing data into the data warehouse should not be the client's responsibility; after all, the fields in a file still have to be mapped to columns.
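The file-selection rule in point 1 can be sketched as follows. This is only an illustration under stated assumptions, not the original client: the names LogFileSelector, SelectReadable, and Scan are hypothetical, and the sketch assumes the client remembers the creation time of the last file it sent.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: picks the log files a collection client may safely read.
public static class LogFileSelector
{
    // Returns the files created after lastProcessed, oldest first,
    // always skipping the newest file because the server may still be writing to it.
    public static List<(string Path, DateTime Created)> SelectReadable(
        IEnumerable<(string Path, DateTime Created)> files, DateTime lastProcessed)
    {
        var ordered = files.OrderBy(f => f.Created).ToList();
        if (ordered.Count == 0)
        {
            return ordered;
        }

        return ordered
            .Take(ordered.Count - 1)                  // drop the newest file
            .Where(f => f.Created > lastProcessed)    // drop files already sent
            .ToList();
    }

    // Lists candidate files in the configured folder (folder and pattern are assumed settings).
    public static IEnumerable<(string Path, DateTime Created)> Scan(string folder, string pattern)
    {
        return new DirectoryInfo(folder)
            .GetFiles(pattern)
            .Select(f => (Path: f.FullName, Created: f.CreationTimeUtc));
    }
}
```

After each run, the client would store the creation time of the last file it sent and pass that value as lastProcessed on the next run.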
Part 2: User Data Summary
Put simply, this part periodically reads the data in the message queue and flushes it into the data warehouse. What needs explaining is how the information in the MQ is mapped to columns in the data warehouse.
1: The mapping information:
using System;

// Describes how one field of an MQ message maps to a data warehouse column.
[AttributeUsage(AttributeTargets.Field)]
public class UserActionAttribute : Attribute
{
    /// <summary>
    /// Description
    /// </summary>
    public string Describe { get; set; }
    /// <summary>
    /// Serial number of the data in MQ
    /// </summary>
    public int InMQIndex { get; set; }
    /// <summary>
    /// Column name
    /// </summary>
    public string Column { get; set; }
    /// <summary>
    /// Field type
    /// </summary>
    public string DBType { get; set; }

    public UserActionAttribute(int inMQIndex, string column, string dbType, string describe)
    {
        Column = column;
        Describe = describe;
        InMQIndex = inMQIndex;
        DBType = dbType;
    }
}
In effect, it records the field's serial number in the message queue, the column name, and the column type.
2: How to map and flush the data. We let users define their own entity, e.g.:
public class UserActionEntity
{
    [UserActionAttribute(0, "ServerName", "System.String", "server name")]
    public string ServerName;
    [UserActionAttribute(1, "DateTime", "System.String", "Information occurrence time")]
    public string DateTime;
    [UserActionAttribute(2, "MethodName", "System.String", "Name of the method to be executed")]
    public string MethodName;
    [UserActionAttribute(3, "IsSuccess", "System.String", "whether execution is successful")]
    public string IsSuccess;
    [UserActionAttribute(4, "ConsumeSec", "System.String", "execution time consumed")]
    public string ConsumeSec;
    [UserActionAttribute(5, "DBName", "System.String", "Database")]
    public string DBName;
    [UserActionAttribute(6, "StoredProcedureName", "System.String", "method name")]
    public string StoredProcedureName;
    [UserActionAttribute(7, "Params", "System.String", "executed parameter")]
    public string Params;
}
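Here we can use SqlBulkCopy to flush the data into the database in bulk. Before that, the column mapping is read from the user-defined entity by reflection, e.g.:
// Requires: using System; using System.Reflection;
// EntityDllPath and EntityDllFullName come from configuration; list (a keyed
// collection of DBSchema) and the DBSchema type are defined elsewhere.
Assembly classSampleAssembly = Assembly.LoadFrom(EntityDllPath);
Type classSampleType = classSampleAssembly.GetType(EntityDllFullName);
foreach (FieldInfo field in classSampleType.GetFields())
{
    // Collect the mapping attribute(s) declared on each public field.
    object[] attribs = field.GetCustomAttributes(typeof(UserActionAttribute), false);
    foreach (UserActionAttribute fieldAttrib in attribs)
    {
        DBSchema schema = new DBSchema(fieldAttrib.InMQIndex, fieldAttrib.Column, fieldAttrib.DBType);
        list.Add(fieldAttrib.Column, schema);
    }
}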
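To show how such a mapping might be used when flushing, here is a minimal sketch that turns already-split MQ messages into a DataTable whose columns follow the attributes. It is an assumption-laden illustration, not the original implementation: MqTableBuilder, BuildTable, and SampleEntity are hypothetical names, the attribute is repeated in compact form so the sketch is self-contained, and each message is assumed to arrive as an array of field strings.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;
using System.Reflection;

// Compact copy of the mapping attribute, repeated so the sketch compiles on its own.
[AttributeUsage(AttributeTargets.Field)]
public class UserActionAttribute : Attribute
{
    public int InMQIndex { get; }
    public string Column { get; }
    public string DBType { get; }

    public UserActionAttribute(int inMQIndex, string column, string dbType, string describe)
    {
        InMQIndex = inMQIndex;
        Column = column;
        DBType = dbType;
    }
}

// Hypothetical two-field entity used only for illustration.
public class SampleEntity
{
    [UserAction(0, "ServerName", "System.String", "server name")]
    public string ServerName;
    [UserAction(1, "ConsumeSec", "System.String", "execution time consumed")]
    public string ConsumeSec;
}

public static class MqTableBuilder
{
    // Builds a DataTable from messages, placing field InMQIndex into the mapped column.
    public static DataTable BuildTable(Type entityType, IEnumerable<string[]> messages)
    {
        var mappings = entityType.GetFields()
            .Select(f => f.GetCustomAttribute<UserActionAttribute>())
            .Where(a => a != null)
            .OrderBy(a => a.InMQIndex)
            .ToList();

        var table = new DataTable(entityType.Name);
        foreach (var m in mappings)
        {
            table.Columns.Add(m.Column, Type.GetType(m.DBType, true));
        }

        foreach (var fields in messages)
        {
            var row = table.NewRow();
            foreach (var m in mappings)
            {
                // A message with too few fields leaves the column as DBNull.
                row[m.Column] = m.InMQIndex < fields.Length
                    ? Convert.ChangeType(fields[m.InMQIndex], table.Columns[m.Column].DataType,
                                         CultureInfo.InvariantCulture)
                    : DBNull.Value;
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```

A table built this way could then be handed to SqlBulkCopy.WriteToServer(DataTable), with DestinationTableName set to the target warehouse table; the connection string and table name would come from configuration.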
Part 3: user behavior analysis
Roughly speaking, this part builds a multidimensional data model, analyzes it with BI methods, and finally produces result reports. It is still under development, and I will add it later.
Postscript
I hope this idea will be useful to everyone.