Quantitative analysis of user's product preference by analytic hierarchy process
A product's users exhibit many kinds of behavior. How can we analyze that behavior to quantify a user's preference for the product? On Douban FM, for example, users can click "like" or "throw into the trash"; on Youku, users can upvote, downvote, share, and so on. How can we use this behavioral information to work out how much a user likes a song, or what score they would give a video? The example below analyzes how much a user likes a video. For video, many user behaviors come to mind: how long the user watched, whether they commented, whether they upvoted or downvoted, and whether they shared.
We can use these indicators to estimate the user's rating of the video: so many points for how long the user watched, so many points for sharing, and so on. Different behaviors also reflect different degrees of preference. We can compute the score with a simple formula: score = w1*x1 + w2*x2 + ..., where x1, x2, ... are the behavior indicators and w1, w2, ... are the behavior weights.
Behavior indicators: for example, give 1 point if the user upvotes and 2 points if the user shares. Normalization is usually needed here to compress the scores into a reasonable range. (PS: this amounts to scoring by domain experts; whether it could be done by modeling instead remains to be explored.)
Behavior weights: different behaviors reflect different degrees of preference; sharing, for example, matters more than upvoting. With many indicators, how do we determine their weights reasonably? Here the weight of each behavior indicator is determined by the analytic hierarchy process.
Constructing the pairwise comparison matrix
| | Playback duration | Playback duration / Video duration | Comments | Download | Collection | Share |
|---|---|---|---|---|---|---|
| Playback duration | 1 | 1/3 | 1 | 1/3 | 1/5 | 1/5 |
| Playback duration / Video duration | 3 | 1 | 1 | 1 | 1 | 1/2 |
| Comments | 1 | 1 | 1 | 1/3 | 1/2 | 1/5 |
| Download | 3 | 1 | 3 | 1 | 1 | 1/2 |
| Collection | 5 | 1 | 2 | 1 | 1 | 1/2 |
| Share | 5 | 2 | 5 | 2 | 2 | 1 |
For example, the 3 in row 4, column 1 indicates that "Download" is slightly more important than "Playback duration".
| Scale | Meaning |
|---|---|
| 1 | The two elements are equally important |
| 3 | The former element is slightly more important than the latter |
| 5 | The former element is clearly more important than the latter |
| 7 | The former element is strongly more important than the latter |
| 9 | The former element is extremely more important than the latter |
| 2, 4, 6, 8 | Intermediate values between the adjacent judgments above |
| Reciprocal | If the importance ratio of element i to element j is a_ij, then the importance ratio of element j to element i is a_ji = 1/a_ij |
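The reciprocal rule means only the judgments above the diagonal have to be entered by hand. The following is a minimal sketch of that idea, not the author's code: the helper name `reciprocal_matrix` and the dictionary layout are assumptions made for illustration, and the indicator order (0 = playback duration, ..., 5 = share) follows the comparison matrix in this article.

```python
def reciprocal_matrix(upper, n):
    """Build an n x n pairwise comparison matrix from upper-triangle judgments.

    `upper` maps (i, j) with i < j to the importance of element i relative
    to element j. The diagonal is 1 and a[j][i] is filled in as 1 / a[i][j].
    (Hypothetical helper, for illustration only.)
    """
    a = [[1.0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        a[i][j] = float(value)
        a[j][i] = 1.0 / value
    return a


# Upper-triangle judgments taken from the comparison matrix in this article.
upper = {
    (0, 1): 1 / 3, (0, 2): 1, (0, 3): 1 / 3, (0, 4): 1 / 5, (0, 5): 1 / 5,
    (1, 2): 1, (1, 3): 1, (1, 4): 1, (1, 5): 1 / 2,
    (2, 3): 1 / 3, (2, 4): 1 / 2, (2, 5): 1 / 5,
    (3, 4): 1, (3, 5): 1 / 2,
    (4, 5): 1 / 2,
}
A = reciprocal_matrix(upper, 6)

for row in A:
    print(["%.3f" % v for v in row])
```

The resulting list of lists can be passed to `np.array` if NumPy is used for the later steps.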
Column normalization (each entry divided by its column sum; the figures below correspond to 1/3 being entered as 0.33):
[[0.05555556 0.0521327 0.07692308 0.05830389 0.03508772 0.06896552]
[0.16666667 0.15797788 0.07692308 0.17667845 0.1754386 0.17241379]
[0.05555556 0.15797788 0.07692308 0.05830389 0.0877193 0.06896552]
[0.16666667 0.15797788 0.23076923 0.17667845 0.1754386 0.17241379]
[0.27777778 0.15797788 0.15384615 0.17667845 0.1754386 0.17241379]
[0.27777778 0.31595577 0.38461538 0.35335689 0.35087719 0.34482759]]
Row sums:
[0.34696846 0.92609846 0.50544522 1.07994462 1.11413265 2.0274106]
Re-normalization:
[0.05782808 0.15434974 0.08424087 0.17999077 0.18568877 0.33790177]
| Playback duration | Playback duration / Video duration | Comments | Download | Collection | Share |
|---|---|---|---|---|---|
| 0.05782808 | 0.15434974 | 0.08424087 | 0.17999077 | 0.18568877 | 0.33790177 |
This gives the weight of each indicator.
The score for each indicator also needs to be normalized, for example restricted to the range 0 to 1.
Suppose a user's scores on each indicator for a video are:
| Playback duration | Playback duration / Video duration | Comments | Download | Collection | Share |
|---|---|---|---|---|---|
| 0.9 | 0.8 | 1 | 0 | 0 | 0 |
Taking the weighted sum (or weighted average) of these scores gives the user's rating for the video.
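As a rough illustration of this step, the sketch below applies score = w1*x1 + w2*x2 + ... to the weights and the example indicator scores given above. The function name `preference_score` is an assumption for illustration, and the indicator scores are assumed to be already normalized to the range 0 to 1.

```python
# Weights from the re-normalization step, in the order: playback duration,
# playback duration / video duration, comments, download, collection, share.
weights = [0.05782808, 0.15434974, 0.08424087, 0.17999077, 0.18568877, 0.33790177]

# Example indicator scores for one user and one video (already in [0, 1]).
scores = [0.9, 0.8, 1, 0, 0, 0]


def preference_score(weights, scores):
    """Weighted sum of indicator scores (hypothetical helper name)."""
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    return sum(w * x for w, x in zip(weights, scores))


rating = preference_score(weights, scores)
print(round(rating, 4))  # 0.2598
```

Because the weights sum to 1, the weighted sum and the weighted average coincide here, and the rating stays between 0 and 1.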
Consistency check
In theory, if A is a perfectly consistent pairwise comparison matrix, then a_ij * a_jk = a_ik for all 1 <= i, j, k <= n.
In practice, however, it is impossible to satisfy all of these equations when constructing a comparison matrix. We therefore only require the comparison matrix to be reasonably consistent, that is, we tolerate a certain degree of inconsistency.
Analysis shows that the maximum eigenvalue of a perfectly consistent pairwise comparison matrix equals the dimension of the matrix. The consistency requirement therefore becomes a requirement that the maximum eigenvalue not differ much from the dimension of the matrix.
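Python code implementation
import numpy as np
import numpy.linalg as nplg

# Load the pairwise comparison matrix
da = np.loadtxt("data.csv")
# Normalize each column by its column sum
col_sum = np.sum(da, axis=0)
col_arv = da / col_sum
# Sum each row, then normalize again to obtain the weights
w = np.sum(col_arv, axis=1)
w_n = w / np.sum(w)
print(w_n)
# Maximum eigenvalue, used for the consistency check
print(np.max(nplg.eig(da)[0]))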
The output is:
[0.05782808 0.15434974 0.08424087 0.17999077 0.18568877 0.33790177]
(6.16381602081+0j)
The first line is the weight vector and the second line is the maximum eigenvalue. The dimension of the matrix is 6, which is close to the maximum eigenvalue of about 6.16, so the comparison matrix is reasonably consistent.
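The article judges "close enough" by eye. A common way to make this precise, which is not part of the original code, is Saaty's consistency index CI = (lambda_max - n) / (n - 1) and consistency ratio CR = CI / RI, where RI is a tabulated random index; CR below 0.1 is conventionally treated as acceptable. The sketch below assumes the commonly quoted RI values, and the function name `consistency_ratio` is hypothetical.

```python
# Commonly quoted random index (RI) values from Saaty's table, keyed by
# matrix dimension. These are an assumption added here, not from the article.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def consistency_ratio(lambda_max, n):
    """Return CR = CI / RI for an n x n pairwise comparison matrix."""
    ri = RANDOM_INDEX[n]
    if ri == 0.0:
        # 1 x 1 and 2 x 2 reciprocal matrices are always consistent.
        return 0.0
    ci = (lambda_max - n) / (n - 1)
    return ci / ri


# Maximum eigenvalue printed by the code above (real part only).
cr = consistency_ratio(6.16381602081, 6)
print(round(cr, 4))  # 0.0264
print(cr < 0.1)      # True
```

With CR of roughly 0.026, well under 0.1, the comparison matrix in this article passes the conventional check.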
PS: If you know of better ways to analyze user behavior and quantify users' product preferences, you are welcome to share them!
Link to this article: http://blog.csdn.net/lingerlanlan/article/details/41917319
Author: linger