Recommender System - Leveraging User Behavior Data

Description of user behavior data: user behavior divides into two kinds, explicit feedback and implicit feedback. Explicit feedback mainly consists of ratings and likes/dislikes; YouTube originally used a five-point rating system, but found that users only bothered to rate when they were either very dissatisfied or very satisfied, so it switched to a binary like/dislike system. Implicit feedback is mainly page-browsing behavior.

Analysis of user behavior: most quantities in user behavior data follow a long-tail distribution, i.e. an item's frequency is inversely proportional to its rank on the popularity chart. Reflected in online behavior: newer users tend to choose popular items, while long-time users tend more toward unpopular, long-tail items.

Recommendation algorithms based on user behavior are collectively called collaborative filtering. They include neighborhood-based algorithms, the latent factor model, and graph-based random-walk algorithms; neighborhood-based algorithms further divide into user-based and item-based. When evaluating these algorithms we commonly use precision/recall, coverage (the proportion of the item catalog that appears in the final recommendation lists; if every item is recommended to at least one user, coverage is 100%), and popularity (the more popular the recommended items, the less novel the recommendations).

Neighborhood-based algorithms. User-based collaborative filtering (UserCF): (1) find the set of users whose interests are most similar to the target user's; (2) recommend to the target user the items that users in this set like and that the target user has not seen.
Calculation formulas: let N(u) be the set of items user u has shown interest in. The Jaccard similarity between users u and v is

    w_{uv} = \frac{|N(u) \cap N(v)|}{|N(u) \cup N(v)|}

and the cosine similarity is

    w_{uv} = \frac{|N(u) \cap N(v)|}{\sqrt{|N(u)|\,|N(v)|}}

Computing this directly over all user pairs wastes resources: most pairs have an empty intersection, yet we would still spend time on their denominators. So we build an item-to-user inverted list and scan the user list of each item; for every pair of users u, v appearing in the same list we add 1 to C[u][v], where C is our user-user co-occurrence matrix.

Improvement: two users acting on the same popular item does not say much about the similarity of their interests; only the same attitude toward unpopular items is strong evidence that two users' interests are alike. So we give each item a weight inversely related to its popularity:

    w_{uv} = \frac{\sum_{i \in N(u) \cap N(v)} \frac{1}{\log(1 + |N(i)|)}}{\sqrt{|N(u)|\,|N(v)|}}

where N(i) is the set of users who like item i.
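As a minimal sketch of this computation (my own MATLAB, not from the original text; R and the toy data are assumptions), note that the matrix product R*R' realizes exactly the pairwise counts C[u][v] that the inverted-list scan accumulates:

    % Toy data: 3 users x 4 items; R(u,i) = 1 means user u acted on item i.
    R  = sparse([1 1 0 0; 1 0 1 0; 0 1 1 1]);
    nI = size(R, 2);

    C   = R * R';                        % C(u,v) = |N(u) ∩ N(v)|
    deg = full(sum(R, 2));               % deg(u) = |N(u)|
    W   = full(C) ./ sqrt(deg * deg');   % cosine similarity w_uv
    W(logical(eye(size(W)))) = 0;        % drop self-similarity

    % Popularity-weighted variant: item i contributes 1/log(1+|N(i)|) instead of 1.
    iuf  = 1 ./ log(1 + full(sum(R, 1)));                           % 1 x nItems
    Wiuf = full(R * spdiags(iuf', 0, nI, nI) * R') ./ sqrt(deg * deg');
    Wiuf(logical(eye(size(Wiuf)))) = 0;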
Item-based collaborative filtering (ItemCF): ItemCF is currently the most widely used of these algorithms, adopted by sites such as Amazon and YouTube, and it can use the user's own historical behavior to explain the recommendation results. Steps: (1) compute the similarity between items; (2) generate a recommendation list for the user from the item similarities and the user's historical behavior. Analogously to the method above, to avoid wasting computation we first build a user-to-item inverted list, from which we obtain the item-item co-occurrence table. We also damp the interference of very active users on this table (a single such user might otherwise add 1 to a huge number of its entries) by limiting each user's contribution, e.g. weighting it by 1/log(1 + |N(u)|).
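A corresponding ItemCF sketch (again my own MATLAB with hypothetical names), including the damping of very active users described above:

    % R as above (users x items). Each user's contribution to an item pair is
    % damped by 1/log(1 + |N(u)|) so that hyper-active users don't dominate.
    nU   = size(R, 1);
    actW = 1 ./ log(1 + full(sum(R, 2)));          % per-user damping weight
    C    = R' * spdiags(actW, 0, nU, nU) * R;      % damped item-item co-occurrence
    nItm = full(sum(R, 1))';                       % nItm(i) = |N(i)|
    W    = full(C) ./ sqrt(nItm * nItm');          % item-item similarity
    W(logical(eye(size(W)))) = 0;

    % Recommend by similarity to the user's historical items, masking seen ones.
    scores = full(R) * W;
    scores(logical(R)) = -inf;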
Comparison: UserCF is often used for news recommendation, ItemCF for shopping and video sites. UserCF's recommendations reflect the hot spots of the small group of users with interests similar to the target user's, so they are more social and relatively coarse-grained. That makes it particularly suitable for news, a domain with little fine-grained personalization where items are numerous and updated very quickly; a user-user table is also much easier to maintain there than an item-item table, and timeliness and popularity are the focus of personalized news recommendation. ItemCF pays more attention to fine-grained personalization and to maintaining the user's historical interests, helping users find items related to their own fields, so book, e-commerce, and movie sites lean toward ItemCF. It must maintain an item-item table, but item similarities are relatively stable, and as soon as a new user acts on even one item, related items can be recommended to him.
Latent factor model (LFM): the previous approaches are purely statistical, while the latent factor model optimizes a learned objective. It first finds the user's interest classes and then selects his favorite items from those classes. LFM computes user u's interest in item i as

    \hat{r}_{ui} = \sum_{k=1}^{F} p_{u,k} \, q_{i,k}
Here p_{u,k} measures user u's affinity for latent class k, and q_{i,k} measures the relationship between item i and latent class k. Because this is machine learning, we need negative samples as well, which we obtain by randomly sampling items the user has not interacted with. The final objective to optimize is

    C = \sum_{(u,i) \in K} \left( r_{ui} - \sum_{k=1}^{F} p_{u,k} q_{i,k} \right)^2 + \lambda \lVert p_u \rVert^2 + \lambda \lVert q_i \rVert^2
We minimize this with the simplest stochastic gradient descent, where K is our training set. A plain LFM cannot adjust its results to real-time changes in user behavior; to solve this we add an extra part to \hat{r}_{ui}:

    \hat{r}_{ui} = x_u^{T} y_i + p_u^{T} q_i
Here x_u is user u's historical interest vector, built from the user's history, and y_i is built from the item's content attributes. This way, as soon as a news item appears, we can already decide to whom to recommend it.
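Going back to the basic objective, here is a rough sketch of the SGD training loop (my own MATLAB; nUsers, nItems, and the sample matrix K are assumed given, and F, alpha, and lambda are illustrative choices):

    % K: nSamples x 3 matrix of (u, i, r) triples, with r = 1 for positive
    % examples and r = 0 for the randomly sampled negative examples.
    F = 10; alpha = 0.02; lambda = 0.01; nEpochs = 30;
    P = 0.1 * rand(nUsers, F);        % P(u,k): user u's weight on latent class k
    Q = 0.1 * rand(nItems, F);        % Q(i,k): item i's weight on latent class k
    for epoch = 1:nEpochs
        for s = randperm(size(K, 1))              % visit samples in random order
            u = K(s,1); i = K(s,2); r = K(s,3);
            err = r - P(u,:) * Q(i,:)';           % prediction error on this sample
            pu  = P(u,:);                         % keep old value for Q's update
            P(u,:) = P(u,:) + alpha * (err * Q(i,:) - lambda * P(u,:));
            Q(i,:) = Q(i,:) + alpha * (err * pu     - lambda * Q(i,:));
        end
    end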
Comparison of LFM with neighborhood-based recommendation: a neighborhood-based approach must maintain a correlation table offline; when users or items are numerous this takes O(n*n) space, whereas LFM with F latent classes takes O(F*(n+m)). When a user shows a new interest, ItemCF can update his recommendation list in real time, but LFM must compute the user's interest weight for every item and then rank them, so it is better suited to systems with fewer items. Graph-based model: a random walk over the user-item graph can be used to rank items for a user, but the time complexity of this method is very high, since it needs many iterations to converge.
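A common concrete form of that random walk is iteration with restart; here is a sketch (my own MATLAB; M, r0, and alpha are assumed inputs):

    % M: column-stochastic transition matrix of the user-item bipartite graph
    %    (each column sums to 1); r0: one-hot restart vector for the target user;
    % alpha: probability of continuing the walk rather than restarting.
    r = r0;
    for it = 1:200
        rNew = (1 - alpha) * r0 + alpha * (M * r);
        if norm(rNew - r, 1) < 1e-8, break; end   % converged
        r = rNew;
    end
    % Entries of r at item nodes give the ranking used for recommendation.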
Appendix: the application of SVD decomposition in recommender systems

Reference: http://www.igvita.com/2007/01/15/svd-recommendation-system-in-ruby/

In fact, calling it a "reference" is not quite accurate; this appendix is half translation of that article and half my own study notes.

After working through it carefully, I feel I gained a great deal.

Relevant linear algebra background:

Any M×N matrix A (M rows × N columns, M > N) can be written as the product of three matrices:

1. U: an M×M orthogonal matrix

2. S: an M×N diagonal matrix with non-negative entries

3. V: an N×N orthogonal matrix

i.e. A = U*S*V' (note that it is the transpose of V that appears)

Intuitively, say:

Suppose we have a matrix in which each column represents a user, and each row represents an item.

For example, Ben, Tom, ... are the users, and Season N are the items.

The matrix values are the ratings (0 means not rated):

For example, Ben rated Season 1 a 5, Tom rated Season 1 a 5, and Tom did not rate Season 2.

Machine learning and information retrieval:

One of the most fundamental and interesting properties of machine learning is its close relationship to the concept of data compression.

If we can extract some meaningful structure from the data, we can represent the data with fewer bits.

From the point of view of information theory, data that contains correlations is compressible.

SVD can be used to compress a large matrix by reducing its dimensionality in a lossy manner.

Dimensionality reduction:

Below we demonstrate the process of SVD concretely with an example.

First, the rating matrix (rows are Season 1-6, columns are Ben, Tom, John, Fred):

A =
     5     5     0     5
     5     0     3     4
     3     4     0     3
     0     0     5     3
     5     4     4     5
     5     4     5     5

Invoke the SVD function in MATLAB:

[U, S, V] = svd(A)

U =
   -0.4472   -0.5373   -0.0064   -0.5037   -0.3857   -0.3298
   -0.3586    0.2461    0.8622   -0.1458    0.0780    0.2002
   -0.2925   -0.4033   -0.2275   -0.1038    0.4360    0.7065
   -0.2078    0.6700   -0.3951   -0.5888    0.0260    0.0667
   -0.5099    0.0597   -0.1097    0.2869    0.5946   -0.5371
   -0.5316    0.1887   -0.1914    0.5341   -0.5485    0.2429

S =
   17.7139         0         0         0
         0    6.3917         0         0
         0         0    3.0980         0
         0         0         0    1.3290
         0         0         0         0
         0         0         0         0

V =
   -0.5710   -0.2228    0.6749    0.4109
   -0.4275   -0.5172   -0.6929    0.2637
   -0.3846    0.8246   -0.2532    0.3286
   -0.5859    0.0532    0.0140   -0.8085

(MATLAB returns V itself, not its transpose, so A = U*S*V'.)

After decomposing the matrix, we first need to understand the meaning of S.

You can see that S is special: it is a diagonal matrix.

Each element is non-negative and decreases along the diagonal. Understanding these values precisely requires the linear algebra of eigenvectors and eigenvalues.

But it can be broadly understood as follows:

In a linear space, each vector represents a direction, and each diagonal value of S can be understood as the weight, or importance, of the direction given by its corresponding singular vector.

So we can keep just the first k elements of the diagonal of S (the largest ones) and discard the rest.

When k = 2, the 6×4 matrix S is truncated to a 2×2 matrix; correspondingly, U (6×6) becomes U (6×2) and V (4×4) becomes V (4×2).

(The signs of some elements of U, S, V computed in my own MATLAB differ from those in the referenced article; the singular vectors are only determined up to sign, so the essence is the same.)

We then multiply the reduced U, S, and V to get A2:

A2 = U(1:6,1:2) * S(1:2,1:2) * (V(1:4,1:2))'   % MATLAB statement
A2 =
    5.2885    5.1627    0.2149    4.4591
    3.2768    1.9021    3.7400    3.8058
    3.5324    3.5479   -0.1332    2.8984
    1.1475   -0.6417    4.9472    2.3846
    5.0727    3.6640    3.7887    5.3130
    5.1086    3.4019    4.6166    5.5822

We can see directly that A2 is very close to A; this is what was meant earlier by dimensionality reduction being a lossy compression of the data.
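One way to quantify "very close" (my own check, not in the original): the relative Frobenius error of the rank-2 reconstruction, which depends only on the two discarded singular values.

    relErr = norm(A - A2, 'fro') / norm(A, 'fro')
    % = sqrt(3.0980^2 + 1.3290^2) / ||A||_F, about 0.18 for this example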

Next we begin to analyze the correlation of the data in the matrix.

We treat the first column of U as x values and the second column as y values, so each row of U becomes a two-dimensional vector (a point in the plane); likewise, each row of V becomes a two-dimensional vector.
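The (omitted) plot can be reproduced with something like the following MATLAB sketch, with the season/user labels taken from the example:

    plot(U(1:6,1), U(1:6,2), 'bo'); hold on;          % items (Season 1-6)
    plot(V(1:4,1), V(1:4,2), 'rs');                   % users
    text(U(1:6,1), U(1:6,2), {'S1','S2','S3','S4','S5','S6'});
    text(V(1:4,1), V(1:4,2), {'Ben','Tom','John','Fred'});
    hold off;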

Plotting each row of U and each row of V as a point in the plane (original figure omitted), it can be seen that:

Season 5 and Season 6 are very close, and Ben and Fred are also very close.

Looking back at the matrix A confirms this: the 5th row vector of A is particularly similar to the 6th row vector, and Ben's column vector is very similar to Fred's column vector.

So intuitively, the U and V matrices together approximately represent the A matrix; in other words, A has been compressed into U and V, with the compression ratio determined by k, the number of singular values kept from S.

That completes the first half.

Finding similar users:

We continue with the same example:

Suppose a new user named Bob arrives, whose ratings of the seasons are known: [5 5 0 0 0 5]. (This is a column vector.)

Our task is to make a personalized recommendation to him.

Our first idea is to use the new user's rating vector to find users similar to him.

We project Bob into the two-dimensional concept space. Skipping the derivation of the formula, the conclusion is

    Bob2d = Bob' * U2 * S2^{-1}

(note that Bob', the transpose of Bob, is a row vector), which yields a two-dimensional vector: Bob's coordinates.
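In MATLAB this projection is one line (a sketch; Bob2d is my own name):

    Bob   = [5 5 0 0 0 5]';                    % Bob's rating column vector
    Bob2d = (Bob' * U(:,1:2)) / S(1:2,1:2)     % 1x2 row vector: Bob's coordinates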

Add Bob's coordinates to the earlier plot:

Then find the user who is most similar to Bob.

Note that "most similar" does not mean the nearest user; here similarity is computed by cosine similarity. (There are many ways to compute similarity, each with its own advantages and disadvantages.)

That is, the user whose coordinates form the smallest angle with Bob's.

You can calculate that the most similar user is Ben.
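Computed explicitly (a sketch; the rows of V(1:4,1:2) are the users' 2-D coordinates):

    U2d  = V(1:4, 1:2);                                              % Ben, Tom, John, Fred
    sims = (U2d * Bob2d') ./ (sqrt(sum(U2d.^2, 2)) * norm(Bob2d));   % cosine similarities
    [~, best] = max(sims)                                            % best == 1, i.e. Ben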

What recommendation strategy to use next is entirely a matter of choice.

Here's a very simple recommendation strategy:

Find the most similar user, that is, Ben.

Look at Ben's rating vector: [5 5 3 0 5 5].

Compare it with Bob's rating vector: [5 5 0 0 0 5].

Then find the items that Ben has rated but Bob has not, and sort them by Ben's rating: Season 5 (5), Season 3 (3).

So we recommend Season 5 first and then Season 3 to Bob.

Finally, some improvements to the overall recommendation approach:

1. SVD itself is a computation with high time complexity, which may be unbearable when the amount of data is large. However, machine-learning methods such as gradient descent can be used to approximate the decomposition and reduce the time cost.

2. Similarity calculation: there are many similarity measures, each with its own advantages and disadvantages; choose the one best suited to the scenario at hand.

3. Recommendation strategy: first of all, there can be multiple similar users, each influencing the predicted item ratings jointly, with its similarity as the weight, as sketched below.
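A sketch of this weighted variant (my own MATLAB, reusing sims, A, and Bob from the snippets above):

    k = 2;                                                  % number of neighbours to use
    [simsTop, order] = sort(sims, 'descend');
    nb   = order(1:k);                                      % the k most similar users
    pred = (A(:, nb) * simsTop(1:k)) / sum(simsTop(1:k));   % weighted average rating
    pred(Bob > 0) = -inf;                                   % skip items Bob already rated
    [~, recOrder] = sort(pred, 'descend')                   % items to recommend, best first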
