http://blog.csdn.net/pipisorry/article/details/44850971
Machine Learning - Andrew Ng Course Study Notes
Recommender Systems
{An important application of machine learning}
Problem Formulation
Note:
1. We allow ratings of 0 to 5 stars, because this makes some of the math come out nicer.
2. In the example, there are roughly 3 romance/romantic-comedy movies and 2 action movies.
3. Look through the data at all the missing movie ratings, and try to predict what the values of the question marks should be (see the sketch below).
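To make the setup concrete, here is a minimal NumPy sketch; the titles and ratings follow the lecture's toy example as best I can reconstruct it, and Y, R, n_m, n_u are the course's usual notation:

```python
import numpy as np

# Ratings matrix Y: rows are movies, columns are users
# (Alice, Bob, Carol, Dave). np.nan marks a "?", i.e. a missing rating.
Y = np.array([
    [5,      5,      0,      0],        # Love at last
    [5,      np.nan, np.nan, 0],        # Romance forever
    [np.nan, 4,      0,      np.nan],   # Cute puppies of love
    [0,      0,      5,      4],        # Nonstop car chases
    [0,      0,      5,      np.nan],   # Swords vs. karate
])

R = ~np.isnan(Y)        # R[i, j] is True iff user j has rated movie i
n_m, n_u = Y.shape      # number of movies, number of users
```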
Content-Based Recommendations
Note:
1. Add an extra intercept feature x0, which is equal to 1.
2. Let n be the number of features, not counting the x0 intercept term; here n = 2 because we have features x1 and x2.
3. To make predictions, we can treat predicting each user's ratings as a separate linear regression problem. Specifically, for each user j we learn a parameter vector theta(j) in R^(n+1), where n is the number of features, and we predict user j's rating of movie i as the inner product between the parameter vector theta(j) and the features x(i).
4. Suppose we had somehow already obtained a parameter vector theta(1) for Alice. {Linear regression: each film Alice has rated is a training example, e.g. x = [1, 0.9, 0] (with intercept), y = 5, and gradient descent finds theta. A sketch of the prediction step follows.}
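A minimal sketch of the prediction step; the feature vector and Alice's theta are the lecture's illustrative values as I recall them, the variable names are mine:

```python
import numpy as np

# Features of "Cute puppies of love": [intercept, x1 = romance, x2 = action].
x = np.array([1.0, 0.99, 0.0])

# Alice's (already-learned) parameter vector theta(1).
theta_1 = np.array([0.0, 5.0, 0.0])

# Predicted rating: the inner product (theta(1))^T x(i).
print(theta_1 @ x)   # 4.95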
Optimization algorithm: estimating the parameter vector theta(j)
Note:
1. To simplify the subsequent math, get rid of the term m(j); it's just a constant.
2. The regularization term regularizes the values of theta(j)_k for k not equal to zero; we don't regularize theta_0.
3. You can also plug the cost function and gradients into a more advanced optimization algorithm such as conjugate gradient or L-BFGS to minimize J.
4. This is called content-based recommendation because we assume we have features for the different movies that capture the content of those movies (how romantic is this movie? how much action?), and we use these content features to make our predictions. (A sketch of the objective follows this list.)
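Here is a sketch of the regularized objective described above, summed over all users; the function name and the masking trick are my own, and column 0 plays the role of the x0 = 1 intercept:

```python
import numpy as np

def content_based_cost(Theta, X, Y, R, lam):
    """Regularized linear-regression cost, summed over all users.

    Theta: (n_u, n+1) user parameter vectors (column 0 multiplies x0 = 1).
    X:     (n_m, n+1) known movie features (column 0 is all ones).
    Y, R:  (n_m, n_u) ratings and rated-indicator.
    """
    E = (X @ Theta.T - np.where(R, Y, 0.0)) * R   # errors on rated entries only
    J = 0.5 * np.sum(E ** 2)
    J += (lam / 2.0) * np.sum(Theta[:, 1:] ** 2)  # don't regularize theta_0
    return J
```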
Collaborative Filtering
{CF has an interesting property: feature learning, i.e. it can start to learn for itself what features to use}
Note: We do not know the values of the movie features. But suppose we have gone to each of our users, and each user has told us how much they like romantic movies and how much they like action-packed movies; that is, each user j tells us the value of theta(j) directly. Given those theta's, we can infer each movie's features x(i) (see the sketch below).
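A sketch of that reversed problem: if every user hands us their theta(j), each movie's feature vector x(i) is just another regularized least-squares fit. This plain gradient-descent version (names and hyperparameters mine) is only an illustration:

```python
import numpy as np

def learn_movie_features(Theta, y_i, rated, n, lam, alpha=0.01, iters=500):
    """Estimate one movie's feature vector x(i), given every user's theta.

    Theta: (n_u, n) user parameters, y_i: (n_u,) that movie's ratings,
    rated: (n_u,) boolean mask of which users rated the movie.
    """
    x = 0.01 * np.random.randn(n)
    y0 = np.where(rated, y_i, 0.0)
    for _ in range(iters):
        err = (Theta @ x - y0) * rated        # errors on rated entries only
        x -= alpha * (Theta.T @ err + lam * x)
    return x
```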
Optimization algorithm
Note:
1. This is a chicken-and-egg problem: we need the theta's to learn the features, and the features to learn the theta's. So randomly guess some values for the theta's; based on that initial random guess, you can use the procedure above to learn features for the different movies, then re-estimate the theta's, and keep iterating (see the sketch after this list).
2. By rating a few movies myself, I help the system learn better features, and those features can then be used by the system to make better movie predictions for everyone else. There is a sense of collaboration, where every user helps the system learn better features for the common good. That is collaborative filtering.
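A sketch of that back-and-forth procedure (random theta's, then learn x's, then re-learn theta's, and so on); this is a simplified gradient-descent version with my own names, not the course's Octave code:

```python
import numpy as np

def alternate(Y, R, n=2, lam=0.1, rounds=10, inner=200, alpha=0.005):
    """Random theta's -> learn x's -> re-learn theta's -> ... until it settles."""
    n_m, n_u = Y.shape
    Y0 = np.where(R, Y, 0.0)                  # zero out the missing entries
    X = 0.1 * np.random.randn(n_m, n)
    Theta = 0.1 * np.random.randn(n_u, n)     # the initial random guess
    for _ in range(rounds):
        for _ in range(inner):                # hold X fixed, improve Theta
            E = (X @ Theta.T - Y0) * R
            Theta -= alpha * (E.T @ X + lam * Theta)
        for _ in range(inner):                # hold Theta fixed, improve X
            E = (X @ Theta.T - Y0) * R
            X -= alpha * (E @ Theta + lam * X)
    return X, Theta
```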
Collaborative Filtering Algorithm
{A more efficient algorithm that doesn't need to go back and forth between the x's and the theta's, but instead solves for theta and x simultaneously}
Collaborative Filtering Optimization Objective
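The combined objective (the original post shows it only as an image; reconstructed here from the course definitions):

```latex
J\big(x^{(1)},\dots,x^{(n_m)},\,\theta^{(1)},\dots,\theta^{(n_u)}\big)
  = \tfrac{1}{2}\sum_{(i,j):\,r(i,j)=1}\big((\theta^{(j)})^{T}x^{(i)}-y^{(i,j)}\big)^{2}
  + \tfrac{\lambda}{2}\sum_{i=1}^{n_m}\sum_{k=1}^{n}\big(x_{k}^{(i)}\big)^{2}
  + \tfrac{\lambda}{2}\sum_{j=1}^{n_u}\sum_{k=1}^{n}\big(\theta_{k}^{(j)}\big)^{2}
```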
Note:
1. The sum over j says: for every user, sum over all the movies rated by that user. The sum over i says: for every movie, sum over all the users j who have rated that movie.
2. Both are just sums over all the (user, movie) pairs for which a rating exists, i.e. all (i, j) with r(i, j) = 1.
3. If you hold the x's constant and minimize with respect to the theta's, you'd be solving exactly the first problem; hold the theta's constant instead, and you get the second.
4. Previously we used the convention of a feature x0 = 1 corresponding to an intercept. When we use this formalism where we actually learn the features, we do away with x0, so the features x we learn will be in R^n.
Collaborative filtering algorithm
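A compact sketch of the full algorithm: initialize the x's and theta's to small random values, minimize J over both simultaneously by gradient descent, then predict with theta^T x. This is a NumPy illustration under those assumptions, not the course's Octave code:

```python
import numpy as np

def cofi_cost_and_grads(X, Theta, Y0, R, lam):
    """Joint cost J(x..., theta...) and its gradients w.r.t. X and Theta."""
    E = (X @ Theta.T - Y0) * R               # errors on rated entries only
    J = 0.5 * np.sum(E ** 2) + (lam / 2) * (np.sum(X ** 2) + np.sum(Theta ** 2))
    X_grad = E @ Theta + lam * X
    Theta_grad = E.T @ X + lam * Theta
    return J, X_grad, Theta_grad

def train(Y, R, n=10, lam=1.0, alpha=0.002, iters=2000):
    n_m, n_u = Y.shape
    Y0 = np.where(R, Y, 0.0)
    # 1. Initialize X and Theta to small random values (breaks symmetry).
    X = 0.1 * np.random.randn(n_m, n)
    Theta = 0.1 * np.random.randn(n_u, n)
    # 2. Minimize J(X, Theta) by gradient descent over both at once.
    for _ in range(iters):
        _, X_grad, Theta_grad = cofi_cost_and_grads(X, Theta, Y0, R, lam)
        X -= alpha * X_grad
        Theta -= alpha * Theta_grad
    return X, Theta

# 3. For user j and movie i, predict the rating as theta(j)^T x(i),
#    i.e. the (i, j) entry of X @ Theta.T.
```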
Vectorization: Low-Rank Matrix Factorization
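In matrix form, every prediction at once is the low-rank product X Theta^T, and "movies related to movie i" are those with small ||x(i) - x(l)|| in the learned feature space. A small sketch (the helper name is mine):

```python
import numpy as np

# Predicted ratings for every (movie, user) pair at once:
# entry (i, j) of X @ Theta.T is (theta(j))^T x(i).
# predictions = X @ Theta.T

def most_related_movies(X, i, k=5):
    """The k movies whose learned features are closest to movie i's,
    i.e. with the smallest ||x(i) - x(l)||."""
    dists = np.linalg.norm(X - X[i], axis=1)
    dists[i] = np.inf                 # exclude movie i itself
    return np.argsort(dists)[:k]
```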
Implementational Detail: Mean Normalization
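A sketch of mean normalization: subtract each movie's mean rating (over its rated entries only) before training, and add the mean back when predicting, so a brand-new user is predicted the per-movie averages. A NumPy version with my own names:

```python
import numpy as np

def mean_normalize(Y, R):
    """Subtract each movie's mean rating, computed over rated entries only."""
    Y0 = np.where(R, Y, 0.0)
    counts = np.maximum(R.sum(axis=1), 1)      # avoid dividing by zero
    mu = Y0.sum(axis=1) / counts               # per-movie mean ratings
    Y_norm = np.where(R, Y0 - mu[:, None], 0.0)
    return Y_norm, mu

# Train collaborative filtering on Y_norm instead of Y, then predict
#   rating(i, j) = (theta(j))^T x(i) + mu[i].
# A brand-new user (no ratings) has theta(j) driven toward 0 by the
# regularizer, so their predictions fall back to the movie means mu[i].
```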
from:http://blog.csdn.net/pipisorry/article/details/44850971
Machine Learning - XVI. Recommender Systems (Week 9)