In the previous article on recommender systems (predicting movie ratings), I raised the following points:
==================================1. What do Theta and X mean?==================================
First, the models. Both model1 and model2 below simplify how a user thinks.
(1) model1: given X = (romance, action), use regression to optimize Theta.
The model assigns a meaning (movie genre) to X, so we can read Theta as having a meaning too (how much the user likes romance or action).
(2) model2: given Theta = (0, 5, 0), use regression to optimize X.
This model assigns a meaning to Theta (a liking for romance or action), so X inherits a meaning as well (how much romance or action a movie contains).
What is wrong with model1/model2? They treat the user as too simple.
Will a user who likes action movies give every action movie a high score?
Not necessarily. Taste is more than genre; a rating also depends on the actors and the director.
The problem is that the model of the user (not the regression model) has too simple a structure. Of course, modeling a problem can proceed from simple to accurate, which is a sound scientific approach.
A complex model may be hard to solve for now (no solution method is known, the hardware cannot keep up, and so on), so we study simple models first.
To summarize model1 and model2:
One parameter (Theta or X) is given and the other (X or Theta) is optimized with a regression model; the meaning assigned to the given parameter can then be used to interpret the fitted one.
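As a rough illustration of model1, here is a minimal Python sketch. Python is used only for illustration, and the toy features, the ratings, and the helper name `fit_theta` are all made up rather than taken from the course code. X is given with a genre meaning, and one user's Theta is fitted by least squares with plain gradient descent, without an intercept term.

```python
# Hypothetical toy data: each movie is described by (romance, action) scores.
X = [(0.9, 0.0), (1.0, 0.1), (0.1, 1.0), (0.0, 0.9)]
y = [5.0, 5.0, 0.0, 0.0]   # one user's ratings of those four movies


def fit_theta(X, y, lr=0.1, steps=2000):
    """Least-squares fit of theta (no intercept) by batch gradient descent."""
    theta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for x, target in zip(X, y):
            # Prediction error for this movie under the current theta.
            err = sum(t * xi for t, xi in zip(theta, x)) - target
            for k, xi in enumerate(x):
                grad[k] += err * xi
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


theta = fit_theta(X, y)
print(theta)   # first component (romance) comes out much larger than the second
```

Because X was labelled (romance, action), the fitted theta can be read the same way: a large first component suggests a user who likes romance.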
(3) model3: collaborative filtering, which optimizes X and Theta at the same time.
This is different from the "linear regression" we learned in machine learning.
The setup there resembles model1/model2: (Y, X) or (Y, Theta) is given.
Here only Y is given and both X and Theta must be found, which is the first time we have seen this.
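For reference, the objective that model3 minimizes can be written as follows. This is the regularized cost as I understand it from Ng's course, in the course's notation, which this post does not spell out: r(i,j) = 1 when user j has rated movie i, n_m and n_u are the numbers of movies and users, n is the number of features, and λ is the regularization strength.

```latex
\[
J\bigl(x^{(1)},\dots,x^{(n_m)},\theta^{(1)},\dots,\theta^{(n_u)}\bigr)
= \frac{1}{2}\sum_{(i,j):\,r(i,j)=1}
  \Bigl(\bigl(\theta^{(j)}\bigr)^{T} x^{(i)} - y^{(i,j)}\Bigr)^{2}
+ \frac{\lambda}{2}\sum_{i=1}^{n_m}\sum_{k=1}^{n}\bigl(x^{(i)}_{k}\bigr)^{2}
+ \frac{\lambda}{2}\sum_{j=1}^{n_u}\sum_{k=1}^{n}\bigl(\theta^{(j)}_{k}\bigr)^{2}
\]
```

With X held fixed this is an ordinary regularized least-squares problem in Theta (model1), and with Theta held fixed it is one in X (model2); model3 treats both as unknowns.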
From the optimization above, it is not obvious whether J(X, Theta) is convex. Is the result of Ng's optimization the global optimum?
This question should have a clear answer, but I have not read much about recommender systems (there are not many excellent reference books). At a glance, my guess is that it is the global optimum.
I will come back to this question when I study recommender systems in more depth.
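On the convexity question, a one-feature toy case at least suggests that J is not jointly convex in (X, Theta). The sketch below is Python, and its `J` is a hypothetical stripped-down cost (one movie, one user, one feature, no regularization), not the course's cofiCostFunc.

```python
def J(x, theta, y=1.0):
    """Squared error of a one-movie, one-user, one-feature model."""
    return (x * theta - y) ** 2


a = (1.0, 1.0)     # fits y exactly
b = (-1.0, -1.0)   # also fits y exactly
mid = tuple((p + q) / 2 for p, q in zip(a, b))   # midpoint (0.0, 0.0)

# A convex function would satisfy J(mid) <= (J(a) + J(b)) / 2.
print(J(*a), J(*b), J(*mid))   # 0.0 0.0 1.0
```

The midpoint of two perfect fits is a worse fit, so this toy J is not convex. That only means convexity cannot be used to guarantee a global optimum; it does not by itself say whether fmincg reaches one in practice.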
Now assume model3 has found the correct solution. What do X and Theta mean? Here the dimension of X/Theta is chosen by us; the code uses num_features = 10.
I think they have no particular meaning.
Or rather, what do we mean by "meaning"? Why must they be meaningful? Have we been spoiled by model1/model2, so that we feel we must find a meaning?
So, if the only goal is to "regress to Y", X and Theta need no meaning.
But if they have no meaning, what are our predictions for unrated items worth? Without the interpretation that model1/model2 provide, model3's predictions seem meaningless as well (meaning again ...).
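One way to see why the individual components of X and Theta need not carry a fixed meaning: different (X, Theta) pairs can give exactly the same predictions. The Python sketch below uses made-up numbers and a hypothetical helper `predict`; it swaps the two feature columns and rescales one of them, and the prediction matrix X * Theta' does not change.

```python
def predict(X, Theta):
    """Prediction matrix P = X * Theta' (movies x users)."""
    return [[sum(xk * tk for xk, tk in zip(x, t)) for t in Theta] for x in X]


X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]   # 3 movies, 2 features (made up)
Theta = [[2.0, 1.0], [-1.0, 4.0]]           # 2 users, 2 features (made up)

# Swap the two features, and scale the new first feature by 2 in X
# while dividing it by 2 in Theta.
X2 = [[2.0 * x[1], x[0]] for x in X]
Theta2 = [[t[1] / 2.0, t[0]] for t in Theta]

print(predict(X, Theta) == predict(X2, Theta2))   # True
```

Since "feature 1" and "feature 2" can be exchanged or rescaled without affecting any predicted rating, the predictions can be meaningful even when no single feature is.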
This reminds me of Yang Yi's reply to a young man: "You read too little and think too much."
I will not discuss these questions in depth here. You can read books or papers to learn about the current research.
==================================2. Should a constant component 1 be added to the X features?==================================
I wanted to run a numerical experiment. The code performs collaborative filtering, where X and Theta are both unknowns that start from random values, so there is no place for a hard-coded component 1 in the X features.
The code then normalizes by the row mean (each movie's mean rating), so the model1 case (given X = (romance, action), optimize Theta by regression) is the more interesting one to try.
... I first reached a wrong conclusion and only noticed when I checked it. Funny.
So the question can be made more general: in machine learning, why does "linear regression" add a component 1 to the X features? I found a clear answer and put it in the comments.
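One common way to see the role of the component 1 is to fit data that does not pass through the origin. The Python sketch below uses made-up data lying exactly on y = 2x + 3 and closed-form least squares for one feature; all variable names are illustrative.

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0, 9.0]   # exactly y = 2*x + 3

# Without the constant feature: y ~ theta1 * x, a line forced through the origin.
theta1_only = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# With the constant feature x0 = 1: y ~ theta0 * 1 + theta1 * x.
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
theta1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
theta0 = y_bar - theta1 * x_bar


def sse(predict):
    """Sum of squared errors of a prediction function on the toy data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))


print("no intercept:  ", theta1_only, sse(lambda x: theta1_only * x))
print("with intercept:", theta0, theta1, sse(lambda x: theta0 + theta1 * x))
```

Without the constant feature the fit cannot absorb the offset 3 and keeps a sizeable error; with it, theta0 takes the offset and the error vanishes.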
==================================3. Meaning of the X features==================================
This was already discussed in section 1; to summarize:
- model1/model2 assign a meaning to one parameter (not necessarily romance/action; it could be something else), so the other parameter acquires a meaning as well.
- In model3, X and Theta have no assigned meaning.
==================================4. Meaning of the terms==================================
"Content-based recommendation":
I have read Programming Collective Intelligence. Chapter 2 of that book, "Making Recommendations", is also about recommender systems. It discusses "user-based filtering" and "item-based filtering", whose names resemble this one, but they seem quite different. To be checked.
"Collaborative filtering":
This makes me think of the beer-and-diapers story, which seems different from the regression model here. I believe the beer-and-diapers result was derived with Bayesian methods; to be checked.
==================================5. Collaborative Filtering code==================================
%% Machine Learning Online Class
clear; clc

%% Part 1: Entering ratings for a new user
movieList = loadMovieList();

my_ratings = zeros(1682, 1);
my_ratings(1) = 4;
my_ratings(98) = 2;
my_ratings(7) = 3;
my_ratings(12) = 5;
my_ratings(54) = 4;
my_ratings(64) = 5;
my_ratings(66) = 3;
my_ratings(69) = 5;
my_ratings(183) = 4;
my_ratings(226) = 5;
my_ratings(355) = 5;

fprintf('\n\nNew user ratings:\n');
for i = 1:length(my_ratings)
    if my_ratings(i) > 0
        fprintf('Rated %d for %s\n', my_ratings(i), ...
                movieList{i});
    end
end

%% Part 2: Learning movie ratings
%  Now, you will train the collaborative filtering model on a movie rating
%  dataset of 1682 movies and 943 users
fprintf('\nTraining collaborative filtering...\n');

% Load data
load('ex8_movies.mat');

% The fraction of entries that are actually rated is fairly small
fprintf('------user score percentage: %f-------\n', sum(R(:)) / numel(R));

% Add our own ratings to the data matrix
Y = [my_ratings Y];
R = [(my_ratings ~= 0) R];

% Normalize ratings
[Ynorm, Ymean] = normalizeRatings(Y, R);

% Useful values
num_users = size(Y, 2);
num_movies = size(Y, 1);
num_features = 10;

% Set initial parameters (Theta, X)
X = randn(num_movies, num_features);
Theta = randn(num_users, num_features);
initial_parameters = [X(:); Theta(:)];

% Set options for fmincg
options = optimset('GradObj', 'on', 'MaxIter', 200);

% Set regularization
lambda = 10;

% The optimization step computes X and Theta to minimize cofiCostFunc.
% The original code is incorrect here: Y must be changed to Ynorm.
theta = fmincg(@(t)(cofiCostFunc(t, Ynorm, R, num_users, num_movies, ...
                                 num_features, lambda)), ...
               initial_parameters, options);

% Unfold the returned theta back into X and Theta
X = reshape(theta(1:num_movies*num_features), num_movies, num_features);
Theta = reshape(theta(num_movies*num_features+1:end), ...
                num_users, num_features);

fprintf('Recommender system learning completed.\n');

%% Part 3: Recommendation for you
%  After training the model, you can now make recommendations by computing
%  the predictions matrix.
p = X * Theta';
my_predictions = p(:,1) + Ymean;

movieList = loadMovieList();

[r, ix] = sort(my_predictions, 'descend');
% The code here is slightly incorrect: since these are recommendations,
% movies the user has already rated should be filtered out.
% The code below still includes them.
fprintf('\nTop recommendations for you:\n');
for i = 1:10
    j = ix(i);
    fprintf('Predicting rating %.1f for movie %s\n', my_predictions(j), ...
            movieList{j});
end

fprintf('\n\nOriginal ratings provided:\n');
% The following is a simple modification that adds a comparison
fprintf('----------movie------------------initial rating----------predicted rating-------\n');
for i = 1:length(my_ratings)
    if my_ratings(i) > 0
        fprintf('%-36s %d %5.3f\n', movieList{i}, my_ratings(i), my_predictions(i));
    end
end
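The normalization step in Part 2 relies on a helper whose source is not shown in this post. The following Python sketch shows what such a row-mean normalization is assumed to do; the function name `normalize_ratings` and the toy matrices are made up. Each movie's mean is computed over the rated entries only and subtracted from them.

```python
def normalize_ratings(Y, R):
    """Subtract each movie's mean rating, computed over rated entries only."""
    Ymean, Ynorm = [], []
    for y_row, r_row in zip(Y, R):
        rated = [y for y, r in zip(y_row, r_row) if r]
        mean = sum(rated) / len(rated) if rated else 0.0
        Ymean.append(mean)
        # Unrated entries stay at 0 so they carry no information.
        Ynorm.append([y - mean if r else 0.0 for y, r in zip(y_row, r_row)])
    return Ynorm, Ymean


Y = [[5, 4, 0], [0, 2, 4]]   # 2 movies x 3 users; 0 marks "not rated"
R = [[1, 1, 0], [0, 1, 1]]   # 1 where a rating exists
Ynorm, Ymean = normalize_ratings(Y, R)
print(Ymean)   # [4.5, 3.0]
print(Ynorm)   # [[0.5, -0.5, 0.0], [0.0, -1.0, 1.0]]
```

Adding Ymean back to the predictions, as Part 3 does, means the per-movie mean plays a role similar to an intercept: a user with no ratings is predicted each movie's average.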
Output:
Part 1: Entering ratings for a new user
Part 2: Learning movie ratings
Part 3: Recommendation for you
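The comment in Part 3 notes that movies the user has already rated should be filtered out of the recommendations. A minimal Python sketch of that filtering, with a hypothetical helper `top_unrated` and made-up numbers (indices are 0-based here, unlike in Octave):

```python
def top_unrated(my_predictions, my_ratings, n=10):
    """Indices of the n highest predictions among movies the user has not rated."""
    candidates = [i for i, r in enumerate(my_ratings) if r == 0]
    candidates.sort(key=lambda i: my_predictions[i], reverse=True)
    return candidates[:n]


my_ratings = [4, 0, 0, 5, 0]                 # made up: 0 means "not rated"
my_predictions = [4.2, 3.9, 4.8, 4.9, 2.1]   # made-up predicted ratings
print(top_unrated(my_predictions, my_ratings, n=2))   # [2, 1]
```

Movie 3 has the highest prediction but is skipped because the user already rated it.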