The previous article briefly introduced the concept of the Slope One algorithm. This article implements it in C#.
A recommender based on the Slope One algorithm requires the following data:
1. A group of users.
2. A group of items (articles, products, etc.).
3. Ratings that users give to some of the items to express their preferences.
The problem Slope One solves is this: given a user's ratings for some of the items, predict that user's ratings for the items he has not rated yet and recommend the best of them, which increases sales opportunities.
The implementation of a recommendation system includes the following three steps:
1. Calculate the rating difference between every pair of items.
2. Take one user's rating record and predict that user's ratings for the other items.
3. Sort the predicted ratings and return the top items.
Step 1: for example, we have three users and four items. The user scores are shown in the table below.
| Ratings | User1 | User2 | User3 |
|---------|-------|-------|-------|
| Item1   | 5     | 4     | 4     |
| Item2   | 4     | 5     | 4     |
| Item3   | 4     | 3     | N/A   |
| Item4   | N/A   | 5     | 5     |
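In the first step we calculate the rating difference between every pair of items, that is, the following matrix. Each cell is written as sum/freq: the sum of (row item's rating minus column item's rating) over the users who rated both items, followed by the number of such users. For example, Item1 and Item3 were both rated by User1 and User2, so their cell is ((5 - 4) + (4 - 3))/2 = 2/2: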
|       | Item1 | Item2 | Item3 | Item4 |
|-------|-------|-------|-------|-------|
| Item1 | N/A   | 0/3   | 2/2   | -2/2  |
| Item2 | 0/3   | N/A   | 2/2   | -1/2  |
| Item3 | -2/2  | -2/2  | N/A   | -2/1  |
| Item4 | 2/2   | 1/2   | 2/1   | N/A   |
Because we use the weighted version of the algorithm, we also need to record how many users have rated both items of a pair (freq). First, we define a class to hold a rating:
    public class Rating
    {
        public float Value { get; set; }   // accumulated rating (or rating difference)
        public int Freq { get; set; }      // how many ratings were accumulated into Value

        public float AverageValue
        {
            get { return Value / Freq; }
        }
    }

I decided to use a dictionary to store the resulting matrix:
    using System.Collections.Generic;

    public class RatingDifferenceCollection : Dictionary<string, Rating>
    {
        // Builds the dictionary key for an ordered item pair, e.g. "1/3".
        private string GetKey(int item1Id, int item2Id)
        {
            return item1Id + "/" + item2Id;
        }

        public bool Contains(int item1Id, int item2Id)
        {
            return this.ContainsKey(GetKey(item1Id, item2Id));
        }

        public Rating this[int item1Id, int item2Id]
        {
            get { return this[GetKey(item1Id, item2Id)]; }
            set { this[GetKey(item1Id, item2Id)] = value; }
        }
    }
Next, we implement the SlopeOne class. It holds a RatingDifferenceCollection to store the matrix and a HashSet to keep track of all the items in the system:

    public class SlopeOne
    {
        public RatingDifferenceCollection _diffMarix = new RatingDifferenceCollection(); // the dictionary that holds the diff matrix
        public HashSet<int> _items = new HashSet<int>(); // tracks every item seen so far
The AddUserRatings method receives one user's rating record (item ID to rating):

    public void AddUserRatings(IDictionary<int, float> userRatings)

AddUserRatings contains two nested loops. The outer loop traverses all the items in the input, and the inner loop traverses them again. For each pair of items, the rating difference is calculated and added to _diffMarix. Remember to add 1 to Freq to record that one more user has rated this pair:
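    // Create the cell the first time this pair is seen; the indexer getter throws on a missing key.
    if (!_diffMarix.Contains(item1Id, item2Id))
        _diffMarix[item1Id, item2Id] = new Rating();
    Rating ratingDiff = _diffMarix[item1Id, item2Id];
    ratingDiff.Value += item1Rating - item2Rating;
    ratingDiff.Freq += 1;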
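The fragment above shows only the innermost update. As a rough, self-contained sketch of how the whole method might fit together, the nested loop could look like the following. The names PairDiff and DiffMatrixSketch are invented here for illustration, and a plain Dictionary keyed by "item1Id/item2Id" stands in for RatingDifferenceCollection; this is not the author's complete code.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the article's Rating class:
// a running total of rating differences for one ordered item pair.
public class PairDiff
{
    public float Sum;   // sum of (first item's rating - second item's rating)
    public int Freq;    // number of users who rated both items
}

public class DiffMatrixSketch
{
    // Keyed by "item1Id/item2Id", the same format the article's GetKey produces.
    public readonly Dictionary<string, PairDiff> Diffs = new Dictionary<string, PairDiff>();
    public readonly HashSet<int> Items = new HashSet<int>();

    public void AddUserRatings(IDictionary<int, float> userRatings)
    {
        foreach (KeyValuePair<int, float> first in userRatings)
        {
            Items.Add(first.Key);
            foreach (KeyValuePair<int, float> second in userRatings)
            {
                if (first.Key == second.Key) continue; // the diagonal is never stored

                string key = first.Key + "/" + second.Key;
                PairDiff diff;
                if (!Diffs.TryGetValue(key, out diff))
                {
                    diff = new PairDiff();
                    Diffs[key] = diff;
                }
                diff.Sum += first.Value - second.Value;
                diff.Freq += 1;
            }
        }
    }
}
```

Calling AddUserRatings once per user with the three rating columns from the first table should fill Diffs with the twelve off-diagonal cells of the matrix, for example a sum of 2 and a freq of 2 under the key "1/3".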
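Once AddUserRatings has been called for every user, the matrix is complete. Because it is stored in a dictionary, it is really a flat table with one row per ordered item pair (the rating difference shown is the average, Value / Freq):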
| Item pair | Rating diff | Freq |
|-----------|-------------|------|
| Item1-2   | 0           | 3    |
| Item1-3   | 1           | 2    |
| Item2-1   | 0           | 3    |
| Item2-3   | 1           | 2    |
| Item3-1   | -1          | 2    |
| Item3-2   | -1          | 2    |
| Item1-4   | -1          | 2    |
| Item2-4   | -0.5        | 2    |
| Item3-4   | -2          | 1    |
| Item4-1   | 1           | 2    |
| Item4-2   | 0.5         | 2    |
| Item4-3   | 2           | 1    |
Step 2: take one user's rating record and predict that user's ratings for the other items:

    public IDictionary<int, float> Predict(IDictionary<int, float> userRatings)

This is also a nested loop. The outer loop traverses every item in _items; the inner loop traverses userRatings and combines this user's ratings with the matrix obtained in step 1 to predict a rating for each item in the system:
    Rating itemRating = new Rating(); // prediction of this user's rating
    ...
    Rating diff = _diffMarix[itemId, inputItemId];
    itemRating.Value += diff.Freq * (diff.AverageValue + userRating.Value);
    itemRating.Freq += diff.Freq;
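The update above is the weighted Slope One formula: each rated item contributes (average difference + the user's rating), weighted by how many users rated both items, and the total is divided by the sum of the weights. A minimal standalone sketch of that loop follows. DiffEntry and PredictSketch are hypothetical names, the matrix is passed in as a plain dictionary keyed by "targetItemId/ratedItemId", and skipping items the user has already rated is an assumption based on the test output further down.

```csharp
using System.Collections.Generic;

// Hypothetical cell type: total difference and number of co-raters for one item pair.
public class DiffEntry
{
    public float Sum;
    public int Freq;
}

public static class PredictSketch
{
    // diffs is keyed by "targetItemId/ratedItemId", matching the article's GetKey format.
    public static Dictionary<int, float> Predict(
        IDictionary<string, DiffEntry> diffs,
        IEnumerable<int> allItems,
        IDictionary<int, float> userRatings)
    {
        var predictions = new Dictionary<int, float>();
        foreach (int itemId in allItems)
        {
            if (userRatings.ContainsKey(itemId)) continue; // skip items the user already rated

            float weightedSum = 0f;
            int totalFreq = 0;
            foreach (KeyValuePair<int, float> rated in userRatings)
            {
                DiffEntry diff;
                if (!diffs.TryGetValue(itemId + "/" + rated.Key, out diff)) continue; // nobody rated both

                float averageDiff = diff.Sum / diff.Freq;
                weightedSum += diff.Freq * (averageDiff + rated.Value);
                totalFreq += diff.Freq;
            }

            // Only emit a prediction when at least one co-rated pair exists.
            if (totalFreq > 0) predictions[itemId] = weightedSum / totalFreq;
        }
        return predictions;
    }
}
```

With the cells from the table above and the ratings Item1 = 5 and Item3 = 4, Item2 works out to (3 × (0 + 5) + 2 × (1 + 4)) / 5 = 5 and Item4 to (2 × (1 + 5) + 1 × (2 + 4)) / 3 = 6.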
Step 3: once we have the predicted ratings, we can sort them by rating and recommend the top items to the user. A quick test:
    // test is a SlopeOne instance that has already received the three users' ratings via AddUserRatings
    Dictionary<int, float> userRating = new Dictionary<int, float>();
    userRating.Add(1, 5);
    userRating.Add(3, 4);
    IDictionary<int, float> predictions = test.Predict(userRating);
    foreach (var rating in predictions)
    {
        Console.WriteLine("Item " + rating.Key + " rating: " + rating.Value);
    }

Output:
Item 2 rating: 5
Item 4 rating: 6
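The test prints the predictions in dictionary order and does not show the sorting that step 3 describes. One possible way to do it, assuming LINQ is available (it is in .NET 3.5), is sketched below. TopItemsSketch and TopN are hypothetical names, and breaking ties by item ID is an arbitrary choice made here.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TopItemsSketch
{
    // Returns the n items with the highest predicted rating, best first.
    public static List<KeyValuePair<int, float>> TopN(IDictionary<int, float> predictions, int n)
    {
        return predictions
            .OrderByDescending(p => p.Value) // highest predicted rating first
            .ThenBy(p => p.Key)              // ties broken by item ID (an assumption)
            .Take(n)
            .ToList();
    }
}
```

For the predictions above, TopN(predictions, 1) would return Item 4, since its predicted rating of 6 is higher than Item 2's rating of 5.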
Improvements:
Looking at the matrix generated earlier, we can see that a lot of space is wasted. For example, the diagonal never holds a value. We have already avoided that problem, because we store the matrix values in a flat table.
The values below the diagonal mirror those above it: each value below equals the corresponding value above multiplied by -1, so storing both duplicates data. We can improve this by modifying RatingDifferenceCollection: change the GetKey method so that the item pair, regardless of order, is used as the key:
    private string GetKey(int item1Id, int item2Id) {
        return (item1Id ...
    }

The complete code is here; it was debugged and passes on .NET 3.5.
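The body of the modified GetKey is cut off above. As a minimal sketch of the idea, and an assumption rather than the author's code, an order-independent key can always put the smaller ID first. The stored difference then has a fixed orientation (smaller ID minus larger ID), so the sign has to be flipped whenever a pair is written or read in the opposite order. SymmetricDiffSketch and its members are hypothetical names.

```csharp
using System.Collections.Generic;

public class SymmetricDiffSketch
{
    private class Entry { public float Sum; public int Freq; }

    private readonly Dictionary<string, Entry> _entries = new Dictionary<string, Entry>();

    // Number of stored cells: one per unordered item pair.
    public int Count { get { return _entries.Count; } }

    // Order-independent key: the smaller ID always comes first.
    private static string GetKey(int item1Id, int item2Id)
    {
        return item1Id < item2Id
            ? item1Id + "/" + item2Id
            : item2Id + "/" + item1Id;
    }

    // Records one user's difference (item1's rating - item2's rating).
    public void Add(int item1Id, int item2Id, float difference)
    {
        string key = GetKey(item1Id, item2Id);
        Entry entry;
        if (!_entries.TryGetValue(key, out entry))
        {
            entry = new Entry();
            _entries[key] = entry;
        }
        // Stored orientation is (smaller ID - larger ID), so flip the sign if needed.
        entry.Sum += item1Id < item2Id ? difference : -difference;
        entry.Freq += 1;
    }

    // Reads the average difference (item1 - item2) and the number of co-raters.
    public bool TryGetAverage(int item1Id, int item2Id, out float average, out int freq)
    {
        Entry entry;
        if (!_entries.TryGetValue(GetKey(item1Id, item2Id), out entry))
        {
            average = 0f;
            freq = 0;
            return false;
        }
        float stored = entry.Sum / entry.Freq;
        average = item1Id < item2Id ? stored : -stored;
        freq = entry.Freq;
        return true;
    }
}
```

With this layout each user should add every unordered pair only once (for example, only when item1Id is less than item2Id); adding both orders would double Freq. The sketch also assumes the two IDs are never equal.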
References
Tutorial about how to implement Slope One in Python