Let's walk through how the recommender is used:
1) Get the DataModel
2) Define the similarity model: PearsonCorrelationSimilarity
3) Define the user neighborhood model: NearestNUserNeighborhood
4) Define the recommender: GenericUserBasedRecommender
5) Make recommendations
    @Test
    public void testHowMany() throws Exception {
        DataModel dataModel = getDataModel(
            new long[] {1, 2, 3, 4, 5},
            new Double[][] {
                {0.1, 0.2},
                {0.2, 0.3, 0.3, 0.6},
                {0.4, 0.4, 0.5, 0.9},
                {0.1, 0.4, 0.5, 0.8, 0.9, 1.0},
                {0.2, 0.3, 0.6, 0.7, 0.1, 0.2},
            });
        // Similarity model, used to find the most similar users (the neighborhood)
        UserSimilarity similarity = new PearsonCorrelationSimilarity(dataModel);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, dataModel);
        Recommender recommender = new GenericUserBasedRecommender(dataModel, neighborhood, similarity);
        List<RecommendedItem> fewRecommended = recommender.recommend(1, 2);
        List<RecommendedItem> moreRecommended = recommender.recommend(1, 4);
        for (int i = 0; i < fewRecommended.size(); i++) {
            assertEquals(fewRecommended.get(i).getItemID(), moreRecommended.get(i).getItemID());
        }
        recommender.refresh(null);
        for (int i = 0; i < fewRecommended.size(); i++) {
            assertEquals(fewRecommended.get(i).getItemID(), moreRecommended.get(i).getItemID());
        }
    }
For the similarity calculation, see the previous article on PearsonCorrelationSimilarity.
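As a quick reminder of what that similarity measures, here is a minimal sketch of Pearson correlation over the items two users have both rated. It is not Mahout's code: the class name PearsonSketch and the use of plain Map objects in place of a DataModel are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not part of Mahout.
class PearsonSketch {

    /** Pearson correlation over co-rated items; NaN when it is undefined. */
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        // First pass: means over the items both users rated.
        int n = 0;
        double sumA = 0.0;
        double sumB = 0.0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                n++;
                sumA += e.getValue();
                sumB += other;
            }
        }
        if (n == 0) {
            return Double.NaN; // no overlap, nothing to correlate
        }
        double meanA = sumA / n;
        double meanB = sumB / n;

        // Second pass: covariance and variances of the centered ratings.
        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                double da = e.getValue() - meanA;
                double db = other - meanB;
                cov += da * db;
                varA += da * da;
                varB += db * db;
            }
        }
        double denom = Math.sqrt(varA * varB);
        return denom == 0.0 ? Double.NaN : cov / denom;
    }

    public static void main(String[] args) {
        Map<Long, Double> u1 = new HashMap<>();
        u1.put(1L, 0.1);
        u1.put(2L, 0.4);
        u1.put(3L, 0.5);
        Map<Long, Double> u2 = new HashMap<>();
        u2.put(1L, 0.2);
        u2.put(2L, 0.3);
        u2.put(3L, 0.6);
        System.out.println(pearson(u1, u2));
    }
}
```

A NaN result means the two users cannot be compared, which is why the recommender code later checks Double.isNaN before using a similarity value.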
As for NearestNUserNeighborhood: how are the nearest N users obtained, and how is that used in the implementation?
~/mahout-core/src/main/java/org/apache/mahout/cf/taste/impl/recommender/GenericUserBasedRecommender.java
    @Override
    public List<RecommendedItem> recommend(long userID, int howMany, IDRescorer rescorer) throws TasteException {
        Preconditions.checkArgument(howMany >= 1, "howMany must be at least 1");
        log.debug("Recommending items for user ID '{}'", userID);
        // Find the N most similar users according to the similarity model
        long[] theNeighborhood = neighborhood.getUserNeighborhood(userID);
        if (theNeighborhood.length == 0) {
            return Collections.emptyList();
        }
        // Collect the items rated by the neighbors but not by the current user;
        // this is the candidate pool for recommendations
        FastIDSet allItemIDs = getAllOtherItems(theNeighborhood, userID);
        // From the pool, pick the top N items with the highest estimated preference for the current user
        TopItems.Estimator<Long> estimator = new Estimator(userID, theNeighborhood);
        List<RecommendedItem> topItems = TopItems
            .getTopItems(howMany, allItemIDs.iterator(), rescorer, estimator);
        log.debug("Recommendations are: {}", topItems);
        return topItems;
    }
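The candidate pool step can be sketched as a set union followed by a set difference. The sketch below is an illustration under assumptions, not the Mahout source: CandidatePoolSketch, allOtherItems and the itemsByUser map are hypothetical names, and plain java.util collections stand in for DataModel and FastIDSet.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in, not part of Mahout.
class CandidatePoolSketch {

    /** Items rated by at least one neighbor and not yet rated by the given user. */
    static Set<Long> allOtherItems(Map<Long, Set<Long>> itemsByUser, long[] neighborhood, long userId) {
        Set<Long> candidates = new TreeSet<>();
        // Union of everything the neighbors have rated.
        for (long neighbor : neighborhood) {
            candidates.addAll(itemsByUser.getOrDefault(neighbor, Collections.<Long>emptySet()));
        }
        // Drop what the current user has already rated.
        candidates.removeAll(itemsByUser.getOrDefault(userId, Collections.<Long>emptySet()));
        return candidates;
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> itemsByUser = new HashMap<>();
        itemsByUser.put(1L, new HashSet<>(Arrays.asList(10L, 11L)));
        itemsByUser.put(2L, new HashSet<>(Arrays.asList(11L, 12L)));
        itemsByUser.put(3L, new HashSet<>(Arrays.asList(12L, 13L)));
        System.out.println(allOtherItems(itemsByUser, new long[] {2L, 3L}, 1L));
    }
}
```

Only items in this pool are ever scored, so an item that none of the neighbors has rated cannot be recommended.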
The Estimator is implemented as follows:
    private final class Estimator implements TopItems.Estimator<Long> {

        private final long theUserID;
        private final long[] theNeighborhood;

        Estimator(long theUserID, long[] theNeighborhood) {
            this.theUserID = theUserID;
            this.theNeighborhood = theNeighborhood;
        }

        @Override
        public double estimate(Long itemID) throws TasteException {
            return doEstimatePreference(theUserID, theNeighborhood, itemID);
        }
    }
    protected float doEstimatePreference(long theUserID, long[] theNeighborhood, long itemID) throws TasteException {
        // Sum the neighbors' preferences for the item, weighted by each neighbor's similarity,
        // then divide by the total similarity to estimate the current user's preference for the item
        if (theNeighborhood.length == 0) {
            return Float.NaN;
        }
        DataModel dataModel = getDataModel();
        double preference = 0.0;
        double totalSimilarity = 0.0;
        int count = 0;
        for (long userID : theNeighborhood) {
            if (userID != theUserID) {
                // See GenericItemBasedRecommender.doEstimatePreference() too
                Float pref = dataModel.getPreferenceValue(userID, itemID);
                if (pref != null) {
                    double theSimilarity = similarity.userSimilarity(theUserID, userID);
                    if (!Double.isNaN(theSimilarity)) {
                        preference += theSimilarity * pref;
                        totalSimilarity += theSimilarity;
                        count++;
                    }
                }
            }
        }
        // Throw out the estimate if it was based on no data points, of course, but also if based on
        // just one. This is a bit of a band-aid on the 'stock' item-based algorithm for the moment.
        // The reason is that in this case the estimate is, simply, the user's rating for one item
        // that happened to have a defined similarity. The similarity score doesn't matter, and that
        // seems like a bad situation.
        if (count <= 1) {
            return Float.NaN;
        }
        float estimate = (float) (preference / totalSimilarity);
        if (capper != null) {
            estimate = capper.capEstimate(estimate);
        }
        return estimate;
    }
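To make the arithmetic concrete, here is a small stand-alone sketch of the same similarity-weighted average. It leaves out the DataModel lookup and the capper, and the names EstimateSketch, similarityToNeighbor and neighborPrefs are assumptions made for this example rather than anything in Mahout.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not part of Mahout.
class EstimateSketch {

    /**
     * similarityToNeighbor: neighbor ID -> similarity to the current user.
     * neighborPrefs: neighbor ID -> that neighbor's rating of the item (absent if unrated).
     */
    static float estimate(Map<Long, Double> similarityToNeighbor, Map<Long, Float> neighborPrefs) {
        double preference = 0.0;
        double totalSimilarity = 0.0;
        int count = 0;
        for (Map.Entry<Long, Double> e : similarityToNeighbor.entrySet()) {
            Float pref = neighborPrefs.get(e.getKey());
            double sim = e.getValue();
            if (pref == null || Double.isNaN(sim)) {
                continue; // neighbor did not rate the item, or similarity is undefined
            }
            preference += sim * pref;
            totalSimilarity += sim;
            count++;
        }
        // Same rule as the code above: fewer than two data points means no estimate.
        if (count <= 1) {
            return Float.NaN;
        }
        return (float) (preference / totalSimilarity);
    }

    public static void main(String[] args) {
        Map<Long, Double> sims = new HashMap<>();
        sims.put(2L, 0.5);
        sims.put(3L, 1.0);
        Map<Long, Float> prefs = new HashMap<>();
        prefs.put(2L, 4.0f);
        prefs.put(3L, 1.0f);
        // (0.5 * 4.0 + 1.0 * 1.0) / (0.5 + 1.0) = 3.0 / 1.5
        System.out.println(estimate(sims, prefs));
    }
}
```

With similarities 0.5 and 1.0 and ratings 4.0 and 1.0, the estimate is 3.0 / 1.5 = 2.0: the more similar neighbor pulls the result toward its own rating.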
Summary:
1) Compute the N most similar users.
2) From those N users, collect the items the current user has not rated.
3) Estimate the current user's preference for each of those items.
4) Recommend the top items (howMany of them) with the highest estimated preference.
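The last step, keeping only the best-scoring items, can be sketched with a bounded min-heap. This is an illustration under assumptions and not how TopItems.getTopItems is actually written: TopNSketch and topN are hypothetical names, and a map of item ID to estimated preference stands in for the estimator and the item iterator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical stand-in, not part of Mahout.
class TopNSketch {

    /** Returns up to howMany item IDs, ordered from highest to lowest estimate. */
    static List<Long> topN(Map<Long, Float> estimates, int howMany) {
        if (howMany < 1) {
            throw new IllegalArgumentException("howMany must be at least 1");
        }
        // Min-heap on the estimate, so the weakest kept item sits at the head.
        Comparator<Map.Entry<Long, Float>> byEstimate =
            (x, y) -> Float.compare(x.getValue(), y.getValue());
        PriorityQueue<Map.Entry<Long, Float>> heap = new PriorityQueue<>(byEstimate);
        for (Map.Entry<Long, Float> e : estimates.entrySet()) {
            if (Float.isNaN(e.getValue())) {
                continue; // no usable estimate for this item
            }
            heap.add(e);
            if (heap.size() > howMany) {
                heap.poll(); // evict the current weakest
            }
        }
        // The heap drains weakest first, so reverse to get best first.
        List<Long> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        Map<Long, Float> estimates = new HashMap<>();
        estimates.put(10L, 2.0f);
        estimates.put(11L, Float.NaN);
        estimates.put(12L, 4.5f);
        estimates.put(13L, 3.0f);
        System.out.println(topN(estimates, 2));
    }
}
```

Items whose estimate is NaN are skipped, which matches the estimator's habit of returning Float.NaN when it has too little data.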