Summary of Considerations for Recommendation Algorithms

Source: Internet
Author: User

This article summarizes the considerations, that is, the key factors, in various recommendation approaches. It does not describe the algorithms themselves, only the points to watch, and is for reference only.

There are many recommendation algorithms. From the algorithmic point of view, I think they fall mainly into the following series: collaborative filtering (item-based and user-based), machine learning classification (binary classification into like and dislike, or regression in which the score represents the degree of liking), matrix factorization (the ALS algorithm in Mahout, the approach made famous by the Netflix Prize), and association rules (commonly used in e-commerce). This article summarizes each of these series in turn.

First, collaborative filtering series

Collaborative filtering is one of the most popular and most accurate families of algorithms. For dating recommendation on SNS sites, it is currently the most accurate approach (in my personal view), for the following reasons: 1, the basic attributes of users on a dating site are not enough to describe a person fully, unless appearance could also be captured completely, since strangers looking for friends look at appearance first. 2, users' behavior already turns their opinion of other users into actions.

Considerations for using collaborative filtering:

1, take as many user behaviors into account as possible. Different implicit behaviors can be given different weights, and the weighted implicit behaviors can be combined into an explicit score.

2, re-rank the recommendation results to maximize benefit. A dating site, for example, should add an activity factor and an exposure factor: the activity factor is an increasing function and the exposure factor a decreasing function, so that active users are recommended while exposure stays balanced across users.

3, on a dating site, the forward recommendation is the set of users that a user may like. If the acting user and the acted-on user are swapped in the behavior data while the rating values are kept unchanged, the recommendation becomes the set of users who may like him. Taking the intersection of the two gives pairs who may like each other, that is, reciprocal recommendation, the highest level of recommendation.

4, in item-based collaborative filtering, normalizing the similarities helps improve the diversity of recommendations. Similarity is not balanced across classes of items: within some classes the items are much more similar to each other, so recommendations lean toward those classes. Which classes have high within-class similarity, and which have low? Generally speaking, items in popular classes tend to be more similar to each other. Without normalization, recommendations are biased toward items in popular classes, and those items are themselves popular, so the coverage of the recommendations is relatively low. Conversely, normalizing the similarities can increase the coverage of the recommender system.

5, in item-based CF, the Tanimoto coefficient gives higher coverage than other coefficients. It ignores the specific rating a user gave an item and considers only whether the user and the item are related. This keeps similarities more balanced across items and avoids favoring particular items.

6, for user-based recommendation, the Pearson correlation coefficient generally performs better than other ways of comparing users. For item-based recommendation, cosine similarity performs better than the Pearson correlation.

7, reduce the weight of popular items. Some items are liked by almost everyone, so agreement between two users on a controversial (unpopular) item is more valuable than agreement on a popular one. Inverse user frequency can be used to down-weight popular items.

8, for item-based recommendation, cosine similarity performs better than the Pearson correlation. In item-based algorithms, cosine similarity has become the standard measure because of its accuracy. It is also widely used in information retrieval and text mining to compare two documents represented as word vectors. Basic cosine similarity does not account for differences in users' average ratings. Adjusted cosine similarity solves this by subtracting the user's average from each rating, so that its values lie between -1 and +1, just as with Pearson.

9, e-commerce recommendation needs to take time factors into account, such as the time of purchase, along with contextual factors such as season.
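
Points 4, 5 and 8 above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the `{user: {item: score}}` layout of `ratings`, and the choice of dividing each item's similarities by its own maximum are all assumptions made for the example, since the text does not fix a particular normalization.

```python
import math


def tanimoto(users_a, users_b):
    """Tanimoto coefficient of two items, each given as the set of users
    who interacted with it. Rating values are ignored on purpose."""
    a, b = set(users_a), set(users_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def adjusted_cosine(ratings, item_i, item_j):
    """Adjusted cosine similarity of two items.

    ratings: {user: {item: score}} (assumed layout).
    Each user's own mean rating is subtracted before the cosine is taken,
    so the result lies in [-1, 1].
    """
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if item_i in user_ratings and item_j in user_ratings:
            mean = sum(user_ratings.values()) / len(user_ratings)
            di = user_ratings[item_i] - mean
            dj = user_ratings[item_j] - mean
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / math.sqrt(den_i * den_j)


def normalize_rows(sim):
    """Divide each item's similarities by that item's maximum similarity,
    so items from tightly clustered (popular) classes do not dominate."""
    out = {}
    for item, neighbours in sim.items():
        top = max(neighbours.values(), default=0.0)
        out[item] = {
            other: (s / top if top > 0 else 0.0)
            for other, s in neighbours.items()
        }
    return out
```

After `normalize_rows`, the strongest neighbour of every item has similarity 1.0, so an item from a popular class and an item from a niche class contribute on the same scale when scores are summed.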

Second, machine learning classification series

1, machine learning models (logistic regression, decision trees, SVM). Take successful and failed cases from historical data as the dataset, train a model, and use the model to predict recommendations. (What counts as a success case or a failure case is hard to define.)

2, machine learning methods generally use more attributes. With only basic attributes the results are poor, but if many attributes are included, many of them will be empty for many users, which makes data preprocessing troublesome.

3, many systems use collaborative filtering to generate candidate recommendations, then use logistic regression (on key attribute indicators) to compute a recommendation index and sort the candidates by it.
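
The two-stage pattern in point 3 can be sketched as follows. The candidates are assumed to come from a collaborative filtering step that is not shown, and the feature names and weights are hypothetical placeholders. In practice the weights would be learned by fitting logistic regression on historical success and failure cases.

```python
import math


def logistic(z):
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def rank_candidates(candidates, weights, bias=0.0):
    """Score CF candidates with a logistic model and sort them.

    candidates: {candidate_id: {feature_name: value}} (assumed layout)
    weights:    {feature_name: weight}, assumed to be already trained
    Returns a list of (candidate_id, probability), best first.
    """
    scored = []
    for cid, feats in candidates.items():
        z = bias + sum(weights.get(name, 0.0) * value
                       for name, value in feats.items())
        scored.append((cid, logistic(z)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Hypothetical example: two candidates produced by a CF step.
ranked = rank_candidates(
    {"x": {"cf_score": 0.9, "active_days": 1},
     "y": {"cf_score": 0.5, "active_days": 10}},
    weights={"cf_score": 2.0, "active_days": 0.3},
)
```

With these made-up weights, candidate "y" is ranked ahead of "x" even though its CF score is lower, because the activity attribute outweighs the difference.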

Third, matrix factorization series

1, the ALS algorithm achieved very good results in the Netflix competition. ALS works well in scenarios such as video, with few items and many users, because its accuracy depends heavily on the number of features: with fewer items, fewer features are needed and the computation is small. On sites such as social networks, both the recommender and the recommended are people, and the two sets are the same size. With ALS the number of features would have to be very high and the computation would be huge, while a small number of features would make the recommendations very inaccurate.

2, matrix factorization is also widely used in text clustering.
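
To make the role of the number of features concrete, here is a small pure-Python sketch of alternating least squares on explicit ratings. It is an illustration only and is not Mahout's implementation. The function names, the `(user, item, rating)` triple format, and the default values for `k`, `reg` and `iters` are assumptions chosen for the example.

```python
import random


def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update(fixed, observed_by_row, k, reg):
    """Recompute one side's factors while the other side is held fixed.
    Each row solves the regularized normal equations (F^T F + reg I) x = F^T r."""
    result = []
    for observed in observed_by_row:
        a = [[reg if i == j else 0.0 for j in range(k)] for i in range(k)]
        b = [0.0] * k
        for idx, rating in observed:
            f = fixed[idx]
            for i in range(k):
                b[i] += rating * f[i]
                for j in range(k):
                    a[i][j] += f[i] * f[j]
        result.append(solve(a, b))
    return result


def als(ratings, n_users, n_items, k=2, reg=0.1, iters=20, seed=0):
    """ratings: list of (user, item, rating) triples; k: number of features."""
    rng = random.Random(seed)
    items = [[rng.random() for _ in range(k)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    users = []
    for _ in range(iters):
        users = update(items, by_user, k, reg)   # fix items, solve users
        items = update(users, by_item, k, reg)   # fix users, solve items
    return users, items


def predict(users, items, u, i):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(a * b for a, b in zip(users[u], items[i]))
```

Every row update builds and solves a k-by-k system, so the work per iteration grows roughly with k squared per observed rating plus k cubed per user and per item. That is why a large number of features quickly becomes expensive when both sides of the matrix are large.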

Fourth, association rule series

1, generally used for recommendations on e-commerce websites, such as "customers who bought this item also bought".

2, the placement of items on e-commerce websites is generally also adjusted based on association rules.
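
As a rough illustration of "bought this, also bought", the sketch below mines rules between pairs of items from a list of baskets using support, confidence and lift. The function name, the tuple layout of the result, and the threshold defaults are assumptions for the example. A production system would more likely use an Apriori or FP-growth implementation that also handles larger itemsets.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.2, min_confidence=0.5):
    """Return rules (lhs, rhs, support, confidence, lift) between single
    items, sorted by confidence, highest first."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))            # de-duplicate within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                    # share of baskets with both
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]        # P(rhs | lhs)
            if confidence >= min_confidence:
                lift = confidence / (item_counts[rhs] / n)
                rules.append((lhs, rhs, support, confidence, lift))
    rules.sort(key=lambda rule: rule[3], reverse=True)
    return rules


# Hypothetical purchase history.
rules = pair_rules(
    [["milk", "bread"], ["milk", "bread", "eggs"], ["bread"], ["milk", "eggs"]],
    min_support=0.5,
    min_confidence=0.6,
)
```

In this toy data every basket containing eggs also contains milk, so the rule eggs to milk comes out on top with confidence 1.0.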


Points of concern by recommendation domain:

SNS friend recommendation: collaborative filtering generally works best. During recommendation, keep user exposure balanced, weaken the recommendation of popular users, and surface potential users.

E-commerce website recommendation: association rules and item-based collaborative filtering. Pay attention to new and seasonal items, and weaken the similarity of popular items. Similarity among long-tail items is more valuable, since long-tail items can drive the company's revenue.

Video site recommendation: collaborative filtering, the ALS algorithm, and mining of film reviews.
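
The exposure balancing mentioned for SNS friend recommendation can be sketched as a re-ranking step over collaborative filtering scores. The specific functions used here, a logarithmic boost for activity and a reciprocal decay for exposure, are assumptions picked for illustration. The text only requires that one factor increases with activity and the other decreases with exposure.

```python
import math


def rerank(candidates, activity, exposure, top_n=10):
    """Re-rank CF candidates by activity and exposure.

    candidates: {user_id: cf_score}
    activity:   {user_id: recent active days}        (hypothetical signal)
    exposure:   {user_id: times already recommended} (hypothetical signal)
    """
    scored = []
    for uid, score in candidates.items():
        active_boost = 1.0 + math.log1p(activity.get(uid, 0))   # increasing
        exposure_decay = 1.0 / (1.0 + exposure.get(uid, 0))     # decreasing
        scored.append((uid, score * active_boost * exposure_decay))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [uid for uid, _ in scored[:top_n]]


# Hypothetical example: "b" is active but already heavily exposed.
order = rerank(
    {"a": 0.9, "b": 0.8, "c": 0.7},
    activity={"a": 0, "b": 7, "c": 7},
    exposure={"a": 5, "b": 100, "c": 2},
)
```

Here "c" moves to the front because it is active and rarely shown, while "b" drops to the end despite a high CF score.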





