Machine learning: a brief introduction to the recommendation algorithms used in recommender systems

Source: Internet
Author: User

In the introduction to recommender systems, we gave the general framework of a recommender system. The recommendation method is clearly the core of the whole system and largely determines its performance. The main recommendation methods at present are content-based recommendation, collaborative filtering recommendation, recommendation based on association rules, utility-based recommendation, knowledge-based recommendation, and hybrid recommendation.

I. Content-based recommendation

Content-based recommendation continues and extends information filtering technology. It makes recommendations from the content of items, without needing users' ratings of those items; instead it relies on machine learning to derive the user's interests from examples described by content features. In a content-based recommender system, each item is defined by the attributes of its relevant features. The system learns the user's interests from the features of the items the user has rated, and then examines how well the user profile matches the item to be predicted. The form of the user profile depends on the learning method used; decision trees, neural networks, and vector-based representations are common choices. A content-based user profile requires the user's historical data, and the profile may change as the user's preferences change.
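The matching of a user profile against item features can be sketched as follows. This is a minimal illustration that assumes a vector-based representation: the catalogue `ITEM_FEATURES`, the averaging in `build_profile`, and the use of cosine similarity are all choices made for this example, not something the article prescribes.

```python
import math

# Hypothetical item catalogue: each item is described by content features.
ITEM_FEATURES = {
    "article_a": {"sports": 1.0, "football": 1.0},
    "article_b": {"sports": 1.0, "tennis": 1.0},
    "article_c": {"politics": 1.0, "economy": 1.0},
    "article_d": {"football": 1.0, "transfer": 1.0},
}


def build_profile(liked_items):
    """Average the feature vectors of the items the user liked."""
    profile = {}
    for item in liked_items:
        for feature, weight in ITEM_FEATURES[item].items():
            profile[feature] = profile.get(feature, 0.0) + weight / len(liked_items)
    return profile


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_by_content(liked_items, top_n=2):
    """Rank unseen items by how well their features match the user profile."""
    profile = build_profile(liked_items)
    candidates = [i for i in ITEM_FEATURES if i not in liked_items]
    ranked = sorted(
        candidates,
        key=lambda i: cosine(profile, ITEM_FEATURES[i]),
        reverse=True,
    )
    return ranked[:top_n]
```

Calling `recommend_by_content(["article_a"])` ranks the two articles that share a feature with the liked one ahead of the unrelated politics article. No other user's data is consulted at any point, which is the property the section emphasizes.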

The advantages of content-based recommendation methods are:
1) No data from other users is needed, so there is no cold-start problem or sparsity problem.
2) It can make recommendations for users with special interests.
3) It can recommend new or unpopular items, so there is no new-item problem.
4) By listing the content features of a recommended item, it can explain why the item was recommended.
5) It can draw on mature techniques; classification learning, for example, is already quite mature.

The disadvantages are that meaningful features must be easy to extract from the content, which requires well-structured feature content; that the user's taste must be expressible in terms of content features; and that the judgments of other users cannot be used explicitly.

II. Collaborative filtering recommendation

Collaborative filtering is one of the earliest and most successful techniques in recommender systems. It generally uses nearest-neighbor techniques: it uses users' historical preference information to calculate the distance between users, then uses the weighted ratings that the target user's nearest neighbors gave an item to predict how much the target user will like that item, and recommends to the target user according to this predicted preference. The greatest advantage of collaborative filtering is that it places no special requirements on the recommended objects, so it can handle unstructured, complex objects such as music and movies.
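The nearest-neighbor prediction can be sketched as follows. The rating data in `RATINGS` is hypothetical, and the sketch measures closeness with cosine similarity over co-rated items rather than a distance; that is one common choice assumed here, not the only one.

```python
import math

# Hypothetical rating data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "alice": {"film_1": 5, "film_2": 3, "film_3": 4},
    "bob": {"film_1": 5, "film_2": 3, "film_3": 4, "film_4": 5},
    "carol": {"film_1": 1, "film_2": 5, "film_4": 2},
}


def user_similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    common = set(RATINGS[a]) & set(RATINGS[b])
    if not common:
        return 0.0
    dot = sum(RATINGS[a][i] * RATINGS[b][i] for i in common)
    norm_a = math.sqrt(sum(RATINGS[a][i] ** 2 for i in common))
    norm_b = math.sqrt(sum(RATINGS[b][i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def predict_rating(user, item, k=2):
    """Similarity-weighted average of the k nearest neighbours' ratings."""
    neighbours = [
        (user_similarity(user, other), other)
        for other in RATINGS
        if other != user and item in RATINGS[other]
    ]
    neighbours.sort(reverse=True)  # most similar neighbours first
    top = neighbours[:k]
    total_weight = sum(sim for sim, _ in top)
    if total_weight == 0:
        return None  # no usable neighbour has rated this item
    return sum(sim * RATINGS[other][item] for sim, other in top) / total_weight
```

For `predict_rating("alice", "film_4")`, bob's ratings agree with alice's on every shared film, so his rating of 5 outweighs carol's rating of 2 and the prediction lands nearer to 5 than to 2. Note that nothing about the films themselves is used, only the ratings.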

Collaborative filtering rests on the assumption that a good way to find what a user is really interested in is to first find other users with similar interests and then recommend what those users are interested in. The idea is easy to understand: in daily life we often rely on the recommendations of good friends when making choices. Collaborative filtering applies this idea to e-commerce recommender systems, recommending to the target user on the basis of other users' ratings of an item.

A recommender system based on collaborative filtering can be said to recommend from the user's point of view, and it is automatic: the recommendations are derived implicitly by the system from purchase patterns, browsing behavior, and the like, so users do not have to make an effort, such as filling in survey forms, to find recommendations that suit their interests.

Compared with content-based filtering, collaborative filtering has the following advantages:
1) It can filter information that is difficult for machines to analyze automatically, such as artwork and music.
2) It shares the experience of others, which avoids incomplete and imprecise content analysis, and it can filter on complex concepts that are hard to express, such as information quality and personal taste.
3) It can recommend new information. It can find information whose content is completely dissimilar and that the user did not anticipate. This is a major difference from content-based filtering: content-based recommendations are mostly things the user is already familiar with, whereas collaborative filtering can uncover latent interests that the user has not yet discovered.
4) It makes effective use of feedback from other similar users, which reduces the amount of feedback each user must give and speeds up personalized learning.

Although collaborative filtering is widely applied as a typical recommendation technique, it still has many problems to solve. The most typical are the sparsity problem and the scalability problem.

III. Recommendation based on association rules

Recommendation based on association rules treats the items already purchased as the rule head (the antecedent) and the recommended items as the rule body (the consequent). Association rule mining can discover how different products are related in the sales process, and it has been applied successfully in the retail industry. An association rule X → Y states that, in a transaction database, a large proportion of the transactions that contain itemset X also contain itemset Y. Intuitively, users who buy certain goods tend to buy certain other goods as well. For example, many people who buy milk also buy bread.
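The "large proportion" in a rule X → Y is usually quantified with support and confidence. The sketch below computes both for the milk and bread example; the transaction list, the function names, and the 0.6 confidence threshold are assumptions made for illustration.

```python
# Hypothetical transaction database: each transaction is a set of items.
TRANSACTIONS = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter"},
    {"milk", "bread", "eggs"},
]


def support(itemset):
    """Fraction of all transactions that contain every item in itemset."""
    hits = sum(1 for t in TRANSACTIONS if itemset <= t)
    return hits / len(TRANSACTIONS)


def confidence(x, y):
    """Fraction of the transactions containing X that also contain Y."""
    support_x = support(x)
    if support_x == 0:
        return 0.0
    return support(x | y) / support_x


def recommend_from_rules(basket, rules, min_confidence=0.6):
    """Fire every rule whose head is in the basket and is confident enough."""
    suggestions = set()
    for head, body in rules:
        if head <= basket and confidence(head, body) >= min_confidence:
            suggestions |= body - basket
    return suggestions
```

In this toy data four of the five transactions contain milk and three of those also contain bread, so the rule {milk} → {bread} has a confidence of about 0.75 and fires for a basket that contains milk. A real system would mine the candidate rules offline, for example with an algorithm such as Apriori, instead of receiving them as a hand-written list.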

Discovering the association rules is the first step of the algorithm and also the most critical and time-consuming one, but it can be done offline. In addition, synonymy among product names is a difficulty for association rules.

IV. Utility-based recommendation

Utility-based recommendation computes the utility of each item for the user. The core issue is how to create a utility function for each user, so the user profile is largely determined by the utility function the system uses. The benefit of utility-based recommendation is that it can take non-product attributes, such as vendor reliability and product availability, into account in the utility calculation.
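One simple form a per-user utility function can take is a weighted sum of attribute scores. The sketch below assumes that form; the offers, attribute names, and weights are all hypothetical, and real systems may use quite different utility functions.

```python
# Hypothetical offers: product and vendor attributes normalised to [0, 1].
OFFERS = {
    "offer_a": {"quality": 0.9, "cheapness": 0.3,
                "vendor_reliability": 0.9, "availability": 1.0},
    "offer_b": {"quality": 0.7, "cheapness": 0.9,
                "vendor_reliability": 0.5, "availability": 0.6},
    "offer_c": {"quality": 0.5, "cheapness": 0.8,
                "vendor_reliability": 0.8, "availability": 0.0},
}


def utility(offer, weights):
    """Weighted sum of attribute scores; the weights act as the user profile."""
    return sum(weights.get(name, 0.0) * value for name, value in offer.items())


def recommend_by_utility(weights, top_n=1):
    """Rank offers by the utility they have for this particular user."""
    ranked = sorted(OFFERS, key=lambda o: utility(OFFERS[o], weights), reverse=True)
    return ranked[:top_n]
```

A user who only cares about price, `{"cheapness": 1.0}`, is shown `offer_b`, while a cautious user who weights quality, vendor reliability, and availability is shown `offer_a`. The vendor and availability attributes enter the calculation on the same footing as the product attributes.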

V. Knowledge-based recommendation

Knowledge-based recommendation can, to some extent, be viewed as an inference technique: it recommends by reasoning about which items meet the user's needs and preferences. Knowledge-based approaches differ markedly according to the functional knowledge they use. Functional knowledge is knowledge about how a particular item satisfies a particular user's need, so it can explain the relationship between a need and a recommendation. The user profile can therefore be any knowledge structure that supports this inference: it may be a normalized query from the user, or a more detailed representation of the user's needs.

VI. Hybrid recommendation

Because every recommendation method has its advantages and disadvantages, hybrid recommendation is often used in practice. The most studied and most widely applied combination is content-based recommendation with collaborative filtering. The simplest approach is to produce one prediction with the content-based method and one with collaborative filtering, and then combine the two results by some method. Although there are many ways to combine methods in theory, not every combination is effective for a specific problem. The most important principle of hybrid recommendation is that the combination should avoid or compensate for the weaknesses of the individual techniques.

Researchers have proposed seven ways of combining methods:
1) Weighted: the results of several recommendation techniques are combined with weights.
2) Switching: the system switches between recommendation techniques according to the problem background and the actual situation or requirements.
3) Mixed: several recommendation techniques are used at the same time, and their results are presented together for the user's reference.
4) Feature combination: features from different recommendation data sources are combined and used by a single recommendation algorithm.
5) Cascade: one technique first produces a coarse set of recommendations, and a second technique refines them into more accurate recommendations.
6) Feature augmentation: the additional feature information produced by one technique is embedded in the feature input of another technique.
7) Meta-level: the model produced by one recommendation method is used as the input of another.
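Three of these combination strategies, weighted, switching, and cascade, are simple enough to sketch directly. The functions below assume that each underlying recommender has already produced a dict of item scores; the function names, the linear blend, and the rating-count threshold are hypothetical choices for this example.

```python
def weighted_hybrid(content_scores, cf_scores, content_weight=0.5):
    """Weighted: blend two recommenders' scores with a fixed linear weight."""
    items = set(content_scores) | set(cf_scores)
    return {
        item: content_weight * content_scores.get(item, 0.0)
        + (1.0 - content_weight) * cf_scores.get(item, 0.0)
        for item in items
    }


def switching_hybrid(content_scores, cf_scores, n_user_ratings, min_ratings=5):
    """Switching: use collaborative filtering only once the user has enough ratings."""
    return cf_scores if n_user_ratings >= min_ratings else content_scores


def cascade_hybrid(coarse_scores, fine_scores, shortlist_size=2):
    """Cascade: the first recommender shortlists, the second re-ranks the shortlist."""
    shortlist = sorted(coarse_scores, key=coarse_scores.get, reverse=True)
    shortlist = shortlist[:shortlist_size]
    return sorted(shortlist, key=lambda item: fine_scores.get(item, 0.0), reverse=True)
```

The switching variant illustrates the principle of compensating for weaknesses: it falls back on the content-based scores for users with too few ratings for collaborative filtering to work well.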

VII. Comparison of the main recommendation methods

The recommendation methods each have their own advantages and disadvantages, as shown in Table 1.

Table 1. Advantages and disadvantages of the main recommendation methods

Content-based recommendation
  Advantages: results are intuitive and easy to explain; no domain knowledge is needed.
  Disadvantages: sparsity problem; new-user problem; complex attributes are difficult to handle; enough data is needed to build a classifier.

Collaborative filtering recommendation
  Advantages: discovers novel interests; no domain knowledge is needed; performance improves over time; high degree of personalization and automation; can handle complex unstructured objects.
  Disadvantages: sparsity problem; scalability problem; new-user problem; quality depends on the historical data set; recommendation quality is poor when the system starts.

Rule-based recommendation
  Advantages: can discover new points of interest; no domain knowledge is needed.
  Disadvantages: rule extraction is difficult and time-consuming; product name synonymy; low degree of personalization.

Utility-based recommendation
  Advantages: no cold-start or sparsity problem; sensitive to changes in user preferences.
  Disadvantages: the user must input the utility function; recommendations are static and inflexible; attribute overlap problem.

Knowledge-based recommendation
  Advantages: can map user requirements to products.
  Disadvantages: knowledge of non-product attributes is difficult to obtain; recommendations are static.

Comparing user-based (UserCF) and item-based (ItemCF) recommendation

1. Definitions

UserCF: recommend the items liked by users who share the target user's interests.
ItemCF: recommend items similar to the ones the target user liked before.

User-based recommendation reflects what is popular within the small group of users whose interests resemble the target user's, while item-based recommendation focuses on the user's own historical interests. In other words, UserCF reflects how popular an item is within a group, whereas ItemCF reflects the individual's own interests and is more personalized.

2. Application scenarios

News sites tend to use UserCF for the following reasons:
1) Most users like hot news, so very fine-grained personalization can be ignored.
2) Personalized news recommendation puts more weight on hot topics: popularity and timeliness are the focus, and personalization matters less.
3) ItemCF has to maintain an item similarity table, and when items are updated too quickly this table is technically difficult to maintain. For new users, a news site can simply recommend hot news.

For e-commerce, music, book, and similar sites, ItemCF has the greater advantage:
1) Users' interests are relatively fixed and lasting.
2) Popularity does not need much consideration; it is enough to help users find items related to the fields they are interested in.

From a technical perspective, UserCF needs to maintain a user similarity matrix, while ItemCF needs to maintain an item similarity matrix.

3. Advantages and disadvantages

Performance
  UserCF: suits cases with relatively few users; with too many users, computing the user similarity matrix is expensive.
  ItemCF: suits cases where the number of items is clearly smaller than the number of users; with many items, computing the item similarity matrix is expensive.

Domain
  UserCF: timeliness requirements are high, and the demand for personalized interests is low.
  ItemCF: long-tail items are plentiful, and users' demand for personalization is strong.

Real-time behavior
  UserCF: a new user action does not necessarily change the recommendations immediately.
  ItemCF: a new user action changes the recommendations in real time.

Cold start
  UserCF: after a new user has acted on only a few items, personalized recommendations cannot be made immediately, because user similarity is computed offline. Some time after a new item goes online, as soon as one user acts on it, it can be recommended to other users.
  ItemCF: as soon as a new user acts on an item, related items can be recommended to that user. However, a new item cannot be recommended to users until the item similarity table has been updated offline.

Recommendation reasons
  UserCF: difficult to provide.
  ItemCF: reasons can be derived from the user's historical behavior.
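The split between an offline item similarity table and online scoring in ItemCF can be sketched as follows. The feedback data is hypothetical, and the similarity is a co-occurrence count normalised by item popularity, which is one common choice assumed for this example.

```python
import math
from collections import defaultdict

# Hypothetical implicit feedback: user -> set of items the user liked.
USER_ITEMS = {
    "u1": {"book_a", "book_b"},
    "u2": {"book_a", "book_b", "book_c"},
    "u3": {"book_b", "book_c"},
    "u4": {"book_a", "book_d"},
}


def item_similarity_table(user_items):
    """Offline step: cosine similarity between items from co-occurrence counts."""
    popularity = defaultdict(int)
    co_counts = defaultdict(lambda: defaultdict(int))
    for items in user_items.values():
        for i in items:
            popularity[i] += 1
            for j in items:
                if i != j:
                    co_counts[i][j] += 1
    return {
        i: {j: c / math.sqrt(popularity[i] * popularity[j]) for j, c in related.items()}
        for i, related in co_counts.items()
    }


def recommend_item_cf(user, user_items, table, top_n=2):
    """Online step: score unseen items by similarity to the user's past items."""
    seen = user_items[user]
    scores = defaultdict(float)
    reasons = {}
    for i in seen:
        for j, sim in table.get(i, {}).items():
            if j in seen:
                continue
            scores[j] += sim
            # Keep the most similar past item as the recommendation reason.
            if j not in reasons or sim > reasons[j][1]:
                reasons[j] = (i, sim)
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_n]
    return [(j, reasons[j][0]) for j in ranked]
```

With `table = item_similarity_table(USER_ITEMS)`, calling `recommend_item_cf("u1", USER_ITEMS, table)` returns each recommended book paired with the past item that best explains it, which is how ItemCF can supply a recommendation reason. An item that appears in no user's history has no row in the table, so it cannot be recommended until the table is rebuilt.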
