Recommendation Algorithm Learning Notes

Source: Internet
Author: User

Here is a simple example of a recommendation algorithm at work. A user arrives and views a range of content, and we feed all of that user's historical behavior into the recommendation engine. The engine generates a personalized channel: the next time the user logs in, or even just a few minutes later, the content shown changes according to the user's recent behavior. This is the basic logic of a recommendation system. The approach is called recommendation based on user behavior, and it has limitations. For example, when you have only a single user's behavior, you cannot tell whether that user would be interested in something nobody has viewed before, which is essentially the long-tail problem. As you accumulate more users, their combined historical behavior helps you understand long-tail content.

The essence of a recommendation system is to solve the problem of information overload by connecting users and information. On the one hand, it helps users find information that is valuable to them; on the other hand, it puts information in front of the users who are interested in it. Both information consumers and information producers benefit. (The word "information" here is very broad and covers books, movies, commodities, and so on, collectively referred to as items.)

A recommendation system actually serves three different groups of stakeholders:

First, the user. The user wants to find what he or she is looking for more conveniently.

Second, the platform itself. The platform wants to connect service providers, content providers, and users, and it wants to make money.

Third, the content provider. With more exposure, a content provider gains clicks and/or brand recognition on the channel, and can then monetize in several ways, whether through advertising or through purchases in offline channels.

A recommendation algorithm therefore has to serve three parties with different interests, which in itself creates conflicts.

One solution is collaborative filtering, in which the two sides learn from each other through their interactions. This is a relatively standard method, and it can drive several kinds of recommendation rows:

"Top picks for you" is one of the most standard recommendation rows.

The second is a "you might also enjoy" row of similar items, which is also a recommendation row.

The third is subcategories. For example, the general categories of movies are romance, action, and so on, and each of these is divided into many smaller categories; a small category such as court cases sits under a main category such as action films. These subcategories can also be produced by the recommendation algorithm.

The recommendation engine can be built with deep learning. How effective a recommendation is depends largely on whether its presentation moves the user: you want to give the user a good reason to act. Deep learning is a tool that can do many things here, and mastering it gives you more flexibility.

Take shallow distributed representation models as an example. In the text field, shallow distributed representation models such as Word2vec and GloVe are widely used. In contrast to the traditional bag-of-words model, a word embedding model maps words or other units of information (such as phrases, sentences, and documents) to a low-dimensional latent space, in which each unit is represented by a dense feature vector. The basic idea comes from traditional distributional semantics, which holds that the meaning of a word is closely related to the context words around it. A word embedding model therefore uses the embedded representations to model the semantic association between the current word and its context words. Compared with multi-layer neural networks, word embedding models train very efficiently, work well in practice, and are reasonably interpretable, which is why they are so widely used.
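The link between a current word and its context words can be made concrete with a small sketch. The function below is illustrative only (the name `skipgram_pairs` and the `window` parameter are assumptions, not part of any library): it lists the (current word, context word) pairs that a skip-gram style model such as Word2vec would be trained on.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for every token and its neighbors."""
    pairs = []
    for idx, center in enumerate(tokens):
        start = max(0, idx - window)
        stop = min(len(tokens), idx + window + 1)
        for j in range(start, stop):
            if j != idx:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs(["users", "like", "good", "movies"], window=1)
# [('users', 'like'), ('like', 'users'), ('like', 'good'),
#  ('good', 'like'), ('good', 'movies'), ('movies', 'good')]
```

A real embedding model would then learn dense vectors so that words appearing in such pairs end up close together in the latent space.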

-----------------------------------------------------------------------------------------------

Recommendation based on context or features (context-aware recommendation)

As recommendation systems have developed, the information available to recommendation algorithms has become richer. For news recommendation, the item attributes may include the text of the news, keywords, and time, and the user data may include clicks, bookmarks, and browsing behavior. On an e-commerce website there may also be a large amount of review text, the user's browsing history, the user's purchase records, and so on. User feedback may also be available, and it generally falls into two kinds. Explicit feedback is information the user deliberately gives about a product or piece of information, such as ratings and comments. Implicit feedback is data generated as the user uses the site, which also reflects the user's preferences for items, such as viewing an item's details or the time spent on a page. For context-aware recommendation, feature-based recommendation algorithms such as SVD++, SVDFeature, and libFM can be used.
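As a rough illustration of what a feature-based model computes, here is a minimal sketch of the second-order factorization machine score, the model family that libFM implements. The feature layout (one index per user, item, or context feature) and all names such as `fm_predict` are assumptions made for this example; it does not reproduce libFM's actual interface.

```python
from typing import Dict, List


def fm_predict(x: Dict[int, float], w0: float, w: List[float],
               v: List[List[float]]) -> float:
    """Score a sparse feature vector x (feature index -> value).

    w0 is the global bias, w[i] the linear weight of feature i, and
    v[i] the latent vector of feature i.
    """
    linear = sum(w[i] * value for i, value in x.items())
    pairwise = 0.0
    for f in range(len(v[0])):
        # Sum over all feature pairs i < j of v[i][f] * v[j][f] * x_i * x_j,
        # computed as 0.5 * ((sum)^2 - sum of squares).
        total = sum(v[i][f] * value for i, value in x.items())
        total_sq = sum((v[i][f] * value) ** 2 for i, value in x.items())
        pairwise += 0.5 * (total * total - total_sq)
    return w0 + linear + pairwise


# Hypothetical layout: feature 0 = "user is u1", feature 2 = "item is A".
score = fm_predict({0: 1.0, 2: 1.0}, w0=0.1, w=[0.2, 0.0, 0.3],
                   v=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Because every feature gets its own latent vector, context such as time or review keywords can be added simply as extra feature indices.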

Complex recommendation tasks

Real-world recommendation often involves complex tasks. One example is session-based recommendation, in which the user makes a succession of actions and choices within a short period of time, so the system must consider both the user's overall interests and the behavior within that particular session. Solutions to this task are usually based on sequence models. Another complex task is page-based recommendation. The tasks mentioned above all produce a single list, but in practice the results often have to be laid out according to the user interface. On an e-commerce platform, for example, the question is how to place recommended products sensibly in the various parts of the page; possible strategies include grouping by category and highlighting personalized results in the focal areas. This task has received little attention in research, mainly because the relevant data is difficult to obtain.
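To give a feel for session-based recommendation, the sketch below uses about the simplest sequence model available: first-order transition counts between consecutive items. This is an assumption chosen for brevity, far simpler than the sequence models used in practice, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def fit_transitions(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count how often each item is immediately followed by another item."""
    transitions: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        for current_item, next_item in zip(session, session[1:]):
            transitions[current_item][next_item] += 1
    return transitions


def recommend_next(transitions: Dict[str, Counter], session: List[str],
                   top_n: int = 3) -> List[str]:
    """Rank unseen items by how often they followed the session's last item."""
    if not session:
        return []
    seen = set(session)
    counts = transitions.get(session[-1], Counter())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in seen][:top_n]


history = [["a", "b", "c"], ["a", "b", "d"], ["b", "c"]]
model = fit_transitions(history)
suggestions = recommend_next(model, ["a", "b"])  # ['c', 'd']
```

Only the last action in the session is used here; combining it with the user's long-term preferences is what makes the real task hard.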

Commonly used algorithms for related recommendations

1 Content-based related recommendations

Content-based recommendation generally relies on a good tagging system: the similarity between items is measured by the similarity between their tag sets. A good tagging system takes time to refine. On the one hand it needs good editors; on the other hand it relies on product design that guides users to provide high-quality tags for items as they use the product.
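One common way to compare tag sets is Jaccard similarity (shared tags divided by all tags). The text does not say which measure is used, so treat the sketch below, including the toy tags and function names, as an assumption for illustration.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of tags the two sets have in common (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_items(target: str, item_tags: Dict[str, Set[str]],
                  top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank the other items by tag-set similarity to the target item."""
    scores = [(other, jaccard(item_tags[target], tags))
              for other, tags in item_tags.items() if other != target]
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


# Toy tag data, made up for this example.
tags = {
    "m1": {"action", "crime"},
    "m2": {"action", "crime", "court"},
    "m3": {"romance"},
}
ranked = similar_items("m1", tags)  # m2 first (2 of 3 tags shared), then m3
```

The quality of the result depends entirely on the tags, which is why the tagging system matters so much.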

2 Related recommendations based on collaborative filtering

Collaborative filtering is mainly divided into neighborhood-based models and latent semantic models.

Among neighborhood-based algorithms, ItemCF is currently the most widely used in industry. Its main idea is that "users who like item A mostly also like item B": it mines users' historical operation logs and uses collective intelligence to generate a candidate list of items. The main ingredients are the co-occurrence frequency of pairs of items, a time factor, and filtering and down-weighting of highly active users and popular items.
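A minimal ItemCF sketch under stated assumptions: co-occurrences are down-weighted by 1/log(1 + number of items the user touched) to damp very active users, and similarities are divided by the square root of the two items' popularity to damp popular items. These are common choices rather than the only ones, the time factor is left out for brevity, and all names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set


def item_similarity(user_items: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """Co-occurrence similarity with penalties for active users and popular items."""
    co_counts: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    popularity: Dict[str, int] = defaultdict(int)
    for items in user_items.values():
        if not items:
            continue
        # Very active users contribute less to each co-occurrence they create.
        user_weight = 1.0 / math.log(1.0 + len(items))
        for item in items:
            popularity[item] += 1
        for a, b in combinations(sorted(items), 2):
            co_counts[a][b] += user_weight
            co_counts[b][a] += user_weight
    # Dividing by sqrt(|N(a)| * |N(b)|) keeps popular items from dominating.
    return {
        a: {b: c / math.sqrt(popularity[a] * popularity[b]) for b, c in related.items()}
        for a, related in co_counts.items()
    }


def recommend(user: str, user_items: Dict[str, Set[str]],
              sim: Dict[str, Dict[str, float]], top_n: int = 3) -> List[str]:
    """Score unseen items by their similarity to the items the user already has."""
    seen = user_items[user]
    scores: Dict[str, float] = defaultdict(float)
    for item in seen:
        for other, s in sim.get(item, {}).items():
            if other not in seen:
                scores[other] += s
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


logs = {"u1": {"A", "B"}, "u2": {"A", "B", "C"}, "u3": {"A"}}
sim = item_similarity(logs)
candidates = recommend("u3", logs, sim)  # ['B', 'C']
```

In this toy log, B is ranked above C for user u3 because two users who liked A also liked B, while only one also liked C.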

LFM (latent factor model) has been one of the most popular research topics in recommendation systems in recent years. It was first proposed in text mining to find the latent semantics of text; in recommendation, its core idea is to connect users and items through latent factors. The main algorithms are pLSA, LDA, and matrix factorization (SVD, SVD++), which are closely related in nature. Taking LFM as an example, user u's interest in item i is calculated as:

$$\mathrm{Preference}(u, i) = r_{ui} = p_u^{T} q_i = \sum_{k=1}^{K} p_{u,k}\, q_{i,k}$$

Here p_{u,k} and q_{i,k} are the parameters of the model: p_{u,k} measures the relationship between the interests of user u and the k-th latent class, while q_{i,k} measures the relationship between the k-th latent class and item i. q_{i,k} can be seen as projecting the item into the latent class space, so that item similarity becomes distance in that latent space.
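The parameters p_{u,k} and q_{i,k} are usually learned from observed interactions. The sketch below fits them with stochastic gradient descent on squared error plus L2 regularization, which is one common choice and an assumption here; the toy interest values (1.0 for liked, 0.0 for not liked), the hyperparameters, and the function names are all made up for illustration.

```python
import random
from typing import Dict, List, Tuple

Factors = Dict[str, List[float]]


def predict(p: Factors, q: Factors, user: str, item: str) -> float:
    """Interest of the user in the item: sum over k of p[u][k] * q[i][k]."""
    return sum(pk * qk for pk, qk in zip(p[user], q[item]))


def train_lfm(samples: List[Tuple[str, str, float]], n_factors: int = 2,
              lr: float = 0.05, reg: float = 0.01, epochs: int = 1000,
              seed: int = 0) -> Tuple[Factors, Factors]:
    """Fit user factors p and item factors q by SGD on squared error."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in samples})
    items = sorted({i for _, i, _ in samples})
    p = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, target in samples:
            err = target - predict(p, q, u, i)
            for k in range(n_factors):
                pu, qi = p[u][k], q[i][k]
                # Gradient step on (err^2 + reg * (pu^2 + qi^2)) / 2.
                p[u][k] += lr * (err * qi - reg * pu)
                q[i][k] += lr * (err * pu - reg * qi)
    return p, q


# Toy data: u1 and u2 like A but not B, u3 likes B but not A.
samples = [("u1", "A", 1.0), ("u1", "B", 0.0),
           ("u2", "A", 1.0), ("u2", "B", 0.0),
           ("u3", "A", 0.0), ("u3", "B", 1.0)]
p, q = train_lfm(samples)
interest = predict(p, q, "u1", "A")
```

After training, the rows of q can be compared directly (for example by cosine similarity or distance) to find related items in the latent space.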

Item2vec: Neural Item Embedding

1 Word2vec

In 2013, Google released the Word2vec tool. It attracted a great deal of attention, many internet companies followed up, and it produced many results. In 2016, Oren Barkan and Noam Koenigstein borrowed the idea of Word2vec and proposed Item2vec, which trains a shallow neural network with SGNS (skip-gram with negative sampling) to map items into a vector space of fixed dimension, so that the similarity between items can be measured by vector operations.

2 Item2vec

Inspired by the great success of Word2vec in NLP, Oren Barkan and Noam Koenigstein applied it to item-based CF, learning embedded representations of items in a low-dimensional latent space to improve related-item recommendations. The context of a word is the sequence of adjacent words, and a sequence of words is naturally analogous to the sequence of items a user acts on in succession. The training corpus therefore only needs to change from sentences to sequences of consecutively operated items: items that co-occur are positive samples, and negative samples are drawn according to the frequency distribution of items.
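The construction of the training data can be sketched as follows. Every pair of distinct items in the same sequence becomes a positive example, and negatives are drawn with probability proportional to item frequency raised to the power 0.75. The exponent follows the usual Word2vec convention, and excluding items from the same sequence when drawing negatives is a simplification; both are assumptions here, as are the function and parameter names.

```python
import random
from collections import Counter
from typing import List, Tuple


def build_sgns_examples(sequences: List[List[str]], n_negative: int = 2,
                        power: float = 0.75,
                        seed: int = 0) -> List[Tuple[str, str, int]]:
    """Return (target, context, label) triples: label 1 = positive, 0 = negative."""
    rng = random.Random(seed)
    freq = Counter(item for seq in sequences for item in seq)
    vocab = sorted(freq)
    examples = []
    for seq in sequences:
        in_sequence = set(seq)
        # Negative candidates: items the user did not touch in this sequence.
        candidates = [item for item in vocab if item not in in_sequence]
        weights = [freq[item] ** power for item in candidates]
        for target in seq:
            for context in seq:
                if context == target:
                    continue
                examples.append((target, context, 1))
                if candidates:
                    negatives = rng.choices(candidates, weights=weights, k=n_negative)
                    examples.extend((target, neg, 0) for neg in negatives)
    return examples


sequences = [["a", "b", "c"], ["a", "b"], ["d", "e"]]
examples = build_sgns_examples(sequences)
positives = [e for e in examples if e[2] == 1]  # 10 co-occurring pairs
```

An SGNS trainer would then push the vectors of positive pairs together and the vectors of negative pairs apart.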

Resources

https://www.ibm.com/developerworks/cn/web/1103_zhaoct_recommstudy1/index.html
