Reverse-engineering the recommendation mechanism from the YouTube algorithm paper


Last year, a research team from Google presented a deep learning paper on the YouTube recommendation system at the 10th ACM Conference on Recommender Systems (RecSys '16) in Boston: Deep Neural Networks for YouTube Recommendations

The authors are Google software engineer Jay Adams and senior software engineers Paul Covington and Emre Sargin, who showed the industry how YouTube uses deep neural networks in its machine learning recommendation algorithms. The paper covers some very technical, very advanced details, but its greatest significance is that it gives us a complete picture of how the YouTube recommendation algorithm works! It describes in extensive detail how YouTube searches, filters, and recommends videos.

Analyzing the algorithm paper from a creator's perspective

YouTube's engineers published this algorithm paper at an ACM conference, and its target audience is obviously not us creators. But for the sake of traffic, it is our job to read and understand the algorithm and put it to work for creators. Next, let's see how to parse the paper from a creator's perspective.

Before the paper was published, we wrote an article analyzing the YouTube algorithm (see AI Technology Base: "Want the video site to help push your content? See how this guy matches wits with YouTube"), which focused on watch time. Because we could only reverse-engineer how YouTube works from the data of the videos we uploaded, that analysis was bound to be limited to our own content and audience. We wanted to understand YouTube's algorithm in order to answer a question that came up while making videos: "Why is our video so successful?" We did our best to analyze the information we had, but the initial results were not ideal. Although I still stand 100% behind our conclusions, our previous approach had two major problems:

    • Using only a subset of channel metrics left a huge blind spot in our data; after all, we cannot access competitor metrics, session metrics, or click-through rates.

    • The YouTube algorithm gives very little weight to creator-level metrics. It cares more about the audience and about individual video metrics. In other words, the algorithm does not care about the videos you upload; it cares about the videos you and others watch.

But when we wrote the original article, neither YouTube nor Google had published any information about the algorithm for several years, so we had to work it out ourselves. With Google's newly published paper, we can get a glimpse of the recommendation mechanism and identify its key metrics. Hopefully this will answer a more pointed question: "Why do some videos succeed?"

Deep learning is a bottomless pit.

The biggest highlight of the paper's introduction is that YouTube really is using deep learning to drive its recommendation algorithm. The practice is not new, but the paper confirms what had previously been speculation. This is what the authors say at the beginning of the paper:

In this paper, we focus on the immense impact deep learning has had on the YouTube video recommendation system ... Like other product areas across Google, YouTube has undergone a fundamental paradigm shift toward using deep learning as a general-purpose solution for nearly all learning problems.

This means that there will be fewer opportunities in the future for people to manually adjust the algorithm, manually weigh those adjustments, and deploy them to the world's largest video-sharing site. Instead, the algorithm reads the data in real time, ranks the videos, and then recommends videos based on those rankings. So when YouTube says it does not know why the algorithm did something, it may genuinely not know.

Two big neural networks

The paper begins with the basic architecture of the algorithm, illustrated by the authors' diagram:

[Figure: architecture of the recommendation system, from the paper]

This is essentially two large filters, each with different inputs. The author writes:

The system consists of two large neural networks, one for generating candidate videos and one for ranking them.

The two filters and their inputs basically determine every video that the user sees on YouTube: the "up next" video, the list of recommendations beside the video you are watching, the list of videos you browse ...

The first filter is the candidate generator. The paper explains that candidates are chosen based on the user's YouTube activity, that is, the user's viewing history and viewing duration. The candidate generator also considers the viewing history of similar users, which is called collaborative filtering. The algorithm determines which users are similar from video IDs, search keywords, and related user statistics.
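The idea of collaborative filtering can be illustrated with a minimal sketch. This is not the paper's neural network: it assumes a toy dictionary of watch histories and measures user similarity with simple Jaccard overlap. All names here (`watch_history`, `jaccard`, `generate_candidates`) are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: user -> set of watched video IDs.
watch_history = {
    "alice": {"v1", "v2", "v3"},
    "bob": {"v2", "v3", "v4"},
    "carol": {"v7", "v8"},
}


def jaccard(a, b):
    """Overlap between two watch histories, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def generate_candidates(user, histories, top_n=10):
    """Suggest unseen videos, weighted by how similar their viewers are to `user`."""
    seen = histories[user]
    scores = Counter()
    for other, videos in histories.items():
        if other == user:
            continue
        similarity = jaccard(seen, videos)
        if similarity == 0.0:
            continue  # users with nothing in common contribute no candidates
        for video in videos - seen:
            scores[video] += similarity
    return [video for video, _ in scores.most_common(top_n)]


print(generate_candidates("alice", watch_history))  # ['v4']
```

Here "v4" becomes a candidate for alice only because bob, whose history overlaps hers, has watched it; carol's videos never appear because she shares nothing with alice.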

The candidate generator's pass rate is only 1%. In other words, for a video to stand out and become one of your candidates, it must be related to your viewing history, and a user similar to you must have watched it.

The second filter is the ranking filter. The paper analyzes the ranking filter in depth and lists many interesting factors. The authors describe how the ranking filter ranks videos:

Using a rich set of features describing the video and the user, an objective function assigns a score to each video. The highest-scoring videos are shown to the user, ranked by score.

Since watch time is the primary goal YouTube has set, we have to assume that this is what the objective function measures. So, given the various user inputs, the score expresses how much user watch time a video is expected to produce. Unfortunately, it is not that simple. According to the authors, the algorithm also takes many other factors into account.

We have used hundreds of features in the ranking filter.
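If we collapse those hundreds of features into a single predicted watch time per video, the scoring step could be sketched as follows. The `Candidate` type and the numbers are made up for illustration; in the real system the prediction would come from the ranking network.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    video_id: str
    predicted_watch_seconds: float  # assumed output of a ranking model


def rank(candidates, limit=3):
    """Sort candidates by score, highest first, and keep the top `limit`."""
    ordered = sorted(candidates, key=lambda c: c.predicted_watch_seconds, reverse=True)
    return [c.video_id for c in ordered[:limit]]


candidates = [
    Candidate("cat_clip", 45.0),
    Candidate("lecture", 600.0),
    Candidate("trailer", 90.0),
    Candidate("vlog", 240.0),
]
print(rank(candidates))  # ['lecture', 'vlog', 'trailer']
```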

The mathematics behind how videos are ranked is very complicated. The paper does not detail the hundreds of factors used in the ranking filter, nor does it say how they are weighted. But it does list three major factors (viewing history, search history, and related user statistics) along with other video elements, including freshness.

A large amount of video is uploaded to YouTube every second. It is extremely important for YouTube to recommend this newly uploaded, fresh content to users. We consistently observe that users prefer fresh content, though not at the expense of relevance.

One interesting point in the paper is that the algorithm is not always influenced by the video the user just watched, unless the user's viewing history is extremely limited.

We first use the user's randomly sampled watches and keyword search history, and only then consider data from the previously watched video.

Later in the paper, when discussing video thumbnails and titles, the authors mention click-through rate (CTR):

For example, a user may be very likely to watch a recommended video in general, but unlikely to click on it on the homepage because of the choice of thumbnail ... Our final ranking is constantly tuned based on the results of live A/B tests, and is basically a simple function that predicts the user's viewing duration.

The mention of click-through rate here is not unexpected. To generate more watch time, a video must first be seen, and the best way to achieve that is to make a great thumbnail and write a great title. This leads many creators to believe that CTR is extremely important for a video's ranking in the algorithm.

But YouTube knows that click-through rates can be artificially inflated, so it has a countermeasure. This is what the paper says:

Ranking by click-through rate often promotes deceptive videos that users click on but rarely finish, whereas watch time better reflects whether a video is good or bad.
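A small worked example shows why ranking by click-through rate alone rewards clickbait. The numbers below are invented, and multiplying CTR by average watch time is only a rough stand-in for expected watch time per impression, not the paper's actual model.

```python
# Hypothetical stats per video: (click-through rate, average seconds watched after a click).
videos = {
    "clickbait": (0.20, 10.0),
    "solid_tutorial": (0.05, 300.0),
}


def expected_watch_per_impression(ctr, avg_watch_seconds):
    """Expected seconds watched each time the thumbnail is shown."""
    return ctr * avg_watch_seconds


by_ctr = sorted(videos, key=lambda v: videos[v][0], reverse=True)
by_watch = sorted(
    videos, key=lambda v: expected_watch_per_impression(*videos[v]), reverse=True
)

print(by_ctr)    # ['clickbait', 'solid_tutorial']
print(by_watch)  # ['solid_tutorial', 'clickbait']
```

The clickbait video wins on clicks (20% versus 5%) but yields only about 2 seconds of viewing per impression, against about 15 seconds for the tutorial, so a watch-time objective reverses the order.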

At least this mechanism is encouraging (compared with the content mechanisms of certain domestic websites). The authors go on to write:

If a user was recently recommended a video but did not watch it, the model will automatically demote that video the next time the page loads.

This means that if a user does not click on a specific video, the algorithm will stop recommending it to similar users. The same applies to channel recommendations; the paper offers the following evidence:

The most important signals we observe are those describing the user's previous interaction with the video itself and with other similar videos. For example, consider the user's history with the channel that uploaded the video being scored: how many of the channel's videos has the user watched? When did the user last watch a video on a similar topic? Features of this kind, describing the user's past activity, are particularly powerful ...

In addition, the paper points out that during training the algorithm takes into account every way YouTube videos are watched, including views that do not come from the recommendation algorithm:

Training data is generated from all watches of YouTube videos (including those embedded in other pages), rather than only from watches of the recommendations we produce. Otherwise, it would be difficult for new content to surface in recommendations, and the recommendation system would depend too heavily on past video data. If users discover videos by means other than our recommendations, we want to be able to quickly propagate that discovery to other users through the recommendation system.

In the end, it all comes back to the watch time used by the algorithm. As we saw earlier in the paper, the algorithm was designed from the start around an "objective function", and the authors conclude that "our goal is to predict the user's watch time" and that "our final ranking is constantly adjusted according to the results of live A/B tests, and is basically a simple function that predicts the user's viewing duration."

This once again illustrates how important watch time is to the algorithm, which is designed so that YouTube has more and longer videos, and users spend longer watching them.

A simple review

Having covered all that, let's briefly review:

    • YouTube uses three main viewing factors to recommend videos, which are the user's viewing history, search history, and related user statistics.

    • Recommended videos are filtered by the candidate generator and the ranking filter, which determine how YouTube reads, filters, and generates a list of recommendations.

    • The ranking filter is primarily based on user input factors; other factors include the "freshness" and CTR of the video.

    • The recommendation algorithm is designed to keep increasing users' watch time on YouTube by continuously feeding the results of live A/B tests back into the neural networks, so that YouTube can keep recommending videos that users will watch. It is basically a simple function that predicts the user's viewing duration.

If you don't understand, let's give another example.

We use an example to illustrate how this recommendation system works:

Xiaoming likes YouTube and has a YouTube account. Every day when he browses YouTube, he logs in in his browser. Once he is logged in, YouTube creates three tokens for what Xiaoming views: his watch history, his search history, and statistics about him. Xiaoming may not even know that these three kinds of data exist.

Then it is the candidate generator's turn. YouTube compares the values of these three tokens with those of users whose watch histories are similar to Xiaoming's, sifting out the hundreds of videos that Xiaoming might like and filtering out the millions of other videos in the YouTube library.

Next, these videos are sorted by the ranking algorithm according to their relevance to Xiaoming. When sorting, the algorithm considers questions such as: How likely is Xiaoming to open this video? Is this video likely to make Xiaoming spend more time on YouTube? How fresh is this video? How active has Xiaoming been on YouTube recently? There are hundreds of other questions.

After the YouTube algorithm has read, filtered, and ranked, the highest-ranked videos are recommended to Xiaoming. Data on which videos Xiaoming watches and does not watch is then fed back to the neural network for later use by the algorithm. Videos that he opens and that keep him on YouTube longer continue to be recommended. Recommended videos that he did not open may fail to pass the candidate generator the next time he logs in to the site.
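Xiaoming's walk-through can be condensed into a toy pipeline: generate candidates, rank them, show the top results, and demote whatever was shown but not watched. Everything here is hypothetical (the catalog, the relevance and predicted watch numbers, the 0.5 demotion factor, and the choice to apply the demotion at the ranking stage). The real system uses neural networks for both stages.

```python
# Hypothetical catalog: video -> (relevance to Xiaoming, predicted watch seconds).
catalog = {
    "a": (0.9, 120.0),
    "b": (0.8, 200.0),
    "c": (0.7, 110.0),
    "d": (0.1, 900.0),
}


def candidate_generation(catalog, min_relevance=0.5):
    """Stage 1: keep only videos relevant enough to the user."""
    return [v for v, (relevance, _) in catalog.items() if relevance >= min_relevance]


def ranking(candidates, catalog, penalty, top_k=2):
    """Stage 2: order candidates by predicted watch time times any demotion penalty."""
    def score(video):
        return catalog[video][1] * penalty.get(video, 1.0)
    return sorted(candidates, key=score, reverse=True)[:top_k]


def feedback(shown, watched, penalty, factor=0.5):
    """Demote videos that were shown but not watched."""
    for video in shown:
        if video not in watched:
            penalty[video] = penalty.get(video, 1.0) * factor


penalty = {}
first = ranking(candidate_generation(catalog), catalog, penalty)
feedback(first, watched={"a"}, penalty=penalty)  # Xiaoming ignores "b"
second = ranking(candidate_generation(catalog), catalog, penalty)

print(first)   # ['b', 'a']
print(second)  # ['a', 'c']
```

Video "d" has the longest predicted watch time but never reaches the ranking stage because it is not relevant to Xiaoming, and "b" drops out of the second page of recommendations after he ignores it.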


The paper "Deep Neural Networks for YouTube Recommendations" is a great read, and it is the first time anyone has gotten an inside look at YouTube's recommendation algorithm straight from the source! We hope to see more papers like it, so that we can make better choices when creating content for this platform. That is the underlying reason we are willing to take the time to write these things. After all, content better suited to the platform means more views and higher revenue, which gives us more resources to produce higher-quality, more compelling content for an audience of as many as a billion users.
