[Pattern Recognition] RankBoost for Learning to Rank


The concept of RankBoost is relatively simple. It follows the common idea of pairwise Learning to Rank: build a target classifier so that the two objects in each pair stand in a relative order. In plain terms, given an ordering such as r1 > r2 > r3 > r4, we can form the pairs (r1, r2), (r1, r3), (r1, r4), (r2, r3), (r2, r4), (r3, r4); each of these correctly ordered pairs is labeled +1, while reversed pairs such as (r2, r1) are labeled -1 (or 0). The ranking problem is thus cleverly converted into a classification problem. Recently, many people in the CV community have applied this learning-to-rank idea to recognition problems (the earliest was the paper "Person Re-Identification by Support Vector Ranking"), that is, converting recognition into ranking and then into classification.
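To make the pair construction concrete, here is a minimal Python sketch (the function name and representation are my own, for illustration only):

```python
from itertools import combinations

def make_pairs(ranking):
    """Build labeled pairs from a list ordered best-first.

    For an ordering r1 > r2 > ..., every correctly ordered pair
    (higher, lower) gets label +1 and its reversal gets label -1.
    """
    pairs = []
    for hi, lo in combinations(ranking, 2):
        pairs.append(((hi, lo), +1))   # hi should rank above lo
        pairs.append(((lo, hi), -1))   # reversed order
    return pairs

pairs = make_pairs(["r1", "r2", "r3", "r4"])
# 6 correctly ordered pairs labeled +1, plus their 6 reversals labeled -1
```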

Pairwise ranking methods are mainly RankSVM and RankBoost. Here we focus on RankBoost, which as a whole follows the Boost framework; the algorithm (as given in Freund et al., "An Efficient Boosting Algorithm for Combining Preferences") is:

  • Given: an initial distribution D over pairs (x0, x1), where x1 should be ranked above x0. Initialize D_1 = D.
  • For t = 1, ..., T:
    • Train a weak learner on distribution D_t to obtain a weak ranking h_t: X → [0, 1].
    • Choose a weight α_t.
    • Update: D_{t+1}(x0, x1) = D_t(x0, x1) · exp(α_t(h_t(x0) − h_t(x1))) / Z_t, where Z_t is a normalization factor.
  • Output the final ranking H(x) = Σ_t α_t · h_t(x).
Note that when the distribution is updated it is defined over pairs, which differs from conventional Boost. Also note that the final ranking score has no absolute meaning; only the relative order matters. For example, whether r1 and r2 end up scoring 10 and 1, or 100 and 1, there is little difference in information: either way we conclude that r1 should be placed before r2.
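The boosting loop described above can be sketched in Python with NumPy. This is an illustrative toy, not the author's code: the weak learner here is a hypothetical threshold stump on a single feature, and all names are my own.

```python
import numpy as np

def rankboost(X, D, T=10):
    """Minimal RankBoost sketch.
    X : (n, d) feature matrix.
    D : (n, n) pair weights; D[i, j] > 0 means x_j should rank above x_i.
    Returns a scoring function H(x) = sum_t alpha_t * h_t(x).
    """
    n, d = X.shape
    D = D / D.sum()
    stumps = []  # (feature, threshold, alpha)
    for _ in range(T):
        best = None
        # weak learner: pick the threshold stump maximizing |r|
        for f in range(d):
            for theta in np.unique(X[:, f]):
                h = (X[:, f] > theta).astype(float)  # weak ranking in {0, 1}
                # r = sum_{i,j} D[i,j] * (h(x_j) - h(x_i))
                r = float(np.sum(D * (h[None, :] - h[:, None])))
                if best is None or abs(r) > abs(best[0]):
                    best = (r, f, theta, h)
        r, f, theta, h = best
        r = np.clip(r, -0.999, 0.999)            # keep alpha finite
        alpha = 0.5 * np.log((1 + r) / (1 - r))  # closed-form weight
        stumps.append((f, theta, alpha))
        # reweight pairs: correctly ordered pairs are downweighted
        D = D * np.exp(alpha * (h[:, None] - h[None, :]))
        D = D / D.sum()

    def H(Xq):
        s = np.zeros(len(Xq))
        for f, theta, alpha in stumps:
            s += alpha * (Xq[:, f] > theta)
        return s
    return H

# Toy check: 4 points where a larger feature value should rank higher.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
D = np.array([[0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]], dtype=float)  # D[i, j] > 0: x_j above x_i
H = rankboost(X, D, T=5)
scores = H(X)
```

On this separable toy data the learned scores respect the target order, i.e. `scores` is increasing.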

Unlike the traditional Boost objective, the solution also requires a very clever trick. The key is to define the loss function of the ranker as the weighted fraction of misordered pairs:

rloss_D(H) = Σ_{(x0, x1)} D(x0, x1) · [[H(x1) ≤ H(x0)]]
Specifically, unfolding the distribution update over all T rounds, and using the inequality [[z ≥ 0]] ≤ e^z, the loss under distribution D can be bounded:

rloss_D(H) ≤ Π_t Z_t
Therefore, it suffices to greedily minimize each round's normalization factor:

Z_t = Σ_{(x0, x1)} D_t(x0, x1) · exp(α_t(h_t(x0) − h_t(x1)))
At this point the traditional Boost line-search strategy would already work, but there is a cleverer way. Define the quantity:

r = Σ_{(x0, x1)} D(x0, x1) · (h(x1) − h(x0))

Since h takes values in [0, 1], the difference x = h(x1) − h(x0) lies in the range [-1, 1], so by convexity e^{-αx} ≤ ((1 + x)/2)·e^{-α} + ((1 − x)/2)·e^{α}, and Z can be bounded:

Z ≤ ((1 + r)/2)·e^{-α} + ((1 − r)/2)·e^{α}

Minimizing the right-hand side over α gives the closed form α = (1/2)·ln((1 + r)/(1 − r)), at which point Z ≤ sqrt(1 − r²).
In this way, α can be written down directly instead of searched for, and minimizing Z is converted into the problem of choosing the weak learner that maximizes |r|.
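The closed form can be checked numerically. Below is a small NumPy sketch (function names are my own) that computes r, the optimal α, and verifies that Z stays under the sqrt(1 − r²) bound, assuming weak rankings h in [0, 1]:

```python
import numpy as np

def optimal_alpha(D, h):
    """Closed-form alpha for a weak ranking h in [0, 1].
    D[i, j] is the weight of the pair where x_j should rank above x_i."""
    # r = sum_{i,j} D[i,j] * (h(x_j) - h(x_i))
    r = float(np.sum(D * (h[None, :] - h[:, None])))
    alpha = 0.5 * np.log((1 + r) / (1 - r))
    return r, alpha

rng = np.random.default_rng(0)
n = 5
D = rng.random((n, n)); D /= D.sum()   # arbitrary normalized pair weights
h = rng.random(n)                       # weak scores in [0, 1]
r, alpha = optimal_alpha(D, h)

# Z at this alpha never exceeds the bound sqrt(1 - r^2)
Z = float(np.sum(D * np.exp(alpha * (h[:, None] - h[None, :]))))
bound = float(np.sqrt(1 - r * r))
```

Since |r| < 1 here, α is finite and `Z <= bound` holds up to floating-point error.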

The following is a piece of RankBoost code in MATLAB:

function [ rbf ] = RankBoost( X, Y, D, T )
%RankBoost implementation of the RankBoost algorithm
%   Input:
%       X - train set.
%       Y - train labels.
%       D - distribution function over X times X, in the form of a 2D matrix.
%       T - number of iterations of the boosting.
%   Output:
%       rbf - ranking function.
rbf = RankBoostFunc(T);
% w - the current distribution in any iteration, initialized to D
w = D;
for t=1:T
    tic;
    fprintf('RankBoost: creating the function, iteration %d out of %d\n', t, T);
    WL = getBestWeakLearner(X, Y, w);
    rbf.addWeakLearner(WL, t);
    rbf.addAlpha(WL.alpha, t);
    alpha = WL.alpha;

    % update the distribution:
    % evaluate the weak learner on the set of X and Y
    h = WL.eval(X);
    [hlen, ~] = size(h);
    tmph = (repmat(h,1,hlen) - repmat(h',hlen,1));
    w = w.*exp(tmph.*alpha);
    % normalize w
    w = w./sum(w(:));
    toc;
end
end

One obvious problem is that RankBoost needs to maintain a very large |X| × |X| matrix, so the program is very memory-consuming and often throws an Out of memory error. In particular, the line

tmph = (repmat(h,1,hlen) - repmat(h',hlen,1));

allocates yet another |X| × |X| temporary on top of w. Such an operation can instead be rewritten element by element:

    % tmph = (repmat(h,1,hlen) - repmat(h',hlen,1));
    % w = w.*exp(tmph.*alpha);
    [rows, cols] = size(w);
    sumw = 0;
    for r=1:rows
        for c=1:cols
            w(r,c) = w(r,c)*exp((h(r)-h(c))*alpha);
            sumw = sumw + w(r,c);
        end
    end
    % normalize w
    % w = w./sum(w(:));
    w = w./sumw;
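The same memory-saving idea carries over to NumPy: since exp(α(h_i − h_j)) factors as e^{αh_i} · e^{−αh_j}, the update can be applied row by row in place, using only O(n) temporaries instead of a second n × n matrix. A sketch (function name is my own):

```python
import numpy as np

def update_weights_inplace(w, h, alpha):
    """Apply w[i, j] *= exp(alpha * (h[i] - h[j])) in place,
    one row at a time, avoiding a full |X| x |X| temporary."""
    e = np.exp(alpha * h)        # length-n vector e[i] = exp(alpha * h[i])
    inv = 1.0 / e                # inv[j] = exp(-alpha * h[j])
    for i in range(len(h)):
        w[i, :] *= e[i] * inv    # row i scaled by e[i], column j by inv[j]
    w /= w.sum()                 # normalize
    return w

# Check against the full broadcast version on a small example.
rng = np.random.default_rng(1)
n = 4
w = rng.random((n, n)); w_ref = w.copy()
h = rng.random(n); alpha = 0.7
update_weights_inplace(w, h, alpha)
ref = w_ref * np.exp(alpha * (h[:, None] - h[None, :]))
ref /= ref.sum()
```

Both paths produce the same normalized distribution up to floating-point error.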




(Reprinted: please credit the author and source http://blog.csdn.net/xiaowei_cqu. Commercial use is not allowed.)
