Preference Learning -- Object Ranking: Learning to Order Things

This post summarizes a paper published by Cohen et al. in 1999 in Artificial Intelligence (Class A), which addresses the problem of object ranking.

Abstract

In inductive learning, classification has attracted the most attention, but there is another class of problems, ranking learning, that is also very important. A ranking model can be built on top of a probabilistic classifier or a regression model. Ranking tasks can be easier to supervise than classification tasks, because preference information is easier to obtain than class labels. A few examples: ranking a user's incoming e-mail messages according to the user's characteristics so that they are ready to be read in order of interest; ranking films according to users' ratings and returning a recommendation list to the user; ranking pages in information retrieval by their relevance to the query; and ranking products in a recommender system according to users' ratings, where a user's ratings are really a preference relation (although each user's ratings express a different degree of preference).

Notation:
    • Object ranking: the training data are preference relations between samples, of the form "v precedes u"; the samples carry only feature data and no class labels.
    • X: the sample set {x1, x2, ..., xn}, where n is the number of samples.
    • f(u): f is an ordering function; f(u) > f(v) means that u is ranked before v. If f(u) is the special symbol ⊥, u cannot be ranked by f.
    • R_f: a preference function derived from f. R_f(u,v) = 1 means u is preferred, R_f(u,v) = 0 means v is preferred, and R_f(u,v) = 1/2 means u and v cannot be compared.
    • PREF(u,v): a weighted preference function taking values in [0,1]. The closer PREF(u,v) is to 1, the more confident we are that u should be ranked before v; the closer to 0, the more confident that v should be ranked before u; a value of 1/2 means the relative order of u and v is undetermined.
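
Written out (a reconstruction based on the definitions above), the preference function induced by an ordering function f is:

    R_f(u,v) =
    \begin{cases}
      1   & \text{if } f(u) > f(v) \\
      0   & \text{if } f(u) < f(v) \\
      1/2 & \text{otherwise (ties, or } f(u) = \bot \text{ or } f(v) = \bot)
    \end{cases}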

Consider an abstract example in which f and g are both ordering functions, each producing its own ordering of the instances. Convert f and g into R_f and R_g, and then combine them linearly to obtain a weighted preference function PREF.

A concrete example: given a collection of documents X, the attributes of each document are words and the attribute values are word frequencies, with n attributes {w1, w2, ..., wn} in total. f_i(u) is the frequency of the i-th word in document u, so R_{f_i} orders the documents by the value of the i-th attribute. Since different words have different importance, the preference functions must be weighted to obtain the final ranking.

An even more concrete example is a metasearch application: for a given query, combine the results of several search engines and then rank the pages. Suppose there are n search engines e1, e2, ..., en, and L_i is the ranked list of pages returned by e_i. f_i(u) = -k means that e_i ranks page u in position k, and f_i(u) = -m with m > |L_i| means that u does not appear in L_i, so its rank is set below every page in the list.
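
A minimal Python sketch of this setup (the page names, rank lists, and equal weights are made up for illustration; the weights would normally come from the Hedge algorithm of the next section): each engine's rank list is turned into an ordering function f_i, then into a preference function R_{f_i}, and the preference functions are combined linearly into PREF.

    # Sketch: turn search-engine rank lists into ordering/preference functions
    # and combine them linearly into a weighted preference function PREF.
    # The lists and weights below are illustrative, not from the paper.

    def make_ordering_function(rank_list):
        """f(u) = -k if u is ranked k-th (1-based); absent pages rank below the list."""
        position = {page: k for k, page in enumerate(rank_list, start=1)}
        m = len(rank_list) + 1            # anything absent gets rank m > |L_i|
        return lambda u: -position.get(u, m)

    def preference_from_ordering(f):
        """R_f(u, v) = 1 if f ranks u above v, 0 if below, 1/2 if tied."""
        def R(u, v):
            if f(u) > f(v):
                return 1.0
            if f(u) < f(v):
                return 0.0
            return 0.5
        return R

    def combined_pref(prefs, weights):
        """PREF(u, v) = sum_i w_i * R_i(u, v), with the weights summing to 1."""
        return lambda u, v: sum(w * R(u, v) for w, R in zip(weights, prefs))

    # Two hypothetical engines ranking the same small set of pages.
    L1 = ["pageA", "pageB", "pageC"]
    L2 = ["pageB", "pageA", "pageD"]
    R1 = preference_from_ordering(make_ordering_function(L1))
    R2 = preference_from_ordering(make_ordering_function(L2))
    PREF = combined_pref([R1, R2], weights=[0.5, 0.5])

    print(PREF("pageA", "pageB"))   # 0.5: the two engines disagree
    print(PREF("pageA", "pageD"))   # 1.0: both place pageA above pageD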

Linear Combination

Suppose we have several ranking experts, each of which produces an ordering function. The weight w_i of each ranking expert is updated incrementally. The learning process runs for T rounds; in round t the input is a training set X^t, each expert e_i produces an ordering function f_i^t (the ordering function given by e_i in round t), and X^t contains all the elements of L_i^t (the ranked list produced by e_i in round t). From f_i^t the preference function R_i^t (the preference function of e_i in round t) is derived, and a loss is computed.

    • Here F denotes the feedback information:

There are two ways of generating feedback:
1. The single relevant page is taken to precede all other pages.
2. Using the user's click data, a clicked (relevant) page is taken to precede all pages ranked above it.

Other ways of obtaining feedback:
1. Direct: ask the user to re-rank the pages produced by the ranking expert.
2. Indirect: re-rank the pages according to how long the user stays on each page.

    • This loss function can be read as the probability that the generated R disagrees with the feedback information.
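
Written out (a reconstruction following the loss used in the paper), the loss of a preference function R against feedback F is one minus the average of R over the feedback pairs:

    \mathrm{Loss}(R, F) = 1 - \frac{1}{|F|} \sum_{(u,v) \in F} R(u,v)

When R takes only the values 0 and 1, this is exactly the fraction of feedback pairs on which R disagrees with the feedback, matching the interpretation above.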

The Hedge algorithm is then applied. It maintains a positive weight vector w^t = (w_1^t, w_2^t, ..., w_N^t), with w_i^1 initialized to 1/N, i.e., every ranking expert starts with the same weight. In each round the weighted preference function PREF is computed as:
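
The formula below is a reconstruction consistent with the linear combination described earlier (the weights w_i^t are those maintained by the Hedge algorithm):

    \mathrm{PREF}^t(u,v) = \sum_{i=1}^{N} w_i^t \, R_i^t(u,v)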

    • The update formula for the weights w is as follows:

Here β is a parameter between 0 and 1, and Z^t is a normalization factor chosen so that the updated weights sum to 1.
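
The update below is the standard Hedge rule, reconstructed to match the description of β and Z^t above:

    w_i^{t+1} = \frac{w_i^t \, \beta^{\mathrm{Loss}(R_i^t, F^t)}}{Z^t},
    \qquad
    Z^t = \sum_{j=1}^{N} w_j^t \, \beta^{\mathrm{Loss}(R_j^t, F^t)}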

    • Pseudo-code for computing PREF with the Hedge algorithm:
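
A minimal Python sketch of this loop under the definitions above (the expert preference functions R_i^t and the feedback pairs are assumed to be supplied by the caller; the variable names are illustrative):

    # Sketch of the Hedge-based weight-allocation loop described above.
    # experts_prefs[t][i] is the preference function R_i^t of expert i in round t;
    # feedback[t] is the list of feedback pairs (u, v) observed in round t.

    def loss(R, F):
        """Average disagreement of preference function R with feedback pairs F."""
        return 1.0 - sum(R(u, v) for (u, v) in F) / len(F)

    def hedge(experts_prefs, feedback, beta=0.5):
        """Run the Hedge update over all rounds and return the final weights."""
        n = len(experts_prefs[0])
        w = [1.0 / n] * n                 # w_i^1 = 1/N: equal initial weights
        for Rs, F in zip(experts_prefs, feedback):
            # Here PREF(u, v) = sum_i w_i * R_i(u, v) would be formed from the
            # current weights and handed to the ordering step of the next section.
            losses = [loss(R, F) for R in Rs]
            # Multiplicative update by beta**loss, then normalize to sum to 1.
            w = [wi * beta ** li for wi, li in zip(w, losses)]
            Z = sum(w)
            w = [wi / Z for wi in w]
        return w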

Ordering Instances
    • Given the PREF produced by the Hedge algorithm, a ranking ρ must be computed. The first question is how to measure the quality of a ranking, for which the AGREE criterion is used:
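
A reconstruction of AGREE following the paper: with the convention that a larger ρ value means an earlier position, AGREE sums the PREF weight of every pair that the ranking ρ orders consistently,

    \mathrm{AGREE}(\rho, \mathrm{PREF}) = \sum_{u,v:\ \rho(u) > \rho(v)} \mathrm{PREF}(u,v)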

    • However, finding an ideal ranking that maximizes AGREE is NP-complete, so the paper proposes a greedy ordering algorithm:
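
A minimal Python sketch of this greedy procedure, as described in the paragraph below: repeatedly pick the node with the largest potential π, give it the highest remaining rank, remove it, and update π for the rest.

    # Greedy ordering sketch: PREF is viewed as a weighted directed graph whose
    # edge (u, v) has weight PREF(u, v).  The potential pi(v) is the total
    # outgoing weight minus the total incoming weight of v.  At each step the
    # node with the largest potential receives the highest remaining rank.

    def greedy_order(nodes, PREF):
        """Return rho: node -> rank, where rank |V| is first and rank 1 is last."""
        remaining = set(nodes)
        pi = {v: sum(PREF(v, u) - PREF(u, v) for u in remaining if u != v)
              for v in remaining}
        rho = {}
        rank = len(remaining)
        while remaining:
            t = max(remaining, key=lambda v: pi[v])   # largest potential
            rho[t] = rank
            rank -= 1
            remaining.remove(t)
            # Dropping t removes its contribution to every remaining potential.
            for v in remaining:
                pi[v] += PREF(t, v) - PREF(v, t)
        return rho

The worked example a few paragraphs below is a trace of exactly this loop.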

    • Algorithm performance:

In this algorithm, PREF can be regarded as a weighted directed graph in which the weight of edge (u,v) is PREF(u,v). The potential π(v) is the total outgoing PREF weight of v minus its total incoming PREF weight, and ranks are assigned in decreasing order: the first node chosen, v, receives ρ(v) = |V|, and the rank value decreases with each subsequent choice.

    • Look at a specific example:

      π(b) = 2, π(d) = 3/2, π(c) = -5/4, π(a) = -9/4  =>  ρ(b) = 4
      π(d) = 3/2, π(c) = -1/4, π(a) = -5/4  =>  ρ(d) = 3
      π(c) = 1/2, π(a) = -1/2  =>  ρ(c) = 2, ρ(a) = 1
      Ranking: b > d > c > a.
Improvements
    • See an example:

The left side shows the true PREF relation (2k+2 preference pairs).
The right side shows the ordering produced by the greedy algorithm, in which k+2 pairs agree with the true preference relation.
Approximation ratio: about 1/2, since (k+2)/(2k+2) approaches 1/2 as k grows.

    • Improved algorithm idea: find the strongly connected components of the graph.

The improved algorithm first makes a slight adjustment to PREF: for the two directed edges between any pair of nodes, it removes the smaller of PREF(u,v) and PREF(v,u) and sets the weight of the remaining edge to |PREF(u,v) - PREF(v,u)|; when PREF(u,v) = PREF(v,u) = 1/2, both edges are removed.

    • Look at a specific example:

Solution steps:
1. Adjust PREF to obtain a simplified graph.
2. Find the strongly connected components of the graph: {a,c,d} and {b}.
3. First order the components (if component 1 has an edge pointing to component 2, then component 1 is preferred): {b} > {a,c,d}.
4. Order the interior of each component with the earlier greedy algorithm: c > d > a.
The overall ranking is b > c > d > a.
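
A minimal Python sketch of this improved procedure (it reuses the greedy_order sketch from above; the simple reachability-based component search is an illustrative choice suited to small graphs, not the paper's implementation):

    # Sketch of the SCC-based improvement: simplify PREF to its stronger edges,
    # find strongly connected components, order the components, then run the
    # earlier greedy_order inside each component.

    def reachable(u, succ):
        """All nodes reachable from u in the simplified graph (including u)."""
        seen, stack = {u}, [u]
        while stack:
            x = stack.pop()
            for y in succ[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen

    def scc_order(nodes, PREF):
        nodes = list(nodes)
        # Simplified graph: keep u -> v only if PREF(u, v) > PREF(v, u);
        # edges with PREF(u, v) = PREF(v, u) = 1/2 disappear entirely.
        succ = {u: [v for v in nodes if v != u and PREF(u, v) > PREF(v, u)]
                for u in nodes}
        reach = {u: reachable(u, succ) for u in nodes}

        # Strongly connected components: u and v belong together iff each
        # reaches the other.
        comps, assigned = [], set()
        for u in nodes:
            if u in assigned:
                continue
            comp = {v for v in nodes if v in reach[u] and u in reach[v]}
            comps.append(comp)
            assigned |= comp

        # Order the components: a component that can reach more of the other
        # components comes first (a topological order of the component DAG).
        def precedes(c1, c2):
            return any(v in reach[u] for u in c1 for v in c2)
        comps.sort(key=lambda c: sum(precedes(c, d) for d in comps if d is not c),
                   reverse=True)

        # Rank inside each component with the earlier greedy procedure.
        ranking = []
        for comp in comps:
            rho = greedy_order(comp, PREF)
            ranking.extend(sorted(comp, key=lambda v: rho[v], reverse=True))
        return ranking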

Experimental Setup
    • Two sets of experiments are performed:

1. Small graphs: for the real ranking problem the optimal solution cannot be obtained, so small graphs whose optimal solutions can be found by brute force are used. The measure is the gap between the experimentally obtained ranking and the optimal ranking.
2. Large graphs: the optimal solution is unknown, so the measure is based on the total weight of all edges.

    • Algorithm comparison:

Besides the basic greedy algorithm and the improved greedy algorithm, a randomized algorithm is used as a reference: it generates a random permutation and outputs either that permutation or its reverse, whichever agrees better with PREF. In principle, running the randomized algorithm a very large number of times would eventually find the optimal solution, but in practice this amounts to a brute-force search and is not feasible.

Small Graphs Experiments
    • Experiment Settings

Each graph has at most nine nodes. For each node count, 10,000 random graphs are generated, with PREF(u,v) chosen at random and PREF(v,u) = 1 - PREF(u,v). For the randomized algorithm, the average performance over 10n random permutations is reported (n is the number of nodes).
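
A minimal sketch of how such a random PREF graph could be generated (the uniform choice of PREF(u,v) on [0,1] is an assumption made for illustration; the paper's exact distribution may differ):

    import random

    def random_pref(n, seed=None):
        """Random PREF over nodes 0..n-1 with PREF(v, u) = 1 - PREF(u, v)."""
        rng = random.Random(seed)
        table = {}
        for u in range(n):
            for v in range(u + 1, n):
                p = rng.random()
                table[(u, v)] = p
                table[(v, u)] = 1.0 - p
        return lambda u, v: table[(u, v)]

    # One random graph on 5 nodes; it can be fed directly to greedy_order or
    # scc_order from the earlier sketches.
    PREF = random_pref(5, seed=0)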

    • Evaluation method

The ρ in the numerator is the ranking predicted by the algorithm, and the ρ in the denominator is the optimal ranking. The value of this ratio measures prediction performance: the larger it is, the closer the prediction is to the optimal solution.
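
A hedged reconstruction of this measure: the AGREE value of the predicted ranking \hat{\rho} divided by the AGREE value of the brute-force optimal ranking \rho^*,

    \frac{\mathrm{AGREE}(\hat{\rho}, \mathrm{PREF})}{\mathrm{AGREE}(\rho^{*}, \mathrm{PREF})}

so a value of 1 means the learned ranking is as good as the optimum.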

    • Experimental results

For each node count, the measure is averaged over the 10,000 random graphs, giving the experimental results shown in the figure.
When the number of nodes is greater than 5, the greedy algorithm outperforms the randomized algorithm while having very low time complexity.

Large Graphs Experiments
    • Evaluation method

The numerator sums the PREF differences over the node pairs as ordered by the predicted ranking, and the denominator sums the PREF differences over all node pairs.

    • Experimental results

The greedy algorithm and the improved greedy algorithm have almost identical performance, mainly because a large random graph is almost always strongly connected. The greedy algorithm is again better than the randomized algorithm.

Experimental Results for Metasearch

This is a real ranking problem: a metasearch engine.
The previous sections described PREF and how to derive a ranking from it, but not how PREF itself is obtained; this section uses a practical example to learn PREF and then rank instances with it.
