The ALS matrix factorization recommendation model
Using this model to predict a user's rating of an item follows the same idea as making predictions with linear regression, roughly:
Define a prediction model (a mathematical formula),
then determine a loss function,
use the existing data as a training set,
iterate to minimize the value of the loss function,
and finally substitute the learned parameters into the prediction model to make predictions.
The prediction model for matrix factorization is:
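(The formula image from the original post does not survive here. A conventional form of the matrix factorization prediction model, consistent with the "Matrix Factorization Techniques for Recommender Systems" reference listed at the end, is reconstructed below; p_u and q_i are the user and item feature vectors discussed later.)

\hat{r}_{ui} = q_i^T p_u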
The loss function is:
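(This formula image is also missing; the usual regularized squared-error objective for this model, again following the Koren et al. reference, is:)

\min_{P, Q} \sum_{(u,i) \in \kappa} \left( r_{ui} - q_i^T p_u \right)^2 + \lambda \left( \lVert q_i \rVert^2 + \lVert p_u \rVert^2 \right)

where \kappa is the set of (user, item) pairs with known ratings and \lambda is the regularization coefficient.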
We simply minimize this loss function to obtain the parameters P and Q.
The physical meaning of the matrix factorization model
We want to learn a matrix P whose rows represent user features and a matrix Q whose rows represent item features. Each dimension of a feature vector represents a latent factor; for movies, these might correspond to director, actor, and so on. Of course, these latent factors are learned by the machine, and we cannot say exactly what each one means.
After learning P and Q, we can predict every user's rating for every item by simply multiplying P by Q.
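As a concrete illustration (this is not part of the Spark examples discussed below, and the names PredictSketch, userFeatures, itemFeatures and predict are made up for this sketch), here is a minimal Scala example of the prediction step once the feature vectors have been learned:

    object PredictSketch {
      // One row of P (a user's latent-factor vector) and one row of Q (an item's).
      type Factors = Array[Double]

      // r_hat(u, i) = p_u dot q_i
      def predict(pu: Factors, qi: Factors): Double =
        pu.zip(qi).map { case (a, b) => a * b }.sum

      def main(args: Array[String]): Unit = {
        val userFeatures: Factors = Array(0.8, 0.1, 0.5) // p_u with F = 3 latent factors
        val itemFeatures: Factors = Array(0.9, 0.2, 0.4) // q_i
        println(f"predicted rating = ${predict(userFeatures, itemFeatures)}%.3f")
      }
    }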
That covers the matrix factorization recommendation model; now let's turn to ALS (Alternating Least Squares). ALS is simply one way to minimize the loss function above; there are other methods, such as SGD.
The loss function in the ALS paper is slightly different from the one above:
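(The formula image is missing here; the weighted-λ-regularization objective, reconstructed from the ALS-WR paper cited in the resources, is:)

f(U, M) = \sum_{(i,j) \in I} \left( r_{ij} - u_i^T m_j \right)^2 + \lambda \left( \sum_i n_{u_i} \lVert u_i \rVert^2 + \sum_j n_{m_j} \lVert m_j \rVert^2 \right)

where I is the set of known ratings, and n_{u_i} and n_{m_j} are the number of ratings made by user i and received by movie j, respectively.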
In each iteration:
Fix M and update each user's feature vector u (take the partial derivative with respect to u, set it to zero, and solve).
Fix U and update each movie's feature vector m (take the partial derivative with respect to m, set it to zero, and solve).
The derivation in the paper proceeds exactly this way, setting the partial derivative to zero and solving:
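(The derivation figure is missing from this copy; the resulting closed-form update for each user, reconstructed from the ALS-WR paper, is shown below.)

u_i = A_i^{-1} V_i, \qquad A_i = M_{I_i} M_{I_i}^T + \lambda n_{u_i} E, \qquad V_i = M_{I_i} R^T(i, I_i)

Here I_i is the set of movies rated by user i, M_{I_i} is the sub-matrix of M whose columns correspond to those movies, n_{u_i} = |I_i|, E is the f x f identity matrix, and R(i, I_i) is the row vector of user i's ratings for those movies.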
This is the formula used to update U in each iteration; the formula for M is analogous.
To make this clearer, let's walk through Spark's ALS code.
There are three implementations of ALS in the Spark source: LocalALS.scala (no Spark), SparkALS.scala (using Spark for parallelization), and the ALS in MLlib.
LocalALS.scala and SparkALS.scala are official examples meant to show developers how to use Spark,
while the ALS in MLlib is the one intended for real recommendation work.
However, the MLlib version has been heavily optimized and is not well suited for beginners trying to understand the ALS algorithm.
So I will use LocalALS.scala and SparkALS.scala to explain ALS.
LocalALS.scala
    // Iteratively update movies, then users
    for (iter <- 1 to iterations) {
      println(s"Iteration $iter:")
      // Fix the users and update the features of every movie, one by one
      ms = (0 until M).map(i => updateMovie(i, ms(i), us, R)).toArray
      // Fix the movies and update the features of every user, one by one
      us = (0 until U).map(j => updateUser(j, us(j), ms, R)).toArray
      println("RMSE = " + rmse(R, ms, us))
      println()
    }
Update the feature vector of the j-th user:

    def updateUser(j: Int, u: RealVector, ms: Array[RealVector], R: RealMatrix): RealVector = {
      var XtX: RealMatrix = new Array2DRowRealMatrix(F, F) // F is the number of latent factors
      var Xty: RealVector = new ArrayRealVector(F)
      // For each movie that the user rated. The example simply assumes this user rated
      // every movie, hence 0 until M; in a real application you would only iterate over
      // the movies this user actually rated.
      for (i <- 0 until M) {
        val m = ms(i)
        // Add m * m^T to XtX: the outer product of two vectors (one treated as a column
        // vector, one as a row vector) is a matrix, which is accumulated into XtX.
        XtX = XtX.add(m.outerProduct(m))
        // Add m * rating to Xty
        Xty = Xty.add(m.mapMultiply(R.getEntry(i, j)))
      }
      // Add the regularization coefficient to the diagonal entries
      for (d <- 0 until F) {
        XtX.addToEntry(d, d, LAMBDA * M)
      }
      // Solve with a Cholesky decomposition; this is really just solving A * x = b
      new CholeskyDecomposition(XtX).getSolver.solve(Xty)
    }
Comparing this with the update formula from the paper: XtX in the code corresponds to the matrix A_i = M_{I_i} M_{I_i}^T + \lambda n_{u_i} E on the left-hand side, and Xty corresponds to the vector V_i = M_{I_i} R^T(i, I_i) on the right-hand side, so the Cholesky solve computes u_i = A_i^{-1} V_i.
Updating each movie's feature vector m is similar and is not repeated here.
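The iteration loop above also calls an rmse helper. For completeness, here is a sketch of what that helper does, written to match the structure of the example (the exact code in LocalALS.scala may differ slightly): it reconstructs the full predicted rating matrix from ms and us and measures the root-mean-square error against R.

    // Sketch only: assumes the same Commons Math imports and the globals M, U
    // used by the example (e.g. import org.apache.commons.math3.linear._).
    def rmse(targetR: RealMatrix, ms: Array[RealVector], us: Array[RealVector]): Double = {
      val r = new Array2DRowRealMatrix(M, U)
      // The predicted rating for (movie i, user j) is the dot product of their feature vectors
      for (i <- 0 until M; j <- 0 until U) {
        r.setEntry(i, j, ms(i).dotProduct(us(j)))
      }
      val diffs = r.subtract(targetR)
      var sumSqs = 0.0
      for (i <- 0 until M; j <- 0 until U) {
        val diff = diffs.getEntry(i, j)
        sumSqs += diff * diff
      }
      math.sqrt(sumSqs / (M.toDouble * U.toDouble))
    }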
SparkALS.scala
    for (iter <- 1 to iterations) {
      println(s"Iteration $iter:")
      ms = sc.parallelize(0 until M, slices)
        .map(i => update(i, msb.value(i), usb.value, Rc.value))
        .collect()
      msb = sc.broadcast(ms) // Re-broadcast ms because it was updated
      us = sc.parallelize(0 until U, slices)
        .map(i => update(i, usb.value(i), msb.value, Rc.value.transpose()))
        .collect()
      usb = sc.broadcast(us) // Re-broadcast us because it was updated
      println("RMSE = " + rmse(R, ms, us))
      println()
    }
Compared with LocalALS, the highlight of the SparkALS version is parallelization: in LocalALS each user's features are updated serially, while in SparkALS they are updated in parallel. The 0 until U range is distributed across the cluster with sc.parallelize, and the current ms and us arrays are shared with the workers as broadcast variables.
Resources:
"Large-scale Parallel collaborative Filtering for the Netflix Prize" (ALS-WR original paper)
"Matrix factorization Techniques for recommender Systems" (good material for matrices decomposition models)
Https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/LocalALS.scala
Https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/SparkALS.scala
Author: linger
This article link: http://blog.csdn.net/lingerlanlan/article/details/44085913