Machine Learning with Python: Implementing SVD Decomposition



This article combines a recommendation algorithm with SVD, following the examples in Machine Learning in Action.

Any matrix can be factorized into the SVD form.

In essence, SVD maps the data into a new feature space. Below we introduce the basic concepts of SVD and then give the Python implementation. We start with a simple matrix describing the relationship between users and items (see the sketch below).
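To make this concrete, here is that simple user-item rating matrix as it appears in the code later on (it is the data returned by loadExData: rows are users, columns are items, and 0 means the user has not rated that item), together with a quick look at its singular values. This is a minimal sketch in plain NumPy rather than the book's numpy.mat style:

    import numpy as np

    # Rows = users, columns = items; 0 means the user has not rated that item.
    data = np.array([[0, 0, 0, 2, 2],
                     [0, 0, 0, 3, 3],
                     [0, 0, 0, 1, 1],
                     [1, 1, 1, 0, 0],
                     [2, 2, 2, 0, 0],
                     [5, 5, 5, 0, 0],
                     [1, 1, 1, 0, 0]], dtype=float)

    sigma = np.linalg.svd(data, compute_uv=False)
    print(sigma)   # only the first two singular values are significant:
                   # the matrix really contains just two underlying "tastes"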

One question I had here concerns the geometric meaning of the decomposition.

For such a matrix, the decomposition is Data = U * Sigma * V^T, where Sigma is a diagonal matrix holding the singular values.

The true geometric meaning of U and V, as the book puts it, is that U maps the items into the new feature space, while V^T maps the users into that same space.
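A minimal sketch of what that means in NumPy (the matrix values below are arbitrary illustration data, and the variable names are mine): the factorization reproduces Data exactly, and U can be used to project the items into a k-dimensional latent feature space, which is the same transformation svdEst performs in the code further down.

    import numpy as np

    # Arbitrary example matrix: 5 users (rows) x 4 items (columns).
    data = np.array([[5.0, 5.0, 0.0, 1.0],
                     [4.0, 4.0, 0.0, 0.0],
                     [0.0, 1.0, 5.0, 4.0],
                     [0.0, 0.0, 4.0, 5.0],
                     [1.0, 0.0, 5.0, 4.0]])

    U, sigma, VT = np.linalg.svd(data)
    Sigma = np.zeros(data.shape)
    np.fill_diagonal(Sigma, sigma)                 # rebuild the m x n diagonal matrix
    print(np.allclose(data, U @ Sigma @ VT))       # True: the product reproduces Data

    # Map the items (columns of Data) into a k-dimensional latent feature space
    # using U -- the same transformation svdEst applies below.
    k = 2
    item_features = data.T @ U[:, :k] @ np.linalg.inv(np.diag(sigma[:k]))
    print(item_features.shape)                     # (4, 2): 4 items, 2 latent features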

 

The code implementation follows. SVD can also be used for dimensionality reduction: the idea is to compare the singular values and retain only the largest ones, which account for most of the information in the data; a common heuristic for choosing how many to keep is sketched below.
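A common rule of thumb is to keep just enough singular values to cover roughly 90% of the total "energy", i.e. the sum of the squared singular values. The helper below and its name are mine, not part of the book's code:

    import numpy as np

    def num_singular_values(data, energy=0.9):
        """Smallest k such that the first k squared singular values
        cover at least `energy` of the total."""
        sigma = np.linalg.svd(data, compute_uv=False)
        sig2 = sigma ** 2
        return int(np.searchsorted(np.cumsum(sig2), energy * sig2.sum()) + 1)

Applied to the data returned by loadExData2() below, this kind of check is what justifies keeping only a small, fixed number of singular values; svdEst in the listing simply hardcodes 4.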

 

# -*- coding: cp936 -*-
'''
Created on Mar 8, 2011
@author: Peter
'''
from numpy import *
from numpy import linalg as la   # alias for the linear algebra module

# SVD is introduced here in the context of a recommendation system,
# so the data can be read as users' ratings of items (0 = not rated).
def loadExData():
    return [[0, 0, 0, 2, 2],
            [0, 0, 0, 3, 3],
            [0, 0, 0, 1, 1],
            [1, 1, 1, 0, 0],
            [2, 2, 2, 0, 0],
            [5, 5, 5, 0, 0],
            [1, 1, 1, 0, 0]]

def loadExData2():
    return [[0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 5],
            [0, 0, 0, 3, 0, 4, 0, 0, 0, 0, 3],
            [0, 0, 0, 0, 4, 0, 0, 1, 0, 4, 0],
            [3, 3, 4, 0, 0, 0, 0, 2, 2, 0, 0],
            [5, 4, 5, 0, 0, 0, 0, 5, 5, 0, 0],
            [0, 0, 0, 0, 5, 0, 1, 0, 0, 5, 0],
            [4, 3, 4, 0, 0, 0, 0, 5, 5, 0, 1],
            [0, 0, 0, 4, 0, 4, 0, 0, 0, 0, 4],
            [0, 0, 0, 2, 0, 2, 5, 0, 0, 1, 2],
            [0, 0, 0, 0, 5, 0, 0, 0, 0, 4, 0],
            [1, 0, 0, 0, 0, 0, 0, 1, 2, 0, 0]]

def ecludSim(inA, inB):
    # 2-norm of the difference vector, i.e. the Euclidean distance
    return 1.0 / (1.0 + la.norm(inA - inB))

def pearsSim(inA, inB):
    if len(inA) < 3:
        return 1.0
    # corrcoef computes the Pearson correlation coefficient directly
    return 0.5 + 0.5 * corrcoef(inA, inB, rowvar=0)[0][1]

def cosSim(inA, inB):
    # cosine similarity, rescaled from [-1, 1] to [0, 1]
    num = float(inA.T * inB)
    denom = la.norm(inA) * la.norm(inB)
    return 0.5 + 0.5 * (num / denom)

# Collaborative filtering: estimate the given user's rating for the given item.
# dataMat - user/item rating matrix, user - user index,
# simMeas - similarity measure, item - index of the item to score.
def standEst(dataMat, user, simMeas, item):
    n = shape(dataMat)[1]            # number of columns, i.e. number of items
    simTotal = 0.0; ratSimTotal = 0.0
    for j in range(n):
        userRating = dataMat[user, j]
        if userRating == 0:
            continue                 # the user has not rated item j
        # users who have rated both item `item` and item j
        overLap = nonzero(logical_and(dataMat[:, item].A > 0,
                                      dataMat[:, j].A > 0))[0]
        if len(overLap) == 0:
            similarity = 0
        else:
            # similarity between the two item columns, restricted to those users
            similarity = simMeas(dataMat[overLap, item], dataMat[overLap, j])
        print('the %d and %d similarity is: %f' % (item, j, similarity))
        simTotal += similarity
        ratSimTotal += similarity * userRating   # weight the rating by the similarity
    if simTotal == 0:
        return 0
    else:
        return ratSimTotal / simTotal

# Same estimate, but the items are first mapped into a low-dimensional space
# using the SVD from the library. (Implementing SVD by hand follows the usual
# procedure from matrix theory, but solving for the eigenvalues is painful.)
def svdEst(dataMat, user, simMeas, item):
    n = shape(dataMat)[1]
    simTotal = 0.0; ratSimTotal = 0.0
    U, Sigma, VT = la.svd(dataMat)                 # decompose
    Sig4 = mat(eye(4) * Sigma[:4])                 # arrange Sig4 as a diagonal matrix
    xformedItems = dataMat.T * U[:, :4] * Sig4.I   # items in the transformed space
    for j in range(n):
        userRating = dataMat[user, j]
        if userRating == 0 or j == item:
            continue
        similarity = simMeas(xformedItems[item, :].T,
                             xformedItems[j, :].T)
        print('the %d and %d similarity is: %f' % (item, j, similarity))
        simTotal += similarity
        ratSimTotal += similarity * userRating
    if simTotal == 0:
        return 0
    else:
        return ratSimTotal / simTotal

# The actual recommendation function; the last two arguments select the
# similarity measure and the estimation method.
def recommend(dataMat, user, N=3, simMeas=cosSim, estMethod=standEst):
    # nonzero(...)[1] gives the column indices of the items the user has not rated
    unratedItems = nonzero(dataMat[user, :].A == 0)[1]
    if len(unratedItems) == 0:
        return 'you rated everything'
    itemScores = []
    for item in unratedItems:
        estimatedScore = estMethod(dataMat, user, simMeas, item)
        itemScores.append((item, estimatedScore))
    return sorted(itemScores, key=lambda jj: jj[1], reverse=True)[:N]

# Extension example: using SVD for image compression.
# Print a 32x32 image as 0/1 pixels.
def printMat(inMat, thresh=0.8):
    for i in range(32):
        for k in range(32):
            if float(inMat[i, k]) > thresh:
                print(1, end=' ')
            else:
                print(0, end=' ')
        print('')

# Reconstruct the image from numSV singular values and compare with the original.
def imgCompress(numSV=3, thresh=0.8):
    myl = []
    for line in open('0_5.txt').readlines():
        newRow = []
        for i in range(32):
            newRow.append(int(line[i]))
        myl.append(newRow)
    myMat = mat(myl)                        # read the data into myMat
    print("****original matrix******")
    printMat(myMat, thresh)
    U, Sigma, VT = la.svd(myMat)
    SigRecon = mat(zeros((numSV, numSV)))   # empty numSV x numSV matrix
    for k in range(numSV):                  # build the diagonal matrix from the vector
        SigRecon[k, k] = Sigma[k]
    reconMat = U[:, :numSV] * SigRecon * VT[:numSV, :]
    print("****reconstructed matrix using %d singular values******" % numSV)
    printMat(reconMat, thresh)
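A possible way to exercise the functions above, following the book's example session (the variable name myMat is just a local choice):

    myMat = mat(loadExData2())
    print(recommend(myMat, 1, estMethod=svdEst))                    # SVD-based estimates
    print(recommend(myMat, 1, estMethod=svdEst, simMeas=pearsSim))  # with Pearson similarity
    print(recommend(myMat, 2))                                      # plain item-based estimates
    # imgCompress(2)  # requires the 32x32 image file '0_5.txt' in the working directory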

Running imgCompress shows that the image reconstructed from only a few singular values is very close to the original.

 

 
