Machine Learning with Python: Implementing SVD Decomposition



This article combines a recommendation algorithm with SVD, following the examples in Machine Learning in Action.

Any matrix can be factorized into the SVD form.

In essence, SVD maps the data into a new feature space. Below we introduce the basic concepts of SVD and then give the Python implementation. We start with a simple matrix describing the relationship between users and items (see the sketch below).
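To make this concrete, here is that simple user-item rating matrix as it appears in the code later on (it is the data returned by loadExData: rows are users, columns are items, and 0 means the user has not rated that item), together with a quick look at its singular values. This is a minimal sketch in plain NumPy rather than the book's numpy.mat style:

    import numpy as np

    # Rows = users, columns = items; 0 means the user has not rated that item.
    data = np.array([[0, 0, 0, 2, 2],
                     [0, 0, 0, 3, 3],
                     [0, 0, 0, 1, 1],
                     [1, 1, 1, 0, 0],
                     [2, 2, 2, 0, 0],
                     [5, 5, 5, 0, 0],
                     [1, 1, 1, 0, 0]], dtype=float)

    sigma = np.linalg.svd(data, compute_uv=False)
    print(sigma)   # only the first two singular values are significant:
                   # the matrix really contains just two underlying "tastes"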

One question I had here concerns the geometric meaning of the decomposition.

For such a matrix, the decomposition is Data = U * Sigma * V^T, where Sigma is a diagonal matrix holding the singular values.

The true geometric meaning of U and V, as the book puts it, is that U maps the items into the new feature space, while V^T maps the users into that same space.
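A minimal sketch of what that means in NumPy (the matrix values below are arbitrary illustration data, and the variable names are mine): the factorization reproduces Data exactly, and U can be used to project the items into a k-dimensional latent feature space, which is the same transformation svdEst performs in the code further down.

    import numpy as np

    # Arbitrary example matrix: 5 users (rows) x 4 items (columns).
    data = np.array([[5.0, 5.0, 0.0, 1.0],
                     [4.0, 4.0, 0.0, 0.0],
                     [0.0, 1.0, 5.0, 4.0],
                     [0.0, 0.0, 4.0, 5.0],
                     [1.0, 0.0, 5.0, 4.0]])

    U, sigma, VT = np.linalg.svd(data)
    Sigma = np.zeros(data.shape)
    np.fill_diagonal(Sigma, sigma)                 # rebuild the m x n diagonal matrix
    print(np.allclose(data, U @ Sigma @ VT))       # True: the product reproduces Data

    # Map the items (columns of Data) into a k-dimensional latent feature space
    # using U -- the same transformation svdEst applies below.
    k = 2
    item_features = data.T @ U[:, :k] @ np.linalg.inv(np.diag(sigma[:k]))
    print(item_features.shape)                     # (4, 2): 4 items, 2 latent features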

 

The code implementation follows. SVD can also be used for dimensionality reduction: the idea is to compare the singular values and retain only the largest ones, which account for most of the information in the data; a common heuristic for choosing how many to keep is sketched below.
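A common rule of thumb is to keep just enough singular values to cover roughly 90% of the total "energy", i.e. the sum of the squared singular values. The helper below and its name are mine, not part of the book's code:

    import numpy as np

    def num_singular_values(data, energy=0.9):
        """Smallest k such that the first k squared singular values
        cover at least `energy` of the total."""
        sigma = np.linalg.svd(data, compute_uv=False)
        sig2 = sigma ** 2
        return int(np.searchsorted(np.cumsum(sig2), energy * sig2.sum()) + 1)

Applied to the data returned by loadExData2() below, this kind of check is what justifies keeping only a small, fixed number of singular values; svdEst in the listing simply hardcodes 4.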

 

# -*- coding: cp936 -*-
'''
Created on Mar 8, 2011
@author: Peter
'''
from numpy import *
from numpy import linalg as la   # alias for the linear algebra module

# SVD is introduced here in the context of a recommendation system,
# so the data can be read as users' ratings of items (0 = not rated).
def loadExData():
    return [[0, 0, 0, 2, 2],
            [0, 0, 0, 3, 3],
            [0, 0, 0, 1, 1],
            [1, 1, 1, 0, 0],
            [2, 2, 2, 0, 0],
            [5, 5, 5, 0, 0],
            [1, 1, 1, 0, 0]]

def loadExData2():
    return [[0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 5],
            [0, 0, 0, 3, 0, 4, 0, 0, 0, 0, 3],
            [0, 0, 0, 0, 4, 0, 0, 1, 0, 4, 0],
            [3, 3, 4, 0, 0, 0, 0, 2, 2, 0, 0],
            [5, 4, 5, 0, 0, 0, 0, 5, 5, 0, 0],
            [0, 0, 0, 0, 5, 0, 1, 0, 0, 5, 0],
            [4, 3, 4, 0, 0, 0, 0, 5, 5, 0, 1],
            [0, 0, 0, 4, 0, 4, 0, 0, 0, 0, 4],
            [0, 0, 0, 2, 0, 2, 5, 0, 0, 1, 2],
            [0, 0, 0, 0, 5, 0, 0, 0, 0, 4, 0],
            [1, 0, 0, 0, 0, 0, 0, 1, 2, 0, 0]]

def ecludSim(inA, inB):
    # 2-norm of the difference vector, i.e. the Euclidean distance
    return 1.0 / (1.0 + la.norm(inA - inB))

def pearsSim(inA, inB):
    if len(inA) < 3:
        return 1.0
    # corrcoef computes the Pearson correlation coefficient directly
    return 0.5 + 0.5 * corrcoef(inA, inB, rowvar=0)[0][1]

def cosSim(inA, inB):
    # cosine similarity, rescaled from [-1, 1] to [0, 1]
    num = float(inA.T * inB)
    denom = la.norm(inA) * la.norm(inB)
    return 0.5 + 0.5 * (num / denom)

# Collaborative filtering: estimate the given user's rating for the given item.
# dataMat - user/item rating matrix, user - user index,
# simMeas - similarity measure, item - index of the item to score.
def standEst(dataMat, user, simMeas, item):
    n = shape(dataMat)[1]            # number of columns, i.e. number of items
    simTotal = 0.0; ratSimTotal = 0.0
    for j in range(n):
        userRating = dataMat[user, j]
        if userRating == 0:
            continue                 # the user has not rated item j
        # users who have rated both item `item` and item j
        overLap = nonzero(logical_and(dataMat[:, item].A > 0,
                                      dataMat[:, j].A > 0))[0]
        if len(overLap) == 0:
            similarity = 0
        else:
            # similarity between the two item columns, restricted to those users
            similarity = simMeas(dataMat[overLap, item], dataMat[overLap, j])
        print('the %d and %d similarity is: %f' % (item, j, similarity))
        simTotal += similarity
        ratSimTotal += similarity * userRating   # weight the rating by the similarity
    if simTotal == 0:
        return 0
    else:
        return ratSimTotal / simTotal

# Same estimate, but the items are first mapped into a low-dimensional space
# using the SVD from the library. (Implementing SVD by hand follows the usual
# procedure from matrix theory, but solving for the eigenvalues is painful.)
def svdEst(dataMat, user, simMeas, item):
    n = shape(dataMat)[1]
    simTotal = 0.0; ratSimTotal = 0.0
    U, Sigma, VT = la.svd(dataMat)                 # decompose
    Sig4 = mat(eye(4) * Sigma[:4])                 # arrange Sig4 as a diagonal matrix
    xformedItems = dataMat.T * U[:, :4] * Sig4.I   # items in the transformed space
    for j in range(n):
        userRating = dataMat[user, j]
        if userRating == 0 or j == item:
            continue
        similarity = simMeas(xformedItems[item, :].T,
                             xformedItems[j, :].T)
        print('the %d and %d similarity is: %f' % (item, j, similarity))
        simTotal += similarity
        ratSimTotal += similarity * userRating
    if simTotal == 0:
        return 0
    else:
        return ratSimTotal / simTotal

# The actual recommendation function; the last two arguments select the
# similarity measure and the estimation method.
def recommend(dataMat, user, N=3, simMeas=cosSim, estMethod=standEst):
    # nonzero(...)[1] gives the column indices of the items the user has not rated
    unratedItems = nonzero(dataMat[user, :].A == 0)[1]
    if len(unratedItems) == 0:
        return 'you rated everything'
    itemScores = []
    for item in unratedItems:
        estimatedScore = estMethod(dataMat, user, simMeas, item)
        itemScores.append((item, estimatedScore))
    return sorted(itemScores, key=lambda jj: jj[1], reverse=True)[:N]

# Extension example: using SVD for image compression.
# Print a 32x32 image as 0/1 pixels.
def printMat(inMat, thresh=0.8):
    for i in range(32):
        for k in range(32):
            if float(inMat[i, k]) > thresh:
                print(1, end=' ')
            else:
                print(0, end=' ')
        print('')

# Reconstruct the image from numSV singular values and compare with the original.
def imgCompress(numSV=3, thresh=0.8):
    myl = []
    for line in open('0_5.txt').readlines():
        newRow = []
        for i in range(32):
            newRow.append(int(line[i]))
        myl.append(newRow)
    myMat = mat(myl)                        # read the data into myMat
    print("****original matrix******")
    printMat(myMat, thresh)
    U, Sigma, VT = la.svd(myMat)
    SigRecon = mat(zeros((numSV, numSV)))   # empty numSV x numSV matrix
    for k in range(numSV):                  # build the diagonal matrix from the vector
        SigRecon[k, k] = Sigma[k]
    reconMat = U[:, :numSV] * SigRecon * VT[:numSV, :]
    print("****reconstructed matrix using %d singular values******" % numSV)
    printMat(reconMat, thresh)
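A possible way to exercise the functions above, following the book's example session (the variable name myMat is just a local choice):

    myMat = mat(loadExData2())
    print(recommend(myMat, 1, estMethod=svdEst))                    # SVD-based estimates
    print(recommend(myMat, 1, estMethod=svdEst, simMeas=pearsSim))  # with Pearson similarity
    print(recommend(myMat, 2))                                      # plain item-based estimates
    # imgCompress(2)  # requires the 32x32 image file '0_5.txt' in the working directory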

Running imgCompress shows that the image reconstructed from only a few singular values is very close to the original.

 

 
