Main contents of this section:
- IndexedRowMatrix
- BlockMatrix
1. Use of IndexedRowMatrix
IndexedRowMatrix, as the name implies, is a RowMatrix whose rows carry an index. Each row is represented by the case class IndexedRow(index: Long, vector: Vector), where index is the row index and vector holds the values stored in that row. It is used as follows:
package cn.ml.datastruct

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
import org.apache.spark.mllib.stat.MultivariateStatisticalSummary
import org.apache.spark.mllib.linalg.Matrix
import org.apache.spark.mllib.linalg.SingularValueDecomposition
import org.apache.spark.mllib.linalg.Matrices
import org.apache.spark.mllib.linalg.distributed.IndexedRow
import org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix

object IndexRowMatrixDemo extends App {
  val sparkConf = new SparkConf().setAppName("IndexRowMatrixDemo").setMaster("spark://sparkmaster:7077")
  val sc = new SparkContext(sparkConf)

  // Implicit conversion so that a Double can be used where a Long is expected
  implicit def double2long(x: Double): Long = x.toLong

  // The first element of each array is the index of the IndexedRow;
  // the remaining elements become the row vector.
  // f.take(1)(0) gets the first element, which is implicitly converted to Long.
  val rdd1 = sc.parallelize(
    Array(
      Array(1.0, 2.0, 3.0, 4.0),
      Array(2.0, 3.0, 4.0, 5.0),
      Array(3.0, 4.0, 5.0, 6.0)
    )
  ).map(f => IndexedRow(f.take(1)(0), Vectors.dense(f.drop(1))))

  val indexRowMatrix = new IndexedRowMatrix(rdd1)

  // Compute the Gramian matrix
  val gramianMatrix: Matrix = indexRowMatrix.computeGramianMatrix()

  // Convert to a RowMatrix (the row indices are dropped)
  val rowMatrix: RowMatrix = indexRowMatrix.toRowMatrix()

  // Other methods, such as computeSVD (singular value decomposition) and
  // multiply (matrix multiplication), are used in the same way as on RowMatrix.
}
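As a worked check on the listing above (a sketch, assuming A denotes the 3 x 3 matrix formed by the three row vectors once the leading index value of each array has been removed), the Gramian matrix that computeGramianMatrix() returns is the product of the transpose of A with A:

```latex
A = \begin{pmatrix} 2 & 3 & 4 \\ 3 & 4 & 5 \\ 4 & 5 & 6 \end{pmatrix},
\qquad
G = A^{\mathsf{T}} A = \begin{pmatrix} 29 & 38 & 47 \\ 38 & 50 & 62 \\ 47 & 62 & 77 \end{pmatrix}
```

For instance, the top-left entry is 2*2 + 3*3 + 4*4 = 29. G is symmetric, and its size depends only on the number of columns of A, which is why it can be returned as a local Matrix.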
2. Use of BlockMatrix
A block matrix partitions a matrix into rectangular sub-matrices (blocks). For example, a matrix P can be divided into four blocks.
The matrix P can then be written as a 2 x 2 arrangement of those blocks.
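To make the partition concrete, here is an illustrative example. The entries are an assumption chosen for illustration and are not necessarily those of the original figures. A 4 x 4 matrix P is split into four 2 x 2 blocks:

```latex
P = \left(\begin{array}{cc|cc}
1 & 1 & 2 & 2 \\
1 & 1 & 2 & 2 \\ \hline
3 & 3 & 4 & 4 \\
3 & 3 & 4 & 4
\end{array}\right),
\qquad
P_{11} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},\;
P_{12} = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix},\;
P_{21} = \begin{pmatrix} 3 & 3 \\ 3 & 3 \end{pmatrix},\;
P_{22} = \begin{pmatrix} 4 & 4 \\ 4 & 4 \end{pmatrix}
```

```latex
P = \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix}
```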
For more on block matrices, including the transpose and the multiplication of block matrices, see https://en.wikipedia.org/wiki/Block_matrix
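As a brief sketch of those two operations, consider a matrix written as a 2 x 2 arrangement of blocks (the symbols A, B, C and their block subscripts are notation introduced here for illustration). Transposing a block matrix swaps the block positions and transposes each block:

```latex
P = \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix}
\quad\Longrightarrow\quad
P^{\mathsf{T}} = \begin{pmatrix} P_{11}^{\mathsf{T}} & P_{21}^{\mathsf{T}} \\ P_{12}^{\mathsf{T}} & P_{22}^{\mathsf{T}} \end{pmatrix}
```

Multiplication works block by block, in the same pattern as ordinary matrix multiplication, provided the column partition of A matches the row partition of B so that every product of blocks is defined:

```latex
C = AB, \qquad C_{ij} = \sum_{k} A_{ik} B_{kj}
```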
package cn.ml.datastruct

import org.apache.spark.mllib.linalg.distributed.BlockMatrix
import org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
import org.apache.spark.mllib.linalg.distributed.MatrixEntry
import org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.distributed.IndexedRow
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.SparkConf

object BlockMatrixDemo extends App {
  // Connects to a standalone master; use setMaster("local[2]") instead
  // to run locally with 2 threads.
  val sparkConf = new SparkConf().setAppName("BlockMatrixDemo").setMaster("spark://sparkmaster:7077")
  val sc = new SparkContext(sparkConf)

  // Implicit conversion so that a Double can be used where a Long is expected
  implicit def double2long(x: Double): Long = x.toLong

  val rdd1 = sc.parallelize(
    Array(
      Array(1.0, 20.0, 30.0, 40.0),
      Array(2.0, 50.0, 60.0, 70.0),
      Array(3.0, 80.0, 90.0, 100.0)
    )
  ).map(f => IndexedRow(f.take(1)(0), Vectors.dense(f.drop(1))))

  val indexRowMatrix = new IndexedRowMatrix(rdd1)

  // Convert the IndexedRowMatrix to a BlockMatrix, specifying the number of
  // rows and columns per block
  val blockMatrix: BlockMatrix = indexRowMatrix.toBlockMatrix(2, 2)

  // Printed content after execution:
  // Index:(0,0)MatrixContent:2 x 2 CSCMatrix
  // (1,0) 20.0
  // (1,1) 30.0
  // Index:(1,1)MatrixContent:2 x 1 CSCMatrix
  // (0,0) 70.0
  // (1,0) 100.0
  // Index:(1,0)MatrixContent:2 x 2 CSCMatrix
  // (0,0) 50.0
  // (1,0) 80.0
  // (0,1) 60.0
  // (1,1) 90.0
  // Index:(0,1)MatrixContent:2 x 1 CSCMatrix
  // (1,0) 40.0
  // As the output shows, each block is stored as a sparse matrix in CSC format.
  blockMatrix.blocks.foreach(f => println("Index:" + f._1 + "MatrixContent:" + f._2))

  // Convert to a local matrix:
  // 0.0   0.0   0.0
  // 20.0  30.0  40.0
  // 50.0  60.0  70.0
  // 80.0  90.0  100.0
  // Note the all-zero first row: the row indices in the data start at 1,
  // so row 0 has no entries and is filled with zeros.
  blockMatrix.toLocalMatrix()

  // Block matrix addition
  blockMatrix.add(blockMatrix)

  // Block matrix multiplication: blockMatrix * blockMatrix^T (T means transpose)
  blockMatrix.multiply(blockMatrix.transpose)

  // Convert to CoordinateMatrix
  blockMatrix.toCoordinateMatrix()

  // Convert to IndexedRowMatrix
  blockMatrix.toIndexedRowMatrix()

  // Validate that the block matrix is well-formed
  blockMatrix.validate()
}
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
Machine Learning on Spark -- Section II: Basic Data Structures (II)