Scipy-sparse Module

Last Update:2014-12-06 Source: Internet

Author: User

Tags intel mkl

http://blog.csdn.net/pipisorry/article/details/41762945

I. Sparse Matrix Storage Formats

In a sparse matrix most elements are zero, so storing only the nonzero elements makes matrix operations more efficient.

There are many ways to store sparse matrices, but most use the same basic technique: store all nonzero elements of the matrix in a linear array, and provide auxiliary arrays that describe the positions of those elements in the original matrix.

1. Coordinate Format (COO)

The main advantages of this storage method are flexibility and simplicity. It stores only the nonzero elements and the coordinates of each one.

It uses three arrays: values, rows, and columns.

values: Real or complex data containing the nonzero elements of the matrix, in arbitrary order.
rows: The row in which each element is located.
columns: The column in which each element is located.

Parameter: nnz, the number of nonzero elements in the matrix. All three arrays have length nnz.
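As a rough illustration of the three-array layout, here is a minimal sketch using SciPy's coo_matrix. Mapping values, rows, and columns onto SciPy's constructor arguments is an assumption of this example, and SciPy indices are zero-based.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Dense matrix being encoded:
# [[4, 0, 0],
#  [0, 0, 5],
#  [7, 0, 9]]
values = np.array([9, 4, 7, 5])    # nonzero elements, in arbitrary order
rows = np.array([2, 0, 2, 1])      # row index of each value (zero-based)
columns = np.array([2, 0, 0, 2])   # column index of each value (zero-based)

a = coo_matrix((values, (rows, columns)), shape=(3, 3))

nnz = a.nnz          # number of stored nonzero elements
dense = a.toarray()  # expand to a dense array to check the result
```

All three input arrays have length nnz, matching the parameter described above.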

2. Diagonal Storage Format (DIA)

If the sparse matrix has diagonals containing only zero elements, then the diagonal storage format can be used to reduce the amount of information needed to locate the nonzero elements. This storage format is particularly useful in many applications where the matrix arises from a finite element or finite difference discretization.

The Intel MKL diagonal storage format is specified by two arrays, values and distance, and two parameters: ndiag, which is the number of non-empty diagonals, and lval, which is the declared leading dimension in the calling (sub)programs.

values: A real or complex two-dimensional array dimensioned as lval by ndiag. Each column of it contains the nonzero elements of a certain diagonal of A. The key point of the storage is that each element in values retains the row number of the original matrix. To achieve this, diagonals in the lower triangular part of the matrix are padded from the top, and those in the upper triangular part are padded from the bottom. Note that the value of distance(i) is the number of elements to be padded for diagonal i.

distance: An integer array with dimension ndiag. Element i of the array distance is the distance between the i-th diagonal and the main diagonal. The distance is positive if the diagonal is above the main diagonal, and negative if the diagonal is below the main diagonal. The main diagonal has a distance equal to zero.
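The following sketch shows the same idea with SciPy's dia_matrix. Note that SciPy's layout is not the Intel MKL layout described above: SciPy stores one diagonal per row of its data array, aligns entries by column index, and calls the distance array offsets. The example values are made up for illustration.

```python
import numpy as np
from scipy.sparse import dia_matrix

# One row of `data` per stored diagonal. SciPy aligns entries by column:
# data[k, j] is the element in column j on the diagonal with offset offsets[k].
data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 0],    # last slot lies outside the matrix (padding)
                 [0, 8, 9, 10]])  # first slot lies outside the matrix (padding)
offsets = np.array([0, -1, 1])    # 0 = main diagonal, negative = below, positive = above

a = dia_matrix((data, offsets), shape=(4, 4))
dense = a.toarray()
# dense is:
# [[ 1,  8,  0,  0],
#  [ 5,  2,  9,  0],
#  [ 0,  6,  3, 10],
#  [ 0,  0,  7,  4]]
```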

3. Compressed Sparse Row Format (CSR)

The Intel MKL compressed sparse row (CSR) format is specified by four arrays: values, columns, pointerB, and pointerE. The arrays are described below in terms of the values, row, and column positions of the nonzero elements in a sparse matrix A.

values: A real or complex array that contains the nonzero elements of A. Values of the nonzero elements of A are mapped into the values array using row-major storage mapping, that is, row by row.

columns: Element i of the integer array columns is the number of the column in A that contains the i-th value in the values array.

pointerB: Element j of this integer array gives the index of the element in the values array that is the first nonzero element in row j of A. Note that this index is equal to pointerB(j) - pointerB(1) + 1.

pointerE: An integer array that contains row indices, such that pointerE(j) - pointerB(1) is the index of the element in the values array that is the last nonzero element in row j of A.
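A minimal sketch of these arrays using SciPy's csr_matrix. SciPy keeps a single indptr array of length rows + 1 instead of separate pointerB and pointerE arrays. Splitting it into pointer_b and pointer_e below is an assumption made for illustration, and all indices are zero-based rather than one-based.

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[1, 0, 2],
                  [0, 0, 3],
                  [4, 5, 6]])
a = csr_matrix(dense)

values = a.data            # nonzero elements in row-major order
columns = a.indices        # column index of each value (zero-based)
pointer_b = a.indptr[:-1]  # position in `values` where each row starts
pointer_e = a.indptr[1:]   # position one past the end of each row

# Recover row 2 from the arrays alone.
j = 2
row_values = values[pointer_b[j]:pointer_e[j]]
row_columns = columns[pointer_b[j]:pointer_e[j]]
```

Because each row is a contiguous slice of values, row access and matrix-vector products only need to walk the pointer arrays.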

4. Compressed Sparse Column Format (CSC)

The compressed sparse column (CSC) format is similar to the CSR format, but columns are used instead of rows. In other words, the CSC format is identical to the CSR format for the transposed matrix. The CSC format is specified by four arrays: values, rows, pointerB, and pointerE. The arrays are described below in terms of the values, row, and column positions of the nonzero elements in a sparse matrix A.

values: A real or complex array that contains the nonzero elements of A. Values of the nonzero elements of A are mapped into the values array using column-major storage mapping.

rows: Element i of the integer array rows is the number of the row in A that contains the i-th value in the values array.

pointerB: Element j of this integer array gives the index of the element in the values array that is the first nonzero element in column j of A. Note that this index is equal to pointerB(j) - pointerB(1) + 1.

pointerE: An integer array that contains column indices, such that pointerE(j) - pointerB(1) is the index of the element in the values array that is the last nonzero element in column j of A.
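The statement that CSC is the CSR format of the transposed matrix can be checked with a small SciPy sketch. As with CSR, SciPy uses one zero-based indptr array in place of pointerB and pointerE. The matrix values are made up for illustration.

```python
import numpy as np
from scipy.sparse import csc_matrix, csr_matrix

dense = np.array([[1, 0, 4],
                  [0, 0, 5],
                  [2, 3, 6]])

csc = csc_matrix(dense)      # column-major storage of A
csr_t = csr_matrix(dense.T)  # row-major storage of the transpose of A

values = csc.data            # nonzero elements, column by column
rows = csc.indices           # row index of each value (zero-based)
col_pointers = csc.indptr    # start of each column in `values`, plus an end marker

# The three arrays of CSC(A) coincide with those of CSR(A transposed).
same_layout = (np.array_equal(csc.data, csr_t.data)
               and np.array_equal(csc.indices, csr_t.indices)
               and np.array_equal(csc.indptr, csr_t.indptr))
```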

5. Skyline Storage Format

The skyline storage format is important for direct sparse solvers, and it is well suited for Cholesky or LU decomposition when no pivoting is required.

The skyline storage format accepted in Intel MKL can store only a triangular matrix or the triangular part of a matrix. This format is specified by two arrays: values and pointers. These arrays are described below:

values: A scalar array. For a lower triangular matrix it contains the set of elements from each row of the matrix, starting from the first nonzero element up to and including the diagonal element. For an upper triangular matrix it contains the set of elements from each column of the matrix, starting with the first nonzero element down to and including the diagonal element. Zero elements encountered along the way are included in the sets.

pointers: An integer array with dimension (m+1), where m is the number of rows for a lower triangle (columns for an upper triangle). pointers(i) - pointers(1) + 1 gives the index of the element in values that is the first nonzero element in row (column) i. The value of pointers(m+1) is set to nnz + pointers(1), where nnz is the number of elements in the array values.
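SciPy has no skyline class, so the sketch below packs the lower triangle by hand with NumPy. The helper name to_skyline_lower is hypothetical, the pointers are one-based to match the description above, and the handling of an all-zero row (storing only its diagonal slot) is an assumption of this example.

```python
import numpy as np

def to_skyline_lower(a):
    """Pack the lower triangle of a square array into skyline (values, pointers).

    Hypothetical helper for illustration; pointers are one-based.
    """
    m = a.shape[0]
    values = []
    pointers = [1]
    for i in range(m):
        row = a[i, :i + 1]                   # row i up to and including the diagonal
        nonzero_cols = np.flatnonzero(row)
        # Assumption: a row with no nonzero entries stores only its diagonal slot.
        first = nonzero_cols[0] if nonzero_cols.size else i
        values.extend(row[first:].tolist())  # zeros inside the span are kept
        pointers.append(pointers[-1] + (i + 1 - first))
    return np.array(values), np.array(pointers)

a = np.array([[1, 0, 0, 0],
              [2, 3, 0, 0],
              [0, 0, 4, 0],
              [5, 0, 6, 7]])
values, pointers = to_skyline_lower(a)
# values   -> [1, 2, 3, 4, 5, 0, 6, 7]  (the zero between 5 and 6 is stored)
# pointers -> [1, 2, 4, 5, 9]           (last entry equals nnz + pointers[0])
```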

6. Block Compressed Sparse Row Format (BSR)

The Intel MKL block compressed sparse row (BSR) format for sparse matrices is specified by four arrays: values, columns, pointerB, and pointerE. These arrays are described below.

values: A real array that contains the elements of the nonzero blocks of a sparse matrix. The elements are stored block by block in row-major order. A nonzero block is a block that contains at least one nonzero element. All elements of nonzero blocks are stored, even if some of them are equal to zero. Within each nonzero block, elements are stored in column-major order in the case of one-based indexing, and in row-major order in the case of zero-based indexing.

columns: Element i of the integer array columns is the number of the column in the block matrix that contains the i-th nonzero block.

pointerB: Element j of this integer array gives the index of the element in the columns array that is the first nonzero block in row j of the block matrix.

pointerE: Element j of this integer array gives the index of the element in the columns array that contains the last nonzero block in row j of the block matrix, plus 1.
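A small sketch of block storage using SciPy's bsr_matrix with 2-by-2 blocks. SciPy uses zero-based indices and a single indptr array, so the names below are only an approximate mapping onto the arrays described above. The matrix is made up for illustration.

```python
import numpy as np
from scipy.sparse import bsr_matrix

dense = np.array([[1, 2, 0, 0],
                  [0, 3, 0, 0],
                  [0, 0, 4, 5],
                  [0, 0, 6, 7]])

a = bsr_matrix(dense, blocksize=(2, 2))

blocks = a.data            # shape (number of nonzero blocks, 2, 2)
block_columns = a.indices  # block-column index of each stored block
block_pointers = a.indptr  # start of each block row in `block_columns`, plus an end marker

# The first block is stored in full, including its zero entry:
# blocks[0] -> [[1, 2],
#               [0, 3]]
```

Only two of the four 2-by-2 blocks contain a nonzero element, so only those two are stored.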

7. Ellpack (ELL)

8. Hybrid (HYB)

It combines two formats: ELL and COO.

II. Some experience in choosing a sparse matrix storage format

1. The DIA and ELL formats are the most efficient for sparse matrix-vector products, so they are the fastest formats for solving sparse linear systems with iterative methods such as the conjugate gradient method;

2. The COO and CSR formats are more flexible and easier to work with than DIA and ELL;

3. ELL's strength is speed and COO's is flexibility, so HYB, which combines the two, is a good sparse matrix representation;

4. According to Nathan Bell's work:

The CSR format has the most stable average number of bytes per nonzero entry when storing sparse matrices (about 8.5 for float and about 12.5 for double);

For the DIA format, the average number of bytes per nonzero entry depends strongly on the matrix type. DIA suits sparse matrices from structured meshes (about 4.05 for float and about 8.10 for double);

For unstructured meshes and random matrices, the DIA format uses more than ten times as many bytes as the CSR format;

5. In some linear algebra libraries, the COO format is commonly used to read and write sparse matrices from files (Matrix Market, for example, uses COO), while the CSR format is commonly used for sparse matrix computation once the data has been read.
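The storage comparison in item 4 can be explored roughly with SciPy. The sketch below is not a reproduction of Nathan Bell's measurements: SciPy's DIA layout and index types differ, the matrices are tiny, and the two helper functions are hypothetical names introduced here. It only illustrates that DIA is compact for a banded (structured) matrix and wasteful for randomly scattered nonzeros.

```python
import numpy as np
from scipy.sparse import coo_matrix, diags

def csr_bytes_per_nonzero(m):
    """Average storage per nonzero for CSR: values, column indices, row pointers."""
    return (m.data.nbytes + m.indices.nbytes + m.indptr.nbytes) / m.nnz

def dia_bytes_per_nonzero(m, nnz):
    """Average storage per nonzero for DIA: padded diagonals plus offsets."""
    return (m.data.nbytes + m.offsets.nbytes) / nnz

n = 50

# Structured case: a tridiagonal matrix, as produced by a 1-D finite difference stencil.
structured = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format="csr")

# Unstructured case: nonzeros scattered at random positions (duplicates are summed).
rng = np.random.default_rng(0)
k = 100
random_m = coo_matrix(
    (np.ones(k), (rng.integers(0, n, k), rng.integers(0, n, k))), shape=(n, n)
).tocsr()

results = {}
for name, csr in [("structured", structured), ("random", random_m)]:
    results[name] = (csr_bytes_per_nonzero(csr),
                     dia_bytes_per_nonzero(csr.todia(), csr.nnz))
    print(name, "CSR: %.1f" % results[name][0], "DIA: %.1f" % results[name][1])
```

In the random case almost every nonzero lies on its own diagonal, and DIA must store each of those diagonals in full.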

"Sparse Matrix Representations & Iterative Solvers, Lesson 1" by Nathan Bell

"Sparse Linear Systems"

"Sparse matrix formats for use in Intel MKL libraries"

III. The different sparse matrix storage formats correspond to the following classes in the sparse module:

bsr_matrix(arg1[, shape, dtype, copy, blocksize]): Block Sparse Row matrix
coo_matrix(arg1[, shape, dtype, copy]): A sparse matrix in COOrdinate format
csc_matrix(arg1[, shape, dtype, copy]): Compressed Sparse Column matrix
csr_matrix(arg1[, shape, dtype, copy]): Compressed Sparse Row matrix
dia_matrix(arg1[, shape, dtype, copy]): Sparse matrix with DIAgonal storage
dok_matrix(arg1[, shape, dtype, copy]): Dictionary Of Keys based sparse matrix
lil_matrix(arg1[, shape, dtype, copy]): Row-based linked list sparse matrix

IV. Sparse matrix operations

Converting an ordinary dense matrix into a sparse matrix in a given storage format

Taking coo_matrix as an example:

1. Convert a dense matrix directly:

>>> import numpy as np
>>> from scipy.sparse import coo_matrix
>>> A = coo_matrix([[1, 2], [3, 4]])

2. Build the matrix from the arrays that the storage format requires:

>>> row = np.array([0, 0, 1, 3, 1, 0, 0])
>>> col = np.array([0, 2, 1, 3, 1, 0, 0])
>>> data = np.array([1, 1, 1, 1, 1, 1, 1])
>>> coo_matrix((data, (row, col)), shape=(4, 4)).todense()
matrix([[3, 0, 1, 0],
        [0, 2, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]])

Entries that share the same coordinates are summed, which is why position (0, 0) holds 3.

Merging sparse matrices horizontally or vertically

>>> from scipy.sparse import coo_matrix, vstack
>>> a = coo_matrix([[1, 2], [3, 4]])
>>> b = coo_matrix([[5, 6]])
>>> vstack([a, b]).todense()
matrix([[1, 2],
        [3, 4],
        [5, 6]])

If a and b do not match in form they cannot be merged. For vstack the column counts must be the same, and for hstack the row counts must be the same.

The diags function builds sparse diagonal matrices.
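A short sketch of diags building a tridiagonal matrix. The values are arbitrary and chosen only for illustration.

```python
from scipy.sparse import diags

# Scalars are broadcast along each diagonal: -2 on the main diagonal,
# 1 on the diagonals directly below (offset -1) and above (offset 1).
a = diags([1, -2, 1], [-1, 0, 1], shape=(4, 4))
dense = a.toarray()
# dense is:
# [[-2.,  1.,  0.,  0.],
#  [ 1., -2.,  1.,  0.],
#  [ 0.,  1., -2.,  1.],
#  [ 0.,  0.,  1., -2.]]
```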
Sparse matrices in most storage formats (apparently all except COO) support slicing, for example CSC and CSR. They also support arithmetic, such as addition and subtraction of matrices, and it is fast. To take specific columns of a matrix, for example columns 1, 3, and 8: matrix[:, [0, 2, 7]]

Reading a sparse matrix: elements can be read by index, as with an ordinary matrix. Specific rows or columns can also be read with getrow(i) and getcol(i), and nonzero() returns the positions of the nonzero elements.
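The operations above can be sketched on a small CSR matrix as follows. The matrix contents are made up for illustration, and indices are zero-based, so columns 1, 3, and 8 are selected with [0, 2, 7].

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[1, 0, 2, 0, 0, 0, 0, 3],
                  [0, 0, 0, 4, 0, 0, 0, 0],
                  [5, 0, 6, 0, 0, 0, 0, 7]])
m = csr_matrix(dense)

element = m[0, 2]                  # read a single element by index
cols = m[:, [0, 2, 7]].toarray()   # columns 1, 3 and 8 of the matrix
row1 = m.getrow(1).toarray()       # row 1 as a 1 x 8 array
col3 = m.getcol(3).toarray()       # column 3 as a 3 x 1 array
r, c = m.nonzero()                 # row and column indices of the nonzero elements

doubled = (m + m).toarray()        # sparse addition
difference = (m - m).toarray()     # sparse subtraction
```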

Official document of the sparse module: http://docs.scipy.org/doc/scipy/reference/sparse.html

Ref

http://blog.sina.com.cn/s/blog_6a90ae320101aavg.html
