Storage format for sparse matrices

Source: Internet
Author: User
Tags intel mkl

This article is reposted from a blog post.

More detailed information can be downloaded from Baidu Cloud.

For sparse matrices with many zero elements, storing only the non-zero elements makes matrix operations more efficient. There are many ways to store sparse matrices, but most use the same basic technique: store all non-zero elements of the matrix in a linear array, and provide auxiliary arrays that describe the positions of the non-zero elements in the original matrix.

Here are a few common sparse matrix storage formats:

1. Coordinate Format (COO)

  

The main advantages of this storage method are flexibility and simplicity. It stores only the non-zero elements and the coordinates of each non-zero element.

Storage uses three arrays: values, rows, and columns.

  values: Real or complex data containing the non-zero elements of the matrix, in arbitrary order.
  rows: The row in which each element is located.
  columns: The column in which each element is located.

This can be represented by a struct.
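As a minimal sketch of such a struct, the Python dataclass below holds the three arrays plus the matrix shape. The names `CooMatrix`, `coo_from_dense`, and `coo_matvec` are illustrative, and zero-based indices are an assumption made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CooMatrix:
    """Coordinate-format sparse matrix (zero-based indices, an assumption)."""
    n_rows: int
    n_cols: int
    values: List[float]   # non-zero elements, in any order
    rows: List[int]       # row index of each stored element
    columns: List[int]    # column index of each stored element


def coo_from_dense(dense):
    """Collect every non-zero entry together with its (row, column) position."""
    values, rows, columns = [], [], []
    for i, row in enumerate(dense):
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                rows.append(i)
                columns.append(j)
    return CooMatrix(len(dense), len(dense[0]), values, rows, columns)


def coo_matvec(a, x):
    """Compute y = A x by adding each stored entry into its row of y."""
    y = [0.0] * a.n_rows
    for v, i, j in zip(a.values, a.rows, a.columns):
        y[i] += v * x[j]
    return y


dense = [
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
    [4.0, 0.0, 5.0],
]
coo = coo_from_dense(dense)
y = coo_matvec(coo, [1.0, 1.0, 1.0])
```

Because the order of the entries is arbitrary, appending a new non-zero element only requires appending to the three arrays, which is where the flexibility of the format comes from.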

2. Diagonal Storage Format (DIA)

  

If the sparse matrix has diagonals containing only zero elements, then the diagonal storage format can be used to reduce the amount of information needed to locate the non-zero elements. This storage format is particularly useful in many applications where the matrix arises from a finite element or finite difference discretization.

The Intel MKL diagonal storage format is specified by two arrays, values and distance, and two parameters: ndiag, which is the number of non-empty diagonals, and lval, which is the declared leading dimension in the calling (sub)programs.

   values

A real or complex two-dimensional array dimensioned as lval by ndiag. Each column of it contains the non-zero elements of a certain diagonal of A. The key point of the storage is that each element in values retains the row number of the original matrix. To achieve this, diagonals in the lower triangular part of the matrix are padded from the top, and those in the upper triangular part are padded from the bottom. Note that the absolute value of distance(i) is the number of elements to be padded for diagonal i.

   distance

An integer array with dimension ndiag. Element i of the array distance is the distance between the i-th diagonal and the main diagonal. The distance is positive if the diagonal is above the main diagonal, and negative if the diagonal is below the main diagonal. The main diagonal has a distance equal to zero.
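The sketch below builds the values and distance arrays for a small square matrix and uses them in a matrix-vector product. It is an illustration under stated assumptions, not the Intel MKL routine: indices are zero-based, lval is taken to be the matrix order, and the function names `dia_from_dense` and `dia_matvec` are made up for this example.

```python
def dia_from_dense(dense):
    """Build (values, distance) for a square dense matrix given as a list of rows."""
    n = len(dense)
    # Offsets d = column - row of every diagonal that holds a non-zero element.
    distance = sorted({j - i for i in range(n) for j in range(n) if dense[i][j] != 0})
    # values[i][k] keeps the original row number i: it holds A[i][i + distance[k]],
    # or 0.0 padding where that column would fall outside the matrix.
    values = [
        [dense[i][i + d] if 0 <= i + d < n else 0.0 for d in distance]
        for i in range(n)
    ]
    return values, distance


def dia_matvec(values, distance, x):
    """Compute y = A x one diagonal at a time, skipping the padded positions."""
    n = len(values)
    y = [0.0] * n
    for k, d in enumerate(distance):
        for i in range(max(0, -d), min(n, n - d)):
            y[i] += values[i][k] * x[i + d]
    return y


dense = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 5.0, 0.0],
    [0.0, 6.0, 7.0, 8.0],
    [0.0, 0.0, 9.0, 10.0],
]
values, distance = dia_from_dense(dense)
y = dia_matvec(values, distance, [1.0, 1.0, 1.0, 1.0])
```

In this tridiagonal example the lower diagonal (distance -1) is padded with one element at the top and the upper diagonal (distance 1) with one element at the bottom, matching the padding rule described for values.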

3. Compressed Sparse Row Format (CSR)

  

The Intel MKL compressed sparse row (CSR) format is specified by four arrays: values, columns, pointerB, and pointerE. The arrays are described below in terms of the values, row, and column positions of the non-zero elements in a sparse matrix A.

   values

A real or complex array that contains the non-zero elements of A. Values of the non-zero elements of A are mapped into the values array using the row-major storage mapping, that is, row by row.

   columns

Element i of the integer array columns is the number of the column in A that contains the i-th value in the values array.

   pointerB

Element j of this integer array gives the index of the element in the values array that is the first non-zero element in row j of A. Note that this index is equal to pointerB(j) - pointerB(1) + 1.

   pointerE

An integer array that contains row indices, such that pointerE(j) - pointerB(1) is the index of the element in the values array that is the last non-zero element in row j of A.
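The four arrays can be illustrated with a short sketch. It assumes zero-based indexing, so `pointer_b[j]` is the position of the first stored element of row j and `pointer_e[j]` is one past the last; the one-based formulas above shift these by one. The function names are illustrative and are not Intel MKL calls.

```python
def csr_from_dense(dense):
    """Build (values, columns, pointer_b, pointer_e) from a list of rows."""
    values, columns, pointer_b, pointer_e = [], [], [], []
    for row in dense:
        pointer_b.append(len(values))      # first stored element of this row
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                columns.append(j)
        pointer_e.append(len(values))      # one past the last stored element
    return values, columns, pointer_b, pointer_e


def csr_matvec(values, columns, pointer_b, pointer_e, x):
    """Compute y = A x; each row is a dot product over its slice of values."""
    return [
        sum((values[k] * x[columns[k]] for k in range(b, e)), 0.0)
        for b, e in zip(pointer_b, pointer_e)
    ]


dense = [
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
    [4.0, 0.0, 5.0],
]
values, columns, pointer_b, pointer_e = csr_from_dense(dense)
y = csr_matvec(values, columns, pointer_b, pointer_e, [1.0, 2.0, 3.0])
```

Each row occupies a contiguous slice of values, which is why a CSR matrix-vector product reads the matrix data sequentially.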

4. Compressed Sparse Column Format (CSC)

The compressed sparse column (CSC) format is similar to the CSR format, but the columns are used instead of the rows. In other words, the CSC format is identical to the CSR format for the transposed matrix. The CSC format is specified by four arrays: values, rows, pointerB, and pointerE. The arrays are described below in terms of the values, row, and column positions of the non-zero elements in a sparse matrix A.

   values

A real or complex array that contains the non-zero elements of A. Values of the non-zero elements of A are mapped into the values array using the column-major storage mapping.

   rows

Element i of the integer array rows is the number of the row in A that contains the i-th value in the values array.

   pointerB

Element j of this integer array gives the index of the element in the values array that is the first non-zero element in column j of A. Note that this index is equal to pointerB(j) - pointerB(1) + 1.

   pointerE

An integer array that contains column indices, such that pointerE(j) - pointerB(1) is the index of the element in the values array that is the last non-zero element in column j of A.
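The statement that CSC is the CSR format of the transposed matrix can be made concrete with a small sketch. Zero-based indexing and the helper names `compress_rows`, `csc_from_dense`, and `csc_matvec` are assumptions made for this example.

```python
def compress_rows(dense):
    """CSR-style compression: walk the matrix row by row."""
    values, indices, pointer_b, pointer_e = [], [], [], []
    for row in dense:
        pointer_b.append(len(values))
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                indices.append(j)
        pointer_e.append(len(values))
    return values, indices, pointer_b, pointer_e


def csc_from_dense(dense):
    """CSC of A, obtained as the CSR compression of the transpose of A."""
    transposed = [list(col) for col in zip(*dense)]
    return compress_rows(transposed)


def csc_matvec(values, rows, pointer_b, pointer_e, x, n_rows):
    """Compute y = A x by adding x[j] times column j into y."""
    y = [0.0] * n_rows
    for j, (b, e) in enumerate(zip(pointer_b, pointer_e)):
        for k in range(b, e):
            y[rows[k]] += values[k] * x[j]
    return y


dense = [
    [1.0, 0.0, 2.0],
    [0.0, 3.0, 0.0],
    [4.0, 0.0, 5.0],
]
values, rows, pointer_b, pointer_e = csc_from_dense(dense)
y = csc_matvec(values, rows, pointer_b, pointer_e, [1.0, 2.0, 3.0], n_rows=3)
```

The values array now lists the elements column by column, so column slices are contiguous while row access requires a search.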

5. Skyline Storage Format

The skyline storage format is important for direct sparse solvers, and it is well suited for Cholesky or LU decomposition when no pivoting is required.

The skyline storage format accepted in Intel MKL can store only a triangular matrix or the triangular part of a matrix. This format is specified by two arrays: values and pointers. These arrays are described below:

   values

A scalar array. For a lower triangular matrix it contains the set of elements from each row of the matrix, starting from the first non-zero element up to and including the diagonal element. For an upper triangular matrix it contains the set of elements from each column of the matrix, starting with the first non-zero element down to and including the diagonal element. Zero elements encountered along the way are included in the sets.

   pointers

An integer array with dimension (m+1), where m is the number of rows for the lower triangle (columns for the upper triangle). pointers(i) - pointers(1) + 1 gives the index of the element in values that is the first non-zero element in row (column) i. The value of pointers(m+1) is set to nnz + pointers(1), where nnz is the number of elements in the array values.
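A minimal sketch of the lower-triangular case follows. It assumes zero-based indexing (so `pointers[i]` is directly the position of the first stored element of row i) and uses illustrative function names; it is not the Intel MKL interface.

```python
def skyline_lower_from_dense(dense):
    """Build (values, pointers) for a lower triangular matrix given as rows."""
    values, pointers = [], []
    for i, row in enumerate(dense):
        pointers.append(len(values))
        # Column of the first non-zero element in row i (the diagonal if none).
        first = next((j for j in range(i) if row[j] != 0), i)
        # Store everything from there to the diagonal, zeros included.
        values.extend(row[first:i + 1])
    pointers.append(len(values))  # nnz + pointers[0]
    return values, pointers


def skyline_lower_get(values, pointers, i, j):
    """Return A[i][j] for the lower triangle from the skyline arrays."""
    start, end = pointers[i], pointers[i + 1]
    first = i - (end - start) + 1   # column of the first stored element in row i
    if j > i or j < first:
        return 0.0
    return values[start + (j - first)]


dense = [
    [1.0, 0.0, 0.0, 0.0],
    [2.0, 3.0, 0.0, 0.0],
    [4.0, 0.0, 5.0, 0.0],
    [0.0, 6.0, 0.0, 7.0],
]
values, pointers = skyline_lower_from_dense(dense)
a31 = skyline_lower_get(values, pointers, 3, 1)
a30 = skyline_lower_get(values, pointers, 3, 0)
```

No column indices are stored: the column of each element follows from the row length and the fact that every row ends on the diagonal.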

6. Block Compressed Sparse Row Format (BSR)

The Intel MKL block compressed sparse row (BSR) format for sparse matrices is specified by four arrays: values, columns, pointerB, and pointerE. These arrays are described below.

   values

A real array that contains the elements of the non-zero blocks of a sparse matrix. The elements are stored block-by-block in row-major order. A non-zero block is a block that contains at least one non-zero element. All elements of non-zero blocks are stored, even if some of them are equal to zero. Within each non-zero block, elements are stored in column-major order in the case of one-based indexing, and in row-major order in the case of zero-based indexing.

   columns

Element i of the integer array columns is the number of the column in the block matrix that contains the i-th non-zero block.

   pointerB

Element j of this integer array gives the index of the element in the columns array that is the first non-zero block in row j of the block matrix.

   pointerE

Element j of this integer array gives the index of the element in the columns array that contains the last non-zero block in row j of the block matrix, plus 1.
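The sketch below builds the four BSR arrays for a small matrix with square blocks. It assumes zero-based indexing, so elements inside each block are laid out in row-major order as described above, and it assumes the matrix dimensions are multiples of the block size. The name `bsr_from_dense` and the `block_size` parameter are illustrative.

```python
def bsr_from_dense(dense, block_size):
    """Build (values, columns, pointer_b, pointer_e) using square blocks."""
    bs = block_size
    n_block_rows = len(dense) // bs
    n_block_cols = len(dense[0]) // bs
    values, columns, pointer_b, pointer_e = [], [], [], []
    for bi in range(n_block_rows):
        pointer_b.append(len(columns))          # first non-zero block of this block row
        for bj in range(n_block_cols):
            # Flatten the block in row-major order (zero-based convention).
            block = [dense[bi * bs + r][bj * bs + c]
                     for r in range(bs) for c in range(bs)]
            if any(x != 0 for x in block):
                values.extend(block)            # zeros inside the block are kept
                columns.append(bj)
        pointer_e.append(len(columns))          # last non-zero block plus 1
    return values, columns, pointer_b, pointer_e


dense = [
    [1.0, 0.0, 0.0, 0.0],
    [2.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 5.0],
    [6.0, 0.0, 0.0, 7.0],
]
values, columns, pointer_b, pointer_e = bsr_from_dense(dense, block_size=2)
```

Here three of the four 2-by-2 blocks contain a non-zero element, so values holds 12 numbers while columns holds only 3 block-column indices.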

7. Ellpack (ELL)

  

8. Hybrid (HYB)

  

It is a combination of two formats: ELL + COO.
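The text does not spell out the ELL and HYB layouts, so the following is only a sketch under a commonly used convention, assumed here: ELL keeps a fixed number k of entries per row in two dense n-by-k arrays (values and column indices, padded where a row has fewer entries), and HYB places the first k entries of each row in the ELL part and any overflow in a COO list. The padding marker -1 and all function names are assumptions made for this example.

```python
def hyb_from_dense(dense, k):
    """Split a matrix into an ELL part (k entries per row) and a COO overflow."""
    ell_values, ell_columns, coo = [], [], []
    for i, row in enumerate(dense):
        entries = [(j, x) for j, x in enumerate(row) if x != 0]
        head, tail = entries[:k], entries[k:]
        pad = k - len(head)
        ell_values.append([x for _, x in head] + [0.0] * pad)
        ell_columns.append([j for j, _ in head] + [-1] * pad)  # -1 marks padding
        coo.extend((i, j, x) for j, x in tail)                 # entries beyond k
    return ell_values, ell_columns, coo


def hyb_matvec(ell_values, ell_columns, coo, x):
    """Compute y = A x: regular ELL part first, then the COO overflow."""
    y = [
        sum((v * x[j] for v, j in zip(vals, cols) if j >= 0), 0.0)
        for vals, cols in zip(ell_values, ell_columns)
    ]
    for i, j, v in coo:
        y[i] += v * x[j]
    return y


dense = [
    [1.0, 0.0, 2.0, 3.0],
    [0.0, 4.0, 0.0, 0.0],
    [5.0, 0.0, 6.0, 0.0],
    [0.0, 0.0, 0.0, 7.0],
]
ell_values, ell_columns, coo = hyb_from_dense(dense, k=2)
y = hyb_matvec(ell_values, ell_columns, coo, [1.0, 1.0, 1.0, 1.0])
```

With k = 2, only the third non-zero element of the first row spills into the COO part; choosing k near the typical row length keeps the ELL part regular without padding every row to the longest one.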

Some experience in choosing a sparse matrix storage format:

1. The DIA and ELL formats are the most efficient for sparse matrix-vector products, so they are the fastest formats for solving sparse linear systems with iterative methods such as the conjugate gradient method;

2. The COO and CSR formats are more flexible and easier to work with than DIA and ELL;

3. The advantage of ELL is speed and the advantage of COO is flexibility; the HYB format, which combines the two, is a good sparse matrix representation;

4. According to Nathan Bell's work, the CSR format has the most stable average number of bytes per non-zero entry when storing sparse matrices (about 8.5 for float, about 12.5 for double). For the DIA format, the average number of bytes per non-zero entry depends much more on the matrix type: it suits sparse matrices from structured meshes (about 4.05 for float, about 8.10 for double), while for unstructured meshes and random matrices the DIA format uses more than 10 times as many bytes as the CSR format;

5. Judging from some of the linear algebra libraries I have used, the COO format is commonly used to read and write sparse matrices from files (Matrix Market, for example, uses the COO format), and the CSR format is often used for sparse matrix computations after the data has been read.
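The bytes-per-non-zero figures in item 4 can be roughly reproduced with a back-of-the-envelope count. The sketch below is one plausible accounting, not the measurement from Nathan Bell's work: it assumes 4-byte integer indices, a CSR layout with a single pointer array of length rows + 1, and illustrative matrix sizes (about 8 non-zero elements per row for CSR, and a 5-point stencil on a 100-by-100 structured grid for DIA).

```python
INDEX_BYTES = 4  # assumed 32-bit integer indices


def csr_bytes_per_nonzero(n_rows, nnz, value_bytes):
    """One value and one column index per entry, plus n_rows + 1 row pointers."""
    total = nnz * (value_bytes + INDEX_BYTES) + (n_rows + 1) * INDEX_BYTES
    return total / nnz


def dia_bytes_per_nonzero(n_rows, n_diagonals, nnz, value_bytes):
    """A dense n_rows-by-n_diagonals value array plus one offset per diagonal."""
    total = n_rows * n_diagonals * value_bytes + n_diagonals * INDEX_BYTES
    return total / nnz


# Hypothetical CSR case: 1000 rows with 8 non-zero elements per row.
csr_float = csr_bytes_per_nonzero(1000, 8000, value_bytes=4)
csr_double = csr_bytes_per_nonzero(1000, 8000, value_bytes=8)

# Hypothetical DIA case: 5-point stencil on a 100 x 100 grid,
# 10000 rows, 5 diagonals, 49600 non-zero elements.
dia_float = dia_bytes_per_nonzero(10000, 5, 49600, value_bytes=4)
dia_double = dia_bytes_per_nonzero(10000, 5, 49600, value_bytes=8)
```

Under these assumptions the CSR cost stays close to the value size plus one index regardless of structure, while the DIA cost is low only because almost every slot of the five stored diagonals is occupied; a matrix whose non-zero elements are scattered over many diagonals would pay for all the padding.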
