Sparse matrices in Python: sample code

Source: Internet
Author: User


In engineering practice, large matrices are usually sparse, so handling sparse matrices well is very important. Using Python as the example, this article first discusses how sparse matrices are stored and represented.

1. Exploring the sparse Module

In Python, SciPy provides a submodule called sparse, which exists specifically for working with sparse matrices. Most of the content in this article is based on the sparse module.

The first step is to import the sparse module.

>>> from scipy import sparse

Then call help to get a rough overview.

>>> help(sparse)

Find out what we are most concerned about:

  Usage information
  =================

  There are seven available sparse matrix types:

    1. csc_matrix: Compressed Sparse Column format
    2. csr_matrix: Compressed Sparse Row format
    3. bsr_matrix: Block Sparse Row format
    4. lil_matrix: List of Lists format
    5. dok_matrix: Dictionary of Keys format
    6. coo_matrix: COOrdinate format (aka IJV, triplet format)
    7. dia_matrix: DIAgonal format

  To construct a matrix efficiently, use either dok_matrix or lil_matrix.
  The lil_matrix class supports basic slicing and fancy
  indexing with a similar syntax to NumPy arrays. As illustrated below,
  the COO format may also be used to efficiently construct matrices.

  To perform manipulations such as multiplication or inversion, first
  convert the matrix to either CSC or CSR format. The lil_matrix format is
  row-based, so conversion to CSR is efficient, whereas conversion to CSC
  is less so.

  All conversions among the CSR, CSC, and COO formats are efficient,
  linear-time operations.

This description gives a general picture of the sparse module: it provides seven formats for storing sparse matrices. The following sections introduce them one by one.
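As a quick illustration of the seven formats listed in the help text, the sketch below builds one small matrix and converts it to each format with `asformat`. The matrix values and the variable names are chosen here for illustration only.

```python
from scipy import sparse

# A tiny 2x2 matrix in COO format; the values are illustrative.
m = sparse.coo_matrix(([1.0, 2.0], ([0, 1], [1, 0])), shape=(2, 2))

# The seven format names from the help text above.
formats = ["csc", "csr", "bsr", "lil", "dok", "coo", "dia"]

# asformat() converts to the requested storage format.
converted = {name: m.asformat(name) for name in formats}

# Each converted object reports its own format and holds the same values.
names = [converted[name].format for name in formats]
same_values = all(
    (converted[name].toarray() == m.toarray()).all() for name in formats
)
print(names)
print(same_values)
```

Every format represents the same matrix; they differ only in how the non-zero elements are laid out in memory.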

2. coo_matrix

coo_matrix is the simplest storage format. Three arrays, row, col, and data, store the information for the non-zero elements, and all three have the same length: row stores each element's row index, col stores its column index, and data stores its value. In general, coo_matrix is mainly used to create a matrix, because it does not support adding, deleting, or modifying individual elements. Once the matrix has been created, it is usually converted to another format for further work.

>>> row = [2, 2, 3, 2]
>>> col = [3, 4, 2, 3]
>>> data = [1, 2, 3, 4]  # assumed values, consistent with the output below
>>> c = sparse.coo_matrix((data, (row, col)), shape=(5, 6))
>>> print(c.toarray())
[[0 0 0 0 0 0]
 [0 0 0 0 0 0]
 [0 0 0 5 2 0]
 [0 0 3 0 0 0]
 [0 0 0 0 0 0]]

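Note that when using coo_matrix to create a matrix, the same row and column coordinates can appear multiple times. When the matrix is converted to a dense array or to another format, the values at repeated coordinates are summed to give the final result. In the example above, the coordinate (2, 3) appears twice, and its entry in the output, 5, is the sum of the two values.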
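The following minimal sketch shows this behaviour in isolation. The coordinates and values are made up for illustration; the point is that the COO object keeps every triplet, and the repeated entries are only merged on conversion.

```python
import numpy as np
from scipy import sparse

# Two entries share the coordinate (0, 1); the values are illustrative.
row = np.array([0, 0, 1])
col = np.array([1, 1, 2])
data = np.array([10, 5, 7])

coo = sparse.coo_matrix((data, (row, col)), shape=(2, 3))

# The COO object still stores all three triplets.
stored_entries = coo.nnz

# Converting to a dense array or to CSR sums the repeated entries.
dense = coo.toarray()
csr = coo.tocsr()

print(stored_entries)  # 3
print(dense)           # (0, 1) holds 10 + 5 = 15
print(csr.nnz)         # 2
```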

3. dok_matrix and lil_matrix

dok_matrix and lil_matrix are suited to building a matrix by adding elements incrementally. dok_matrix uses a dictionary to record the non-zero elements: each key is a tuple (row, column) giving the position of an element, and the corresponding value is the element itself.

>>> import numpy as np
>>> from scipy.sparse import dok_matrix
>>> S = dok_matrix((5, 5), dtype=np.float32)
>>> for i in range(5):
...     for j in range(5):
...         S[i, j] = i + j
...
>>> print(S.toarray())
[[0. 1. 2. 3. 4.]
 [1. 2. 3. 4. 5.]
 [2. 3. 4. 5. 6.]
 [3. 4. 5. 6. 7.]
 [4. 5. 6. 7. 8.]]

lil_matrix uses two lists of lists to store the non-zero elements: data holds the non-zero values of each row, and rows holds the column indices of those values. This format is also suitable for adding elements one by one, and it gives fast access to the data of a given row.

>>> from scipy.sparse import lil_matrix
>>> l = lil_matrix((6, 5))
>>> l[2, 3] = 1
>>> l[3, 4] = 2
>>> l[3, 2] = 3
>>> print(l.toarray())
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 3. 0. 2.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
>>> print(l.data)  # per-row lists of values; exact formatting depends on the NumPy version
[[] [] [1.0] [3.0, 2.0] [] []]
>>> print(l.rows)  # per-row lists of column indices
[[] [] [3] [2, 4] [] []]

As the examples above show, these two formats are generally used to build a matrix by adding non-zero elements step by step. The result is then converted to another storage format that supports fast computation.
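A minimal sketch of that workflow, assuming a small 3x3 matrix with made-up values: the matrix is filled in LIL format, converted once to CSR, and only then used in a matrix-vector product.

```python
import numpy as np
from scipy.sparse import lil_matrix

# Build incrementally in LIL format, where element assignment is cheap.
builder = lil_matrix((3, 3))
builder[0, 0] = 1.0
builder[1, 2] = 2.0
builder[2, 1] = 3.0

# Convert once to CSR before doing arithmetic.
A = builder.tocsr()

# Sparse matrix times dense vector.
x = np.array([1.0, 1.0, 1.0])
y = A.dot(x)

print(A.format)  # csr
print(y)         # one value per row: 1.0, 2.0, 3.0
```

Converting once after construction avoids paying the cost of structural changes in CSR, which the SciPy help text describes as expensive.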

4. dia_matrix

This format stores the matrix diagonal by diagonal. In the layout described here, each column of the stored array holds one diagonal and each row corresponds to a row of the matrix. A diagonal whose elements are all 0 is not stored.

If the original matrix has a strongly diagonal (banded) structure, the compression ratio is very high.

(The original article included a picture found online here to illustrate the principle.)
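As a concrete sketch, the code below builds a 4x4 tridiagonal matrix with SciPy's dia_matrix. The values are made up for illustration. Note that in SciPy's constructor each row of the `data` array holds one diagonal, and `data[k, j]` is the element of diagonal `offsets[k]` that sits in column `j` of the matrix, so some positions of the off-diagonals are unused padding.

```python
import numpy as np
from scipy.sparse import dia_matrix

# One row of `data` per stored diagonal; values are illustrative.
data = np.array([
    [1, 2, 3, 4],    # offset  0: main diagonal
    [5, 6, 7, 0],    # offset -1: sub-diagonal (last entry is unused padding)
    [0, 8, 9, 10],   # offset +1: super-diagonal (first entry is unused padding)
])
offsets = np.array([0, -1, 1])

d = dia_matrix((data, offsets), shape=(4, 4))
dense = d.toarray()
print(dense)
# [[ 1  8  0  0]
#  [ 5  2  9  0]
#  [ 0  6  3 10]
#  [ 0  0  7  4]]
```

Only three rows of four numbers are stored for the whole matrix, which is why banded matrices compress so well in this format.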

5. csr_matrix and csc_matrix

csr_matrix stands for Compressed Sparse Row; it compresses the matrix row by row. CSR needs three arrays: the values, the column indices, and the row offsets. The values and column indices are the same as in COO, ordered row by row. The row-offset array gives, for each row, the position in the values array where that row's first element starts; it has one extra final entry equal to the total number of stored elements.

(The original article included another picture found online here to illustrate the principle.)

Let's see how to use it in Python:

>>> from scipy.sparse import csr_matrix
>>> indptr = np.array([0, 2, 3, 6])
>>> indices = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()
array([[1, 0, 2],
       [0, 0, 3],
       [4, 5, 6]])

It's not hard to understand.
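To make the role of the row offsets concrete, the sketch below decodes the same three arrays by hand with plain NumPy, without using csr_matrix. The loop and the variable names are written for illustration; this is not how SciPy implements the conversion internally.

```python
import numpy as np

# The same three arrays as in the csr_matrix example above.
indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

n_rows, n_cols = 3, 3
dense = np.zeros((n_rows, n_cols), dtype=data.dtype)

for i in range(n_rows):
    # Row i owns the slice indptr[i]:indptr[i + 1] of indices and data.
    start, end = indptr[i], indptr[i + 1]
    for j, value in zip(indices[start:end], data[start:end]):
        dense[i, j] = value

print(dense)
# [[1 0 2]
#  [0 0 3]
#  [4 5 6]]
```

Row 0 owns positions 0 to 1, row 1 owns position 2, and row 2 owns positions 3 to 5, which matches the offsets 0, 2, 3, 6.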

Let's take a look at what the documentation says.

 |  Notes
 |  -----
 |
 |  Sparse matrices can be used in arithmetic operations: they support
 |  addition, subtraction, multiplication, division, and matrix power.
 |
 |  Advantages of the CSR format
 |    - efficient arithmetic operations CSR + CSR, CSR * CSR, etc.
 |    - efficient row slicing
 |    - fast matrix vector products
 |
 |  Disadvantages of the CSR format
 |    - slow column slicing operations (consider CSC)
 |    - changes to the sparsity structure are expensive (consider LIL or DOK)

It is not hard to see that csr_matrix is well suited to actual matrix computation.

csc_matrix is analogous to csr_matrix, except that it compresses by column, so it is not described separately.
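As a brief sketch of the column-based counterpart, the code below passes the same three arrays used in the CSR example to csc_matrix. In CSC the offsets delimit columns and the indices are row indices, so the result is the transpose of the CSR matrix. The variable names are chosen here for illustration.

```python
import numpy as np
from scipy.sparse import csc_matrix, csr_matrix

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

# Same arrays, interpreted row-wise (CSR) and column-wise (CSC).
as_csr = csr_matrix((data, indices, indptr), shape=(3, 3))
as_csc = csc_matrix((data, indices, indptr), shape=(3, 3))

# The CSC interpretation is the transpose of the CSR one.
is_transpose = bool((as_csc.toarray() == as_csr.toarray().T).all())

# Column slicing is the operation CSC is designed for.
col1 = as_csr.tocsc()[:, 1].toarray().ravel()

print(as_csc.toarray())
print(is_transpose)  # True
print(col1)          # column 1 of the CSR matrix: 0, 0, 5
```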

6. bsr_matrix

Block Sparse Row format, as the name implies, compresses the matrix based on the concept of blocks.
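A minimal sketch, assuming a 4x4 matrix made of 2x2 blocks with made-up values: bsr_matrix takes the same kind of offsets and indices as CSR, but they refer to block rows and block columns, and each entry of `data` is a whole dense block.

```python
import numpy as np
from scipy.sparse import bsr_matrix

# Two dense 2x2 blocks; values are illustrative.
data = np.array([
    [[1, 2],
     [3, 4]],
    [[5, 6],
     [7, 8]],
])
indices = np.array([0, 1])    # block column of each stored block
indptr = np.array([0, 1, 2])  # block row i owns blocks indptr[i]:indptr[i + 1]

b = bsr_matrix((data, indices, indptr), shape=(4, 4))
dense = b.toarray()

print(b.blocksize)  # (2, 2)
print(dense)
# [[1 2 0 0]
#  [3 4 0 0]
#  [0 0 5 6]
#  [0 0 7 8]]
```

This layout pays off when the non-zero elements cluster into dense sub-blocks, since one index is stored per block rather than per element.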

That is all for this article. I hope it is helpful for your learning.
