Python sparse matrices: storage and conversion with scipy.sparse

Last Update:2017-06-12 Source: Internet

Author: User

Sparse matrices: scipy.sparse

from scipy import sparse

Storage form of Sparse Matrix

Large matrices often arise when solving linear models in science and engineering. When most of a matrix's elements are 0, it is called a sparse matrix. Storing such a matrix in a NumPy ndarray wastes memory; because of the sparsity, you can store only the information about the nonzero elements instead. In addition, operations written specifically for this structure can also speed up matrix calculations.
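To make the memory argument concrete, here is a minimal sketch that compares the storage used by a dense ndarray with the storage used by a sparse copy of the same matrix. The matrix size and values are made up for illustration, and CSR (csr_matrix) is used only as one example of a compact format; it is an assumption of this sketch, not a format discussed in the text.

```python
import numpy as np
from scipy import sparse

# A 1000 x 1000 matrix with only three nonzero entries (illustrative values).
dense = np.zeros((1000, 1000))
dense[2, 3] = 1.0
dense[10, 500] = 2.0
dense[999, 0] = 3.0

# CSR stores the nonzero values plus two index arrays.
csr = sparse.csr_matrix(dense)

dense_bytes = dense.nbytes
sparse_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes

print(dense_bytes)   # bytes used by the dense array
print(sparse_bytes)  # bytes used by the three CSR arrays
```

The exact numbers depend on the index dtype SciPy picks, but the sparse representation should be far smaller whenever nearly all entries are 0.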

The scipy.sparse library provides several sparse matrix formats, each suited to different uses. dok_matrix and lil_matrix are suitable for adding elements incrementally.

dok_matrix inherits from dict and uses a dictionary to store the nonzero elements of the matrix. Each dictionary key is a (row, column) tuple giving the element's position, and the corresponding value is the element's value at that position. A sparse matrix in dictionary format is therefore well suited to adding, deleting, and accessing individual elements. It is typically used to add nonzero elements incrementally and is then converted to another format that supports fast operations.

a = sparse.dok_matrix((10, 5))
a[2:5, 3] = 1.0, 2.0, 3.0
print(list(a.keys()))
print(list(a.values()))

[(2, 3), (3, 3), (4, 3)]
[1.0, 2.0, 3.0]
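The "build incrementally, then convert" workflow described for dok_matrix could look like the sketch below. The coordinates and values are invented for illustration, and the choice of CSR as the target format is an assumption; any format that supports fast arithmetic would serve the same purpose.

```python
from scipy import sparse

a = sparse.dok_matrix((10, 5))

# Add nonzero elements one at a time (coordinates and values are made up).
entries = {(2, 3): 1.0, (3, 3): 2.0, (4, 3): 3.0}
for (i, j), value in entries.items():
    a[i, j] = value

# Convert to CSR before doing arithmetic on the whole matrix.
a_csr = a.tocsr()
col_sums = a_csr.sum(axis=0)  # 1 x 5 result holding the sum of each column

print(a_csr.nnz)
print(col_sums)
```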

lil_matrix uses two lists to store the nonzero elements: data holds the nonzero values of each row, and rows holds the column indices of those values. This format is also suitable for adding elements one by one, and it provides fast access to the data of a given row.

b = sparse.lil_matrix((10, 5))
b[2, 3] = 1.0
b[3, 4] = 2.0
b[3, 2] = 3.0
print(b.data)
print(b.rows)

[[] [] [1.0] [3.0, 2.0] [] [] [] [] [] []]
[[] [] [3] [2, 4] [] [] [] [] [] []]
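As a small illustration of the row-oriented access mentioned above, the following sketch rebuilds the same matrix and pulls out a single row. It only uses row slicing and the rows and data attributes already shown; the variable names row3 and row3_dense are introduced here for illustration.

```python
from scipy import sparse

b = sparse.lil_matrix((10, 5))
b[2, 3] = 1.0
b[3, 4] = 2.0
b[3, 2] = 3.0

# Slice out row 3 as a 1 x 5 sparse matrix, then densify it for inspection.
row3 = b[3, :]
row3_dense = row3.toarray()
print(row3_dense)

# The per-row lists are kept sorted by column index, not by insertion order.
print(b.rows[3], b.data[3])
```

Note that the element at column 2 was inserted after the element at column 4, yet it appears first in both lists.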

coo_matrix uses three arrays, row, col, and data, to store the nonzero elements. The three arrays have the same length: row holds each element's row index, col holds its column index, and data holds its value. coo_matrix does not support element access, insertion, or deletion. Once created, it is mainly used as a starting point for conversion to other formats, which are better suited to element access and matrix operations.
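A minimal sketch of converting a coo_matrix before reading elements from it is shown here. The data values are made up, and CSR is chosen as the target format purely as an assumption of this example.

```python
from scipy import sparse

# Three nonzero values with their row and column indices (illustrative data).
c = sparse.coo_matrix(([1.0, 2.0, 3.0], ([2, 3, 3], [3, 4, 2])), shape=(5, 6))

# Convert to CSR, which supports element access and row slicing.
c_csr = c.tocsr()

print(c_csr[3, 4])
print(c_csr[3, :].toarray())
```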

coo_matrix allows duplicate entries, that is, the same (row, column) coordinates can appear more than once. When the matrix is converted to another format, the values that share the same coordinates are summed. In the following example, (2, 3) corresponds to two values, 1 and 10. When the matrix is converted to an ndarray, the two values are added together, so the value at coordinates (2, 3) in the final matrix is 11.

Sparse matrix data is often stored in files in this format. For example, a CSV file may have three columns: user ID, product ID, and rating. After reading the data with numpy.loadtxt or pandas.read_csv, you can quickly convert it to a sparse matrix with coo_matrix: each row of the matrix corresponds to one user, each column corresponds to one product, and the element value is the user's rating of that product.

row = [2, 3, 3, 2]
col = [3, 4, 2, 3]
data = [1, 2, 3, 10]
c = sparse.coo_matrix((data, (row, col)), shape=(5, 6))
print(c.col, c.row, c.data)
print(c.toarray())

[3 4 2 3] [2 3 3 2] [ 1  2  3 10]
[[ 0  0  0  0  0  0]
 [ 0  0  0  0  0  0]
 [ 0  0  0 11  0  0]
 [ 0  0  3  0  2  0]
 [ 0  0  0  0  0  0]]
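The CSV workflow described earlier (user ID, product ID, rating) might be sketched as follows. The file contents are invented and are read from an in-memory string rather than a real file, and the sketch assumes the IDs are already zero-based integers that can be used directly as row and column indices.

```python
import io

import numpy as np
from scipy import sparse

# Stand-in for a CSV file with columns: user ID, product ID, rating (made-up data).
csv_text = """0,1,5.0
0,3,3.0
2,1,4.0
3,0,1.0
"""

triples = np.loadtxt(io.StringIO(csv_text), delimiter=",")
users = triples[:, 0].astype(int)
products = triples[:, 1].astype(int)
ratings = triples[:, 2]

# Assumption: IDs are zero-based, so the largest ID plus one gives each dimension.
n_users = int(users.max()) + 1
n_products = int(products.max()) + 1

m = sparse.coo_matrix((ratings, (users, products)), shape=(n_users, n_products))
print(m.toarray())
```

If the real IDs are arbitrary strings or large sparse integers, they would first need to be mapped to consecutive indices, which this sketch does not attempt.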

In my own work, I chose coo_matrix because the task involved sparse matrix operations. However, without storing the data in another form, the cost in both time and space was too high: about 2 hours for a 1000x1000 matrix, which is terrible. Instead, I thought of the triple input format used by the Pajek software:

So I want to process my data into a similar triple!

That is, "matrix" -> "tuple triple" -> "sparseMatrix2tuple" -> "scipy.sparse"
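One possible reading of this pipeline is sketched below: extract (row, column, value) triples from a dense matrix, then build a scipy.sparse matrix from them. The helper names dense_to_triples and triples_to_sparse are hypothetical and are not taken from the article or from Pajek; the input matrix is made up, and the sketch assumes at least one nonzero element.

```python
import numpy as np
from scipy import sparse


def dense_to_triples(matrix):
    """Return (row, col, value) triples for the nonzero entries of a 2-D array."""
    rows, cols = np.nonzero(matrix)
    values = matrix[rows, cols]
    return list(zip(rows.tolist(), cols.tolist(), values.tolist()))


def triples_to_sparse(triples, shape):
    """Build a CSR matrix from (row, col, value) triples (hypothetical helper)."""
    rows, cols, values = zip(*triples)  # fails on an empty list of triples
    return sparse.coo_matrix((values, (rows, cols)), shape=shape).tocsr()


dense = np.array([[0, 0, 4], [0, 0, 0], [7, 0, 0]], dtype=float)
triples = dense_to_triples(dense)
s = triples_to_sparse(triples, dense.shape)

print(triples)
print(s.toarray())
```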

Thank you for reading this article. I hope it will help you. Thank you for your support for this site!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
