Recently in the analysis of a project, saw a suffix ". SQLite" Data file, because the previous did not contact, thinking how to use Python to open and carry out data analysis and processing, so a little research.
SQLite is a very popular relational database, because it is very light, so it is used by a large number of applications.
Like a CSV file, SQLite can store data in a single data file for easy sharing with other people. Many programming languages support the processing of SQLite data, and the Python language is no exception.
Sqlite3 is a standard library of Python that can be used to process SQLite databases.
Creating and manipulating database files with Sqlite3
For database SQL statements, this article will use the most basic SQL statements, should not affect reading. For further information, refer to the following Web site:
Http://www.w3school.com.cn/sql/index.asp
Next, let's apply the Salite3 module to create SQLite data files, as well as data read and write operations. The main steps are as follows:
- Establish a connection to the database, create a database file (. sqlite file)
- Creating Cursors (Cursor)
- Create a data table (table)
- Inserting data into a data table
- Querying data
The demo code is as follows:
import sqlite3 with sqlite3.connect(‘test_database.sqlite‘) as con:
c = con.cursor()
c.execute(‘‘‘CREATE TABLE test_table
(date text, city text, value real)‘‘‘) for table in c.execute("SELECT name FROM sqlite_master WHERE type=‘table‘"):
print("Table", table[0])
c.execute(‘‘‘INSERT INTO test_table VALUES
(‘2017-6-25‘, ‘bj‘, 100)‘‘‘)
c.execute(‘‘‘INSERT INTO test_table VALUES
(‘2017-6-25‘, ‘pydataroad‘, 150)‘‘‘)
c.execute("SELECT * FROM test_table")
print(c.fetchall())
Table test_table
[(‘2017-6-25‘, ‘bj‘, 100.0), (‘2017-6-25‘, ‘pydataroad‘, 150.0)]
About the data in the SQLite database Visual preview, there are a lot of tools can be implemented, I use the SQLite Studio, is a free use of the tool, do not need to install, download it can be used, interested students can refer to the following links.
Https://sqlitestudio.pl/index.rvt?act=download
The results of the data preview are as follows:
Using pandas to read SQLite data files
From the running results of the above code, it can be seen that the result of the data query is a list composed of tuple. Python's list data can be inconvenient when it comes to further data processing and analysis. Imagine that if there are 1 million rows or more of data in a table in a database, iterating through the list to fetch data is inefficient.
At this point, we can consider using the function provided by pandas to read the relevant data information from the SQLite database file and save it in dataframe for further processing.
Pandas provides two functions that can read information that is suffixed with the ". SQLite" Data file.
- Read_sql ()
- Read_sql_query ()
import pandas as pd
with sqlite3.connect (‘test_database.sqlite’) as con:
# read_sql_query and read_sql can obtain data information from the database file through SQL statements
df = pd.read_sql_query ("SELECT * FROM test_table", con = con)
# df = pd.read_sql ("SELECT * FROM test_table", con = con)
print (df.shape)
print (df.dtypes)
print (df.head ())
(twenty three)
date object
city object
value float64
dtype: object
date city value
0 2017-6-25 bj 100.0
1 2017-6-25 pydataroad 150.0
Python reads and processes data files with a file suffix of ". SQLite"