Reading text file data in Python

Source: Internet
Author: User

This article covers:

(i) Functions for reading text-format data: read_csv, read_table

1. Reading text files with different delimiters, using the sep parameter

2. Reading text files without field names (a header), using the names parameter

3. Setting an index for a text file, using index_col

4. Skipping rows when reading a text file, using skiprows

5. Reading the text data in chunks when the data is too large, using chunksize

(ii) Function for writing data to a text file: to_csv

Examples are as follows:

(i) Reading a dataset in text file format

1. The difference between read_csv and read_table:

# read_csv reads comma-separated files by default, so you do not need to specify a delimiter with sep

import pandas as pd
pd.read_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.csv')

 

# If read_csv reads a file whose delimiter is not a comma, you must specify the delimiter with sep; otherwise each line is read as-is, without being split into fields
import pandas as pd
pd.read_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt')

  

# Compare this with the example above to see the difference
import pandas as pd
pd.read_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt', sep='|')

  

# read_table uses a tab as its default delimiter, so for a comma-separated file you must specify the delimiter with sep; otherwise each line is read as-is, without being split into fields
import pandas as pd
pd.read_table('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.csv')

 

# read_table must be given the delimiter when the file is not tab-separated
import pandas as pd
pd.read_table('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt', sep='|')
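The effect of sep can be checked without any file on disk. The following is a minimal sketch that assumes a small '|'-delimited sample held in memory; the sample contents and the variable names are hypothetical and do not come from the original data.txt.

```python
import io

import pandas as pd

# Hypothetical in-memory sample standing in for a '|'-delimited file
pipe_text = "a|b|c\n1|2|3\n4|5|6\n"

# Without sep, read_csv looks for commas, finds none, and keeps each line whole
unsplit = pd.read_csv(io.StringIO(pipe_text))

# With sep='|', every line is split into three fields
split = pd.read_csv(io.StringIO(pipe_text), sep="|")

# read_table defaults to a tab delimiter, so it needs sep='|' as well
split_table = pd.read_table(io.StringIO(pipe_text), sep="|")

print(unsplit.shape)              # (2, 1)
print(split.shape)                # (2, 3)
print(split.equals(split_table))  # True
```

Apart from the default delimiter, the two functions behave the same way here, which is why the last comparison prints True.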

  

2. When a text file is read without specifying header or names, the first row is treated as the header by default

# header=None indicates that the dataset has no header row; the column labels and the index are then filled with integers by default
pd.read_table('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt', sep='|', header=None)

  

# Use names to define a custom header
pd.read_table('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt', sep='|',
              names=['x1', 'x2', 'x3', 'x4', 'x5'])
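The three header behaviours can be compared side by side. This is a minimal sketch, assuming a hypothetical headerless sample held in memory rather than the original data.txt.

```python
import io

import pandas as pd

# Hypothetical sample with no header line
headerless = "1|2|3\n4|5|6\n"

# Default: the first line is consumed as the header, leaving one data row
default = pd.read_csv(io.StringIO(headerless), sep="|")

# header=None: both lines are data, and the columns are numbered 0, 1, 2
no_header = pd.read_csv(io.StringIO(headerless), sep="|", header=None)

# names: both lines are data, and the columns use the supplied labels
named = pd.read_csv(io.StringIO(headerless), sep="|", names=["x1", "x2", "x3"])

print(list(default.columns), len(default))      # ['1', '2', '3'] 1
print(list(no_header.columns), len(no_header))  # [0, 1, 2] 2
print(list(named.columns), len(named))          # ['x1', 'x2', 'x3'] 2
```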

 

3. By default the index is filled with integers; use index_col to specify a column as the index

names = ['x1', 'x2', 'x3', 'x4', 'x0']
pd.read_table('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt', sep='|',
              names=names, index_col='x0')
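As a rough illustration of index_col, the sketch below reads the same hypothetical in-memory sample twice, once with the default integer index and once with the x0 column as the index. The sample values are assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical sample: three value columns followed by a label column
text = "1|2|3|a\n4|5|6|b\n"
names = ["x1", "x2", "x3", "x0"]

# Default: rows are indexed 0, 1, ... and x0 stays an ordinary column
default_index = pd.read_csv(io.StringIO(text), sep="|", names=names)

# index_col='x0': the x0 column becomes the row index
by_column = pd.read_csv(io.StringIO(text), sep="|", names=names, index_col="x0")

print(list(default_index.index))  # [0, 1]
print(list(by_column.index))      # ['a', 'b']
print(list(by_column.columns))    # ['x1', 'x2', 'x3']
print(by_column.loc["b", "x2"])   # 5
```

With a column as the index, rows can be looked up by label with loc, as in the last line.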

  

4. The following examples use skiprows to skip the rows that contain hello and read the remaining rows. Regardless of whether the first row is a header, rows are counted from 0, starting at the first line of the file


Compare the three skiprows examples below to understand the differences

pd.read_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt')

names = ['x1', 'x2', 'x3', 'x4', 'x0']
pd.read_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt', names=names,
            skiprows=[0, 3, 6])

  

pd.read_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt',
            skiprows=[0, 3, 6])

  

pd.read_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt', header=None,
            skiprows=[0, 3, 6])
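The interaction between skiprows and the header settings can be sketched with an in-memory sample. The 8-line layout below, with hello on lines 0, 3 and 6, is an assumption modelled on the description above, not the actual contents of data1.txt.

```python
import io

import pandas as pd

# Hypothetical 8-line sample with 'hello' on file lines 0, 3 and 6
text = "hello\na,b,c\n1,2,3\nhello\n4,5,6\n7,8,9\nhello\n10,11,12\n"
skip = [0, 3, 6]  # 0-based line numbers counted from the top of the file

# After skipping, the first remaining line (a,b,c) becomes the header
inferred = pd.read_csv(io.StringIO(text), skiprows=skip)

# header=None keeps a,b,c as a data row and numbers the columns
no_header = pd.read_csv(io.StringIO(text), skiprows=skip, header=None)

# names also keeps a,b,c as a data row, under the supplied labels
named = pd.read_csv(io.StringIO(text), skiprows=skip, names=["x1", "x2", "x3"])

print(inferred.shape)   # (4, 3)
print(no_header.shape)  # (5, 3)
print(named.shape)      # (5, 3)
```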

  

5. Reading in chunks: data1.txt contains 8 lines in total. With the first line used as the header and 3 rows per chunk, the data is read in 3 passes: 3 rows the first time, 3 rows the second time, and 1 row the third time.


Note that chunking differs from skipping rows here: the header line is not counted as a row of the first chunk. Comparing the two examples below makes this clear.

chunker = pd.read_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt', chunksize=3)
for m in chunker:
    print(len(m))
    print(m)

  

chunker = pd.read_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt', header=None,
                      chunksize=3)
for m in chunker:
    print(len(m))
    print(m)
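The chunk sizes in the two cases can be verified with a small sketch. It assumes a hypothetical 8-line in-memory sample (one header line plus seven data lines) rather than the original data1.txt.

```python
import io

import pandas as pd

# Hypothetical 8-line sample: one header line followed by seven data lines
lines = ["a,b"] + [f"{i},{i * 10}" for i in range(7)]
text = "\n".join(lines) + "\n"

# The header line is consumed first, so 7 data rows are split as 3 + 3 + 1
with_header = [len(chunk) for chunk in pd.read_csv(io.StringIO(text), chunksize=3)]

# With header=None all 8 lines are data rows, split as 3 + 3 + 2
without_header = [
    len(chunk)
    for chunk in pd.read_csv(io.StringIO(text), header=None, chunksize=3)
]

print(with_header)     # [3, 3, 1]
print(without_header)  # [3, 3, 2]
```

Each chunk is an ordinary DataFrame, so per-chunk processing can be written the same way as for a fully loaded file.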

  

(ii) Writing data to text format with to_csv


Taking data.txt as an example, note that when a file is written, the index is also written to the file by default

data = pd.read_table('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt', sep='|')
print(data)

  

# Use index=False to suppress writing the index
data = pd.read_table('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt', sep='|')
data.to_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\outdata.txt', sep='!', index=False)

  

# Use columns to specify which columns are written
data = pd.read_table('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt', sep='|')
data.to_csv('C:\\users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\outdata2.txt', sep=',', index=False,
            columns=['a', 'c', 'd'])
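To see exactly what each option writes, the output can be inspected as a string. This sketch relies on to_csv returning the CSV text when it is called without a path; the small DataFrame is a hypothetical stand-in for the contents of data.txt.

```python
import pandas as pd

# Hypothetical data standing in for the contents of data.txt
data = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Called without a path, to_csv returns the text instead of writing a file
with_index = data.to_csv().splitlines()
no_index = data.to_csv(sep="!", index=False).splitlines()
subset = data.to_csv(index=False, columns=["a", "c"]).splitlines()

print(with_index)  # [',a,b,c', '0,1,3,5', '1,2,4,6']
print(no_index)    # ['a!b!c', '1!3!5', '2!4!6']
print(subset)      # ['a,c', '1,5', '2,6']
```

The leading empty field in the first result is the unnamed index column, which is what index=False removes.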

  

 

 

 
