Reading specified columns of a CSV file with pandas

Last Update: 2018-04-21 Source: Internet

Author: User

Tags: pandas read csv

This article shares a way to read only specified columns of a CSV file with pandas. I hope it is a useful reference.

Following a tutorial, I had already read the first few rows of a CSV file, which made me wonder whether the first few columns could be read in a similar way. After many attempts, I finally found a method that works.

The reason I want to read only the first few columns is that I have a CSV file on hand whose last few columns contain no data, yet they are always present. The original data is as follows:

Greydemac-mini:chapter06 greyzhang$ cat data.csv
1,name_01,coment_01,,,,
2,name_02,coment_02,,,,
3,name_03,coment_03,,,,
4,name_04,coment_04,,,,
5,name_05,coment_05,,,,
6,name_06,coment_06,,,,
7,name_07,coment_07,,,,
8,name_08,coment_08,,,,
9,name_09,coment_09,,,,
10,name_10,coment_10,,,,
11,name_11,coment_11,,,,
12,name_12,coment_12,,,,
13,name_13,coment_13,,,,
14,name_14,coment_14,,,,
15,name_15,coment_15,,,,
16,name_16,coment_16,,,,
17,name_17,coment_17,,,,
18,name_18,coment_18,,,,
19,name_19,coment_19,,,,
20,name_20,coment_20,,,,
21,name_21,coment_21,,,,

If you read all of the data with pandas, printing it gives the following result:

In [41]: data = pd.read_csv('data.csv')

In [42]: data
Out[42]:
     1  name_01  coment_01  Unnamed: 3  Unnamed: 4  Unnamed: 5  Unnamed: 6
0    2  name_02  coment_02         NaN         NaN         NaN         NaN
1    3  name_03  coment_03         NaN         NaN         NaN         NaN
2    4  name_04  coment_04         NaN         NaN         NaN         NaN
3    5  name_05  coment_05         NaN         NaN         NaN         NaN
4    6  name_06  coment_06         NaN         NaN         NaN         NaN
5    7  name_07  coment_07         NaN         NaN         NaN         NaN
6    8  name_08  coment_08         NaN         NaN         NaN         NaN
7    9  name_09  coment_09         NaN         NaN         NaN         NaN
8   10  name_10  coment_10         NaN         NaN         NaN         NaN
9   11  name_11  coment_11         NaN         NaN         NaN         NaN
10  12  name_12  coment_12         NaN         NaN         NaN         NaN
11  13  name_13  coment_13         NaN         NaN         NaN         NaN
12  14  name_14  coment_14         NaN         NaN         NaN         NaN
13  15  name_15  coment_15         NaN         NaN         NaN         NaN
14  16  name_16  coment_16         NaN         NaN         NaN         NaN
15  17  name_17  coment_17         NaN         NaN         NaN         NaN
16  18  name_18  coment_18         NaN         NaN         NaN         NaN
17  19  name_19  coment_19         NaN         NaN         NaN         NaN
18  20  name_20  coment_20         NaN         NaN         NaN         NaN
19  21  name_21  coment_21         NaN         NaN         NaN         NaN
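The result above can be reproduced without the file on disk. The following is a minimal sketch, assuming a three-row excerpt of the data held in memory through io.StringIO (the excerpt and the variable names are illustrative and not part of the original session). It shows that, with default settings, read_csv uses the first line as the header row and gives the empty header cells placeholder names.

```python
import io

import pandas as pd

# Three-row excerpt of data.csv, held in memory (an assumption for this sketch).
SAMPLE_CSV = (
    "1,name_01,coment_01,,,,\n"
    "2,name_02,coment_02,,,,\n"
    "3,name_03,coment_03,,,,\n"
)

# With default settings the first line is consumed as the header row.
data = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Empty header cells receive placeholder names of the form "Unnamed: <position>".
column_names = list(data.columns)
print(column_names)

# Collect the columns in which every value is missing.
empty_columns = [name for name in data.columns if data[name].isna().all()]
print(empty_columns)
```

Because the file has no header line of its own, the first record ends up as the column labels, which is why the printed frame starts at the record numbered 2.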

The extra empty columns do not get in the way of learning, but I prefer cleaner output in the terminal. The usecols parameter of read_csv reduces the clutter to some extent.

In [45]: data = pd.read_csv('data.csv', usecols=[0,1,2,3])

In [46]: data
Out[46]:
     1  name_01  coment_01  Unnamed: 3
0    2  name_02  coment_02         NaN
1    3  name_03  coment_03         NaN
2    4  name_04  coment_04         NaN
3    5  name_05  coment_05         NaN
4    6  name_06  coment_06         NaN
5    7  name_07  coment_07         NaN
6    8  name_08  coment_08         NaN
7    9  name_09  coment_09         NaN
8   10  name_10  coment_10         NaN
9   11  name_11  coment_11         NaN
10  12  name_12  coment_12         NaN
11  13  name_13  coment_13         NaN
12  14  name_14  coment_14         NaN
13  15  name_15  coment_15         NaN
14  16  name_16  coment_16         NaN
15  17  name_17  coment_17         NaN
16  18  name_18  coment_18         NaN
17  19  name_19  coment_19         NaN
18  20  name_20  coment_20         NaN
19  21  name_21  coment_21         NaN
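As a self-contained illustration of usecols with zero-based column positions, here is a minimal sketch that again assumes a three-row in-memory excerpt of the data (the excerpt and variable names are illustrative). It compares the full read with the read restricted to the first four columns.

```python
import io

import pandas as pd

# Three-row excerpt of data.csv, held in memory (an assumption for this sketch).
SAMPLE_CSV = (
    "1,name_01,coment_01,,,,\n"
    "2,name_02,coment_02,,,,\n"
    "3,name_03,coment_03,,,,\n"
)

# Read every column, then only the columns at positions 0 to 3.
full = pd.read_csv(io.StringIO(SAMPLE_CSV))
first_four = pd.read_csv(io.StringIO(SAMPLE_CSV), usecols=[0, 1, 2, 3])

# The row count is unchanged; only the number of columns shrinks.
print(full.shape, first_four.shape)
print(list(first_four.columns))
```

The integers passed to usecols are positions in the file, counted from 0, so the list does not depend on what the columns happen to be named.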

The usecols=[0,1,2,3] read keeps the first of the empty columns so that the "boundary" of the data is visible. In normal use we would probably want to drop that last column as well, which only requires removing its column number from the usecols list.

In [47]: data = pd.read_csv('data.csv', usecols=[0,1,2])

In [48]: data
Out[48]:
     1  name_01  coment_01
0    2  name_02  coment_02
1    3  name_03  coment_03
2    4  name_04  coment_04
3    5  name_05  coment_05
4    6  name_06  coment_06
5    7  name_07  coment_07
6    8  name_08  coment_08
7    9  name_09  coment_09
8   10  name_10  coment_10
9   11  name_11  coment_11
10  12  name_12  coment_12
11  13  name_13  coment_13
12  14  name_14  coment_14
13  15  name_15  coment_15
14  16  name_16  coment_16
15  17  name_17  coment_17
16  18  name_18  coment_18
17  19  name_19  coment_19
18  20  name_20  coment_20
19  21  name_21  coment_21
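One detail of the reads shown here is that the first record becomes the column labels, because the file has no header line. A possible variation, sketched below under the assumption that the first line should be treated as data, passes header=None together with usecols. The column names id, name and comment are hypothetical labels chosen for this sketch, and the data is a three-row in-memory excerpt rather than the real file.

```python
import io

import pandas as pd

# Three-row excerpt of data.csv, held in memory (an assumption for this sketch).
SAMPLE_CSV = (
    "1,name_01,coment_01,,,,\n"
    "2,name_02,coment_02,,,,\n"
    "3,name_03,coment_03,,,,\n"
)

# header=None tells read_csv that the file has no header line,
# so the first record is kept as a data row.
data = pd.read_csv(io.StringIO(SAMPLE_CSV), header=None, usecols=[0, 1, 2])

# Without a header, the selected columns are labeled by position: 0, 1, 2.
positional_labels = list(data.columns)

# Hypothetical descriptive labels, assigned after reading.
data.columns = ["id", "name", "comment"]
print(data)
```

Whether this is preferable depends on the goal: the reads in the article are enough to hide the empty columns, while header=None additionally keeps the first record among the data rows.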

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
