Reading specified columns of a CSV file with pandas

Source: Internet
Author: User
This article shows how to read only specified columns of a CSV file with pandas.

After following a tutorial on reading only the first few rows of a CSV file, I wondered whether it was possible to read only the first few columns as well. After a number of attempts I found a method that works.

The reason I want to read only the first few columns is that I have a CSV file on hand whose last few columns contain no data, yet those columns are always present. The original data is as follows:

Greydemac-mini:chapter06 greyzhang$ cat data.csv
1,name_01,coment_01,,,,
2,name_02,coment_02,,,,
3,name_03,coment_03,,,,
4,name_04,coment_04,,,,
5,name_05,coment_05,,,,
6,name_06,coment_06,,,,
7,name_07,coment_07,,,,
8,name_08,coment_08,,,,
9,name_09,coment_09,,,,
10,name_10,coment_10,,,,
11,name_11,coment_11,,,,
12,name_12,coment_12,,,,
13,name_13,coment_13,,,,
14,name_14,coment_14,,,,
15,name_15,coment_15,,,,
16,name_16,coment_16,,,,
17,name_17,coment_17,,,,
18,name_18,coment_18,,,,
19,name_19,coment_19,,,,
20,name_20,coment_20,,,,
21,name_21,coment_21,,,,

If you use pandas to read all the data, the following results will appear when you print:

In [41]: data = pd.read_csv('data.csv')

In [42]: data
Out[42]:
     1  name_01  coment_01  Unnamed: 3  Unnamed: 4  Unnamed: 5  Unnamed: 6
0    2  name_02  coment_02         NaN         NaN         NaN         NaN
1    3  name_03  coment_03         NaN         NaN         NaN         NaN
2    4  name_04  coment_04         NaN         NaN         NaN         NaN
3    5  name_05  coment_05         NaN         NaN         NaN         NaN
4    6  name_06  coment_06         NaN         NaN         NaN         NaN
5    7  name_07  coment_07         NaN         NaN         NaN         NaN
6    8  name_08  coment_08         NaN         NaN         NaN         NaN
7    9  name_09  coment_09         NaN         NaN         NaN         NaN
8   10  name_10  coment_10         NaN         NaN         NaN         NaN
9   11  name_11  coment_11         NaN         NaN         NaN         NaN
10  12  name_12  coment_12         NaN         NaN         NaN         NaN
11  13  name_13  coment_13         NaN         NaN         NaN         NaN
12  14  name_14  coment_14         NaN         NaN         NaN         NaN
13  15  name_15  coment_15         NaN         NaN         NaN         NaN
14  16  name_16  coment_16         NaN         NaN         NaN         NaN
15  17  name_17  coment_17         NaN         NaN         NaN         NaN
16  18  name_18  coment_18         NaN         NaN         NaN         NaN
17  19  name_19  coment_19         NaN         NaN         NaN         NaN
18  20  name_20  coment_20         NaN         NaN         NaN         NaN
19  21  name_21  coment_21         NaN         NaN         NaN         NaN

The empty columns do not get in the way of learning, but I prefer cleaner output in the terminal. The usecols parameter of read_csv reduces this clutter to some extent.

In [45]: data = pd.read_csv('data.csv', usecols=[0,1,2,3])

In [46]: data
Out[46]:
     1  name_01  coment_01  Unnamed: 3
0    2  name_02  coment_02         NaN
1    3  name_03  coment_03         NaN
2    4  name_04  coment_04         NaN
3    5  name_05  coment_05         NaN
4    6  name_06  coment_06         NaN
5    7  name_07  coment_07         NaN
6    8  name_08  coment_08         NaN
7    9  name_09  coment_09         NaN
8   10  name_10  coment_10         NaN
9   11  name_11  coment_11         NaN
10  12  name_12  coment_12         NaN
11  13  name_13  coment_13         NaN
12  14  name_14  coment_14         NaN
13  15  name_15  coment_15         NaN
14  16  name_16  coment_16         NaN
15  17  name_17  coment_17         NaN
16  18  name_18  coment_18         NaN
17  19  name_19  coment_19         NaN
18  20  name_20  coment_20         NaN
19  21  name_21  coment_21         NaN

To make the "boundary" of the data visible, the read above keeps the first of the empty columns. In normal use we would probably want to drop that last column as well, which only requires removing its column number from the usecols list.

In [47]: data = pd.read_csv('data.csv', usecols=[0,1,2])

In [48]: data
Out[48]:
     1  name_01  coment_01
0    2  name_02  coment_02
1    3  name_03  coment_03
2    4  name_04  coment_04
3    5  name_05  coment_05
4    6  name_06  coment_06
5    7  name_07  coment_07
6    8  name_08  coment_08
7    9  name_09  coment_09
8   10  name_10  coment_10
9   11  name_11  coment_11
10  12  name_12  coment_12
11  13  name_13  coment_13
12  14  name_14  coment_14
13  15  name_15  coment_15
14  16  name_16  coment_16
15  17  name_17  coment_17
16  18  name_18  coment_18
17  19  name_19  coment_19
18  20  name_20  coment_20
19  21  name_21  coment_21
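
The same read can be tried without a file on disk. The following is a minimal sketch, assuming a three-row excerpt of the data held in a string and read through io.StringIO; the variable names are illustrative and do not come from the original session. It also shows header=None, a read_csv parameter this article does not use, which may be useful because the file has no header row and the default read turns the first data row into column names.

```python
import io

import pandas as pd

# Stand-in for data.csv: three real columns followed by four empty ones.
# Only the first three rows are reproduced to keep the sketch short.
csv_text = (
    "1,name_01,coment_01,,,,\n"
    "2,name_02,coment_02,,,,\n"
    "3,name_03,coment_03,,,,\n"
)

# Same call as in the article: keep only the columns at positions 0, 1 and 2.
# With the default header handling, the first row becomes the column names.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 1, 2])

# Assumption: the file has no header row. header=None keeps the first row
# as data and labels the selected columns 0, 1 and 2.
no_header = pd.read_csv(io.StringIO(csv_text), header=None, usecols=[0, 1, 2])

print(by_position.shape)  # (2, 3): first row was consumed as the header
print(no_header.shape)    # (3, 3): all three rows are kept as data
```

In both calls usecols selects columns by position, so only the three populated columns end up in the resulting DataFrame.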
