Python array, list, and DataFrame index slicing operations, July 19, 2016, Zhilang document
List, one-dimensional and two-dimensional array, DataFrame, loc, iloc, and ix
Nump
Array, list, and DataFrame index slicing operations, July 19, 2016, Zhilang document
A brief discussion of list, one-dimensional and two-dimensional array, DataFrame, loc, iloc, and ix
NumPy array indexing and slicing:
Starting with the most basic list index, let's begin with some code and its result:
a = [0,1,2,3,4,5,6,7,8,9]
a[:5:-1] #step
Output:
[9, 8, 7, 6]
[]
[1, 0]
List slice, i
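The snippet above lists three outputs, but only the first slice expression survives in the excerpt. As a minimal sketch of how a negative step behaves, the following uses `a[:5:-1]` from the source plus two expressions, `a[0:5:-1]` and `a[1::-1]`, that are assumptions chosen because they produce the same three outputs. They are not necessarily the expressions the author used.

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# With a negative step and no explicit start, slicing begins at the last
# element and walks backward, stopping before index 5.
tail_reversed = a[:5:-1]

# Walking backward from index 0 can never reach index 5, so nothing is selected.
# (Assumed expression, not taken from the source.)
empty = a[0:5:-1]

# Start at index 1 and walk backward to the beginning of the list.
# (Assumed expression, not taken from the source.)
head_reversed = a[1::-1]

print(tail_reversed)  # [9, 8, 7, 6]
print(empty)          # []
print(head_reversed)  # [1, 0]
```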
1. Create a DataFrame from a dictionary
>>> import pandas
>>> dict_a = {'user_id':['Webbang','Webbang','Webbang'],'book_id':['3713327','4074636','26873486'],'rating':['4','4','4'],'mark_date':['2017-03-07','2017-03-07','2017-03-07']}
>>> df = pandas.DataFrame(dict_a)  # Create a DataFrame from a dictionary
>>> df  # The created df column names are sorted alphabetically by
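The remark about alphabetical column order depends on the pandas version. To my understanding, older releases sorted dictionary keys, while recent releases keep the insertion order of the dictionary. The sketch below reuses `dict_a` from the snippet and sorts the column labels explicitly, so the result is the same either way. The name `df_sorted` is introduced here for illustration only.

```python
import pandas as pd

dict_a = {
    'user_id': ['Webbang', 'Webbang', 'Webbang'],
    'book_id': ['3713327', '4074636', '26873486'],
    'rating': ['4', '4', '4'],
    'mark_date': ['2017-03-07', '2017-03-07', '2017-03-07'],
}

# Build the DataFrame; each dictionary key becomes a column.
df = pd.DataFrame(dict_a)

# Selecting by the sorted column labels gives alphabetical order regardless
# of how the installed pandas version orders dictionary keys.
df_sorted = df[sorted(df.columns)]

print(df_sorted.columns.tolist())
```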
(Code for creating the connection and cursor is omitted here)
sql1 = "SELECT * FROM table name"  # SQL statement 1
cursor1.execute(sql1)  # Execute SQL statement 1
read1 = list(cursor1.fetchall())  # Read result 1
sql2 = "SHOW FULL COLUMNS FROM table name"  # SQL statement 2
cursor1.execute(sql2)  # Execute SQL statement 2
read2 = list(cursor1.fetchall())  # Assign to a variable after reading result 2 and conv
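The snippet above fetches the rows and then the column metadata with two statements against what appears to be a MySQL server. As a rough, self-contained sketch of the same idea, the following swaps in the standard-library `sqlite3` module and reads the column names from the DB-API `cursor.description` attribute instead of `SHOW FULL COLUMNS`. The table `ratings` and its contents are hypothetical.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the real connection (assumption).
conn = sqlite3.connect(':memory:')
cursor1 = conn.cursor()
cursor1.execute("CREATE TABLE ratings (user_id TEXT, rating INTEGER)")
cursor1.executemany("INSERT INTO ratings VALUES (?, ?)", [('u1', 4), ('u2', 5)])

# Fetch all rows, as in the original snippet.
cursor1.execute("SELECT * FROM ratings")
read1 = list(cursor1.fetchall())

# cursor.description holds one tuple per result column; item 0 is the column name.
column_names = [col[0] for col in cursor1.description]

# Combine rows and column names into a DataFrame.
df = pd.DataFrame(read1, columns=column_names)
conn.close()

print(df)
```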
This article introduces methods for operating on DataFrame data in Python pandas. It has some reference value and is shared here for anyone who needs it.
pandas, the Python data analysis tool, uses DataFrame and Series as its primary data structures.
This article is mainly about how to oper
import matplotlib.pyplot as plt
a = Series(np.random.randn(1000), index=pd.date_range('20100101', periods=1000))
b = a.cumsum()
b.plot()
plt.show()  # Be sure to add plt.show() at the end, or the graph will not appear.
You can also use the following code to generate multiple time series plots:
a = DataFrame(np.random.randn(1000, 4), index=pd.date_range('2010
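The plotting snippet relies on names (`Series`, `DataFrame`, `np`, `pd`) whose imports are not shown. A minimal sketch of the data-preparation part, with the imports spelled out, could look like this. The seeded generator, the column names `A` to `D`, and applying `cumsum()` to the four-column frame are all assumptions, since the second snippet is cut off.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded generator (assumption, for repeatability)
dates = pd.date_range('20100101', periods=1000)

# One random series and its running total, as in the first snippet.
a = pd.Series(rng.standard_normal(1000), index=dates)
b = a.cumsum()

# Four random series sharing the same date index (column names are assumed).
frame = pd.DataFrame(rng.standard_normal((1000, 4)), index=dates, columns=list('ABCD'))
walks = frame.cumsum()

print(b.tail(3))
print(walks.shape)
```

With matplotlib installed, calling `b.plot()` or `walks.plot()` followed by `plt.show()` should draw these series as line charts.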
']
df_obj['user number'].isin(alist)  # Put the values to filter on into a list, then filter with isin; this returns the row index and a match result for each row, True where the value matches
df_obj[df_obj['user number'].isin(alist)]  # Get the rows whose match result is True
Fuzzy filtering with a DataFrame (like LIKE in SQL):
df_obj[df_obj['package'].str.contains(r'.*Voice cdma.*')]  # Fuzzy matching with a regular expression; * matches 0 or more times, ? matches 0 or 1 time
Data c
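To make the two filters above concrete, here is a small sketch on made-up data. The rows in `df_obj` and the values in `alist` are hypothetical. Only the column names follow the translated snippet.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the translated snippet.
df_obj = pd.DataFrame({
    'user number': ['1001', '1002', '1003', '1004'],
    'package': ['Voice cdma basic', 'Data only', 'Family Voice cdma plus', 'Broadband'],
})

# Exact-match filtering: isin returns a boolean Series aligned with the rows.
alist = ['1001', '1003']
mask = df_obj['user number'].isin(alist)
matched = df_obj[mask]

# Fuzzy filtering: str.contains treats the pattern as a regular expression
# and matches it anywhere in the string.
fuzzy = df_obj[df_obj['package'].str.contains(r'Voice cdma')]

print(matched)
print(fuzzy)
```

Because `str.contains` searches anywhere in the string by default, the leading and trailing `.*` in the original pattern are optional.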
......
dict_data = {}
# Open the file
with open('File_in.txt', 'r') as df:
    # Read each line
    for line in df:
        # If the line is only a newline, skip it; the count of '\n' equals the line length only for an empty line
        if line.count('\n') == len(line):
            continue
        # Strip leading and trailing whitespace (if any) from each line, then split on ":"
        for kv in [line.strip().split(':')]:
            # Write the value under its key in dict_data
            dict_data.setdefault(kv[0], []).append(kv[1])
            # print(dict_data) to see the effect
# This reads the keys into a list
columnsname=lis
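The loop above collects `key:value` lines into a dictionary of lists, which pandas can turn into a DataFrame directly. A self-contained sketch of that flow follows. The sample text is hypothetical and is read through `io.StringIO` rather than from the real input file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the input file: "key:value" lines, with a blank
# line between records.
text = "name:Alice\nage:30\n\nname:Bob\nage:25\n"

dict_data = {}
for line in io.StringIO(text):
    line = line.strip()
    if not line:
        # Skip empty lines.
        continue
    # Split on the first ":" only, so values may themselves contain colons.
    key, value = line.split(':', 1)
    dict_data.setdefault(key, []).append(value)

# Each key becomes a column; each list of values becomes that column's rows.
df = pd.DataFrame(dict_data)
print(df)
```

This only works when every key occurs the same number of times, because pandas needs lists of equal length.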
), columns=['A', 'B', 'C', 'D', 'E'])
DataFrame data preview:
          A         B         C         D         E
0  0.673092  0.230338 -0.171681  0.312303 -0.184813
1 -0.504482 -0.344286 -0.050845 -0.811277 -0.298181
2  0.542788  0.207708  0.651379 -0.656214  0.507595
3 -0.249410  0.131549 -2.198480 -0.437407  1.628228
Calculate the sum across the columns of each row and append it as a new column
df['col_sum'] = df.apply(lambda x: x.sum(), axis=1)
Calculate the sum of each column and add it to
['col_sum'] = df.apply(lambda x: x.sum(), axis=1)
Calculate the sum of each column and append it to the end as a new row
df.loc['row_sum'] = df.apply(lambda x: x.sum())
Final data results:
                A         B         C         D         E   col_sum
0        0.673092  0.230338 -0.171681  0.312303 -0.184813  0.859238
1       -0.504482 -0.344286 -0.050845 -0.811277 -0.298181 -2.009071
2        0.542788  0.207708  0.651379 -0.656214  0.507595  1.253256
3       -0.249410  0.131549 -2.198480 -0.437407  1.628228 -1.125520
row_sum  0.461987  0.225310 -1.769627 -1.592595  1.652828 -1.0220
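The same two totals can be computed without `apply`, since `DataFrame.sum` takes an `axis` argument. The sketch below uses small made-up numbers so the result is easy to check by hand. The order matters: adding `col_sum` first means the `row_sum` row also totals that new column, as in the result table above.

```python
import pandas as pd

# Small hypothetical frame with easy-to-check values.
df = pd.DataFrame([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], columns=['A', 'B', 'C'])

# axis=1 sums across the columns of each row; the result becomes a new column.
df['col_sum'] = df.sum(axis=1)

# The default axis=0 sums down each column (including col_sum); assigning to
# a new label with .loc appends the result as a new row.
df.loc['row_sum'] = df.sum()

print(df)
```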
This article shares an example of reading data from a text file in Python and converting it into a DataFrame. It has some reference value and will hopefully help those who need it.
I came across a question like this in a technical Q&A forum. It seemed fairly common, so I wrote an article about it.
Read the data from the plain-text file "File_in", which has the following format:
The output n
pandas.DataFrame
class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-li
']], columns=['p1', 'p2', 'p3'])
In [4]: df
Out[4]:
    p1   p2   p3
0   GD   GX   FJ
1   SD   SX   BJ
2   HN   HB   AH
3  HEN  HEN  HLJ
4   SH   TJ   CQ
If you only want the two rows whose p1 is GD or HN, you can do this:
In [8]: df[df.p1.isin(['GD', 'HN'])]
Out[8]:
   p1  p2  p3
0  GD  GX  FJ
2  HN  HB  AH
However, if we want all the data except those two rows, we need a workaround.
The idea is to first extract p1 and convert it to a list, then remove the unwanted values from the list, and then use isin()
In [9]: ex_list = list(df.p1)
In [
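The workaround described above can be sketched end to end as follows, using the five rows from the printed table. The second approach, negating the boolean mask with `~`, is not part of the excerpt. It is shown here as a shorter alternative that selects the same rows.

```python
import pandas as pd

# Rows copied from the table printed in the snippet.
df = pd.DataFrame(
    [['GD', 'GX', 'FJ'], ['SD', 'SX', 'BJ'], ['HN', 'HB', 'AH'],
     ['HEN', 'HEN', 'HLJ'], ['SH', 'TJ', 'CQ']],
    columns=['p1', 'p2', 'p3'],
)

# Approach from the article: build the list of values to keep, then use isin.
ex_list = list(df.p1)
ex_list.remove('GD')
ex_list.remove('HN')
via_list = df[df.p1.isin(ex_list)]

# Alternative (not in the source): invert the boolean mask with ~.
via_negation = df[~df.p1.isin(['GD', 'HN'])]

print(via_negation)
```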
Let's create a data frame by hand.
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(0, 2).reshape(3), columns=list('abc'))
df looks like this.
So how do you choose among the three ways of selecting data?
First, when each column already has a column name, df['a'] selects a whole column of data. If you know column names and
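As a rough illustration of the selection styles being compared, the sketch below builds a small frame and picks data by column name, by label with `loc`, and by position with `iloc`. The 3x3 shape and the `np.arange(0, 18, 2)` values are assumptions, because the numeric arguments in the excerpt did not survive. The `ix` indexer is left out because it was removed in pandas 1.0.

```python
import numpy as np
import pandas as pd

# Shape and values are assumptions; the original arguments were lost.
df = pd.DataFrame(np.arange(0, 18, 2).reshape(3, 3), columns=list('abc'))

# Select a whole column by its name.
col_a = df['a']

# loc selects by label: row label 1, column label 'b'.
by_label = df.loc[1, 'b']

# iloc selects by integer position: second row, second column.
by_position = df.iloc[1, 1]

print(col_a.tolist())
print(by_label, by_position)
```

Here the row labels happen to equal the positions, so `loc` and `iloc` return the same cell. With a non-default index they would differ.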
2 DataFrame
A: Passing in a dict of equal-length lists builds a DataFrame with an automatically assigned index
data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
        'year': [ -, 2001, 2002, 2001, 2002],
        'pop': [1.5, 1.7, 3.6, 2.1, 2.9]}
frame = DataFrame(data)
B: Specify the column order (previously sorted by default)
DataFrame(data, columns=['year', 'state', 'pop'])
C: When the d
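Points A and B can be tried with the following sketch. The first `year` value is missing from the excerpt, so `2000` is used here purely as a placeholder assumption. The other values are copied from the snippet.

```python
import pandas as pd

data = {
    'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
    'year': [2000, 2001, 2002, 2001, 2002],  # first value is an assumed placeholder
    'pop': [1.5, 1.7, 3.6, 2.1, 2.9],
}

# A: a dict of equal-length lists; pandas assigns the index 0..4 automatically.
frame = pd.DataFrame(data)

# B: the columns argument fixes the column order explicitly.
reordered = pd.DataFrame(data, columns=['year', 'state', 'pop'])

print(frame.index.tolist())
print(reordered.columns.tolist())
```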
rows for GD and HN, you can do this:
In [8]: df[df.p1.isin(['GD', 'HN'])]
Out[8]:
   p1  p2  p3
0  GD  GX  FJ
2  HN  HB  AH
But if we want the data other than these two rows, we need a workaround.
The idea is to first extract p1 and convert it to a list, then remove the unwanted values from the list, and then use isin() on the DataFrame
In [9]: ex_list = list(df.p1)
In [10]: ex_list.remove('GD')
In [11]: ex_list.remove('HN')
In [12]: ex_list
Out[12]: ['SD', 'HEN', 'SH
Using Python for data analysis (7): pandas (Series and DataFrame)
1. What is pandas?
pandas is a Python data analysis package built on NumPy. It provides a large number of advanced data structures and data processing methods. pandas has two main data structures: Series and DataFrame.
2. Series
A Series is a one-dimensional array obje