Data filtering and sorting------Explore 2012 Euro Cup dataRelated data See (github)Step 1-Import the Pandas libraryimport Pandas as PDStep 2-Data set" ./data/euro2012.csv " # Euro2012.csvStep 3-Name the dataset euro12Euro12 = pd.read_csv (path2) euro12.tail ()Output:
Team
goals
Shots on target
Shots off target
Shooting accuracy
% goals-to-shotsTotal
Shots
columns]
selection¶
Note
While standard python/numpy expressions for selecting and setting are intuitive and come with handy for interactive, For production Code, we recommend the optimized pandas data access methods,. At,. IAT,. Loc,. Iloc and. IX.
The indexing section and below. getting¶
Selecting a single column, which yields a Series, equivalent to DF. A
in [[]: df[' A ']
out[23]:
2013-01-01 0.4
3
6
H
7
3
7
I
8
3
8
J
9
3
9
By using *loc, we can select some of the data in the Dataframe.
Df.loc[' a ']
Rev. 0
Test 3
col 0
name:a, Dtype:int64
# df.loc[starting index (included): Terminating index (inclusive)]
df.loc[' a ': ' d ']
Rev
Test
Col
A
0
3
0
B
1
3
1
C
2
3
2
Http://www.cnblogs.com/batteryhp/p/5006274.htmlPandas is the preferred library for subsequent content in this book. The pandas can meet the following requirements:
Data structure with automatic or explicit data alignment by axis. This prevents many common errors caused by data misalignment and data from different data sources (indexed differently).
Integrated time series capabilities
Data structures that can handle time series data as
already has column name, use data [' col1 '] to choose to take out an entire column of data. If you know column names and index, you can choose. loc simultaneously row and column selection: Data.loc[index, ' colum_names '] iloc functionUse the method with the LOC function, but no longer enter the column name, but the index:data.iloc[row_index,col_index of the input column]The functions of the IX function IX are more powerful, and the parameters can b
How do I delete the list hollow character?Easiest way: New_list = [x for x in Li if x! = ']This section mainly learns the basic operations of pandas based on the previous two data structures.设有DataFrame结果的数据a如下所示: a b cone 4 1 1two 6 2 0three 6 1 6
First, view the data (the method of viewing the object is also applicable for series)1. View Dataframe before XX line or after XX lineA=dataframe (data);A.head (6) indicates that
value
df.pivot_table (Index=col1, values=[col2,col3], Aggfunc=max) for column col2 after grouping by column col1 : Create a pivot table Df.groupby (col1) that groups col1 by column and calculates the maximum values for col2 and col3
. Agg (Np.mean): Returns the mean value of all columns grouped by column col1
( Np.mean): Apply function Np.mean data.apply (Np.max,axis=1) to each column in Dataframe
: Apply function to each row in Dataframe Np.max
Other operations:
Change column name:
Method 1
Import NumPy as NP from
Pandas import dataframe
import pandas as PD
Df=dataframe (Np.arange () reshape (3,4 ), index=[' One ', ' two ', ' THR '],columns=list (' ABCD ')
df[' A ' #取a列
df[[' A ', ' B ']] #取a, column B
#ix可以用数字索引, You can also use index and column indexes
df.ix[0] #取第0行
df.ix[0:1] #取第0行
df.ix[' one ': ' Two '] #取one, two row
df.ix[0:2,0] #取第0 , 1 rows, No. 0 column
df.ix[0:1, ' a '] #取第0行,
American Group Shop Evaluation Language Processing and classification (NLP)
The First Data Analysis section
The second visualization section,
This article is the third of the series, text classification
The main use of the package has Jieba,sklearn,pandas, this post mainly uses the word bag model (bag of words), the text in the form of a numerical feature vector (each document constructs a eigenvector, there are a lot of 0, the value ap
How do I delete the list hollow character?
Easiest way: New_list = [x for x in Li if x! = ']
Today is number No. 5.1.
This section mainly learns the basic operations of pandas based on the previous two data structures.
Data A with dataframe results is shown below: a b cone 4 1 1two 6 2 0three 6 1 6
First, view the data (the method of viewing the object is also applicable for series)
1. View Dataframe before XX line or
If you do any data analysis in the Python language, you might use pandas, a wonderful analysis library written by Wes McKinney. By giving Python data frames to analyze functionality, pandas has effectively placed Python in the same position as some of the more sophisticated analysis tools such as R or SAS.Add QQ group 813622576 or Vx:tanzhouyiwan free to receive Python learning materialsUnfortunately, in th
', DF ['v1']) #2 indicates the insert position, and V6 indicates the column name, DF ['v1 '] is the inserted value print ('insert column:') print (DF, '\ n') print (' * 50)
4. General selection methods:
Operation Method
Method
Result
Select a column
Def [col]
Sequence
Select a row using column tags
DF. Loc [col]
Sequence
Select a row by location
DF. icol [2]
Sequence
Line Cutting
DF [5: 10]
Data box
The following for you to share a Python data Analysis Library Pandas basic operation method, has a good reference value, I hope to help you. Come and see it together.
What is Pandas?
Is it it?
。。。。 Apparently pandas is not so cute as this guy ....
Let's take a look at how Pandas's official website defines itself:
Pandas
Configuration
All running nodes are installed Pyarrow, need >= 0.8 Why there is pandas UDF
Over the past few years, Python is becoming the default language for data analysts. Some similar pandas,numpy,statsmodel,scikit-learn have been used extensively, becoming the mainstream toolkit. At the same time, Spark became the standard for big data processing, and in order for data analysts to use spark, Spark add
This article brings the content is about Python pandas in-depth understanding (code example), there is a certain reference value, the need for friends can refer to, I hope to help you.
First, screening
First, create a 6X4 matrix data.
Dates = Pd.date_range (' 20180830 ', periods=6) df = PD. DataFrame (Np.arange) reshape ((6,4)), index=dates, columns=[' A ', ' B ', ' C ', ' D ']) print (DF)
Print:
A B C d2018-08-30 0 1 2 320
Abstract:Pandas is a powerful Python data Analysis Toolkit, Pandas's two main data Structures series (one-dimensional) and dataframe (two-dimensional) deal with finance, statistics, most typical use case science in society, and many engineering fields. In Spark, the Python program can be easily modified, eliminating the need for Java and Scala packaging, and if you want to export files, you can convert the data to pandas and save it to Csv,excel.What
This article mainly introduces the use of Python in the Pandas Library for CDN Log analysis of the relevant data, the article shared the pandas of the CDN log analysis of the complete sample code, and then detailed about the pandas library related content, the need for friends can reference, the following to see together. Foreword recently encountered a demand in
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.