1. From http://www.lfd.uci.edu/~gohlke/pythonlibs/#mysql-python, download the version of the required dependency package that matches your environment. For example, my Python version is 3.5, so the matching NumPy package is numpy-1.11.1+mkl-cp35-cp35m-win_amd64.whl: cp35-cp35m corresponds to Python 3.5, and win_amd64 corresponds to 64-bit Windows.
2. Save the downloaded dependency package to the Scripts folder in the Python installation folder, my
install: in python27\scripts, run python ez_setup.py
4.3) Install pip: in python27\scripts, run easy_install pip
5. Matplotlib
In addition to the above 4 items, it is also important to note:
1) dateutil 1.1 or later
Provides extensions to Python datetime handling. If using pip, easy_install, or installing from source, the installer will attempt to download and install python_dateutil from PyPI: python_dateutil-2.4.2-py2.py3-none-any.whl
2) pyparsing
Required for matplotlib's mathtext math rendering support.
Operating environment: Python 3.6 + 64-bit Windows
1. Install pip
(1) If you tick the pip option when installing Python 3.6, the pip installation files are included with Python 3.6
Installation method:
Main source: http://www.lfd.uci.edu/~gohlke/pythonlibs/
Follow these steps to install: open a command prompt (cmd), preferably running as an administrator. In cmd, use the cd command to change to the Python installation directory, then cd into its Scripts folder; under this folder, t
This article shares common functions of the pandas library. Let's take a look; I hope it helps everyone learning pandas.
1. Handling missing values in a DataFrame
pandas.DataFrame.dropna
df2.dropna(axis=0, how='any', subset=[u'ToC'], inplace=True)
removes the rows in which the ToC column has missing values.
2. Counting duplicate rows based on one dimension
pandas.DataFrame.duplicated
print df.duplicated(['Name']).value_counts
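The two calls above can be tried on a small frame. This is only a sketch: the column names 'Name' and 'ToC' follow the excerpt, while the values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only the column names come from the text above.
df2 = pd.DataFrame({
    "Name": ["a", "b", "b", "c"],
    "ToC": [1.0, 2.0, 3.0, np.nan],
})

# Drop every row whose 'ToC' value is missing, modifying df2 in place.
df2.dropna(axis=0, how="any", subset=["ToC"], inplace=True)

# Flag rows that repeat an earlier 'Name', then count flagged vs. unflagged rows.
dup_counts = df2.duplicated(["Name"]).value_counts()
print(df2)
print(dup_counts)
```

Because inplace=True is used, dropna returns None and df2 itself loses the incomplete row.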
1. Create a DataFrame from a dictionary
>>> import pandas
>>> dict_a = {'user_id':['Webbang','Webbang','Webbang'],'book_id':['3713327','4074636','26873486'],'rating':['4','4','4'],'mark_date':['2017-03-07','2017-03-07','2017-03-07']}
>>> df = pandas.DataFrame(dict_a)  # Create a DataFrame from a dictionary
>>> df  # The column names of the created df are sorted alphabetically by default, which is not the same as the order in the dictionary; the dictionary order is 'user_id', '
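The column order no longer has to depend on defaults: older pandas releases sorted dictionary keys alphabetically, while recent ones keep insertion order. A minimal sketch that pins the order explicitly with the columns argument; the type conversions at the end are an assumption about what the string values are meant to represent.

```python
import pandas as pd

dict_a = {
    "user_id": ["Webbang", "Webbang", "Webbang"],
    "book_id": ["3713327", "4074636", "26873486"],
    "rating": ["4", "4", "4"],
    "mark_date": ["2017-03-07", "2017-03-07", "2017-03-07"],
}

# columns= fixes the column order regardless of the pandas version in use.
df = pd.DataFrame(dict_a, columns=["user_id", "book_id", "rating", "mark_date"])

# Assumption: ratings are meant to be integers and mark_date a date.
df["rating"] = df["rating"].astype(int)
df["mark_date"] = pd.to_datetime(df["mark_date"])
print(df.dtypes)
```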
Data filtering and sorting: exploring the Euro 2012 data
Related data: see (github)
Step 1 - Import the pandas library
import pandas as pd
Step 2 - Data set
path2 = "./data/euro2012.csv"  # Euro2012.csv
Step 3 - Name the dataset euro12
euro12 = pd.read_csv(path2)
euro12.tail()
Output:
Columns: Team, Goals, Shots on target, Shots off target, Shooting accuracy, % Goals-to-shots, Total shots
I would still like to recommend the Python development learning group I set up: 483546416. The group is about Python development; if you are learning Python, you are welcome to join. Everyone in it is a software developer, and we share practical material from time to time (only related to Python software development), including a set of the latest 2018 advanced Python materials and high-level development tutorials that I compiled myself. Beginners and anyone who wants to dig deeper into Python are welcome. An
pandas has four main time-related types: Timestamp, Period, DatetimeIndex, and PeriodIndex.
import pandas as pd
import numpy as np
## Timestamp
pd.Timestamp('9/1/2016 10:05am')  # output: Timestamp('2016-09-01 10:05:00')
## Period
pd.Period('1/2016')  # output: Period('2016-01', 'M')
pd.Period('3/5/2016')  # output: Period('2016-03-05', 'D')
## DatetimeIndex
t1 = pd.Series(list('ABC'), [pd.Timestamp('2016-09-01'), pd.Timestamp('2016-09-02'), pd.Time
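The relationship between the four types can be sketched as follows: a Series indexed by Timestamp values gets a DatetimeIndex, and one indexed by Period values gets a PeriodIndex. The variable names and dates here are illustrative only.

```python
import pandas as pd

ts = pd.Timestamp("2016-09-01 10:05")    # a single point in time
month = pd.Period("2016-01", freq="M")   # a whole month
day = pd.Period("2016-03-05", freq="D")  # a whole day

# Indexing by timestamps produces a DatetimeIndex ...
s_ts = pd.Series(list("abc"), index=pd.to_datetime(["2016-09-01", "2016-09-02", "2016-09-03"]))
# ... and indexing by periods produces a PeriodIndex.
s_p = pd.Series(list("def"), index=pd.period_range("2016-09", periods=3, freq="M"))

print(type(s_ts.index).__name__)
print(type(s_p.index).__name__)
print(month.start_time, month.end_time)
```

A Period covers a span, so it exposes start_time and end_time, whereas a Timestamp is a single instant.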
Delete one or more columns of a pandas DataFrame:
Method one: directly use del df['column-name']
Method two: use the drop method; there are three equivalent forms:
1. df = df.drop('column_name', axis=1)
2. df.drop('column_name', axis=1, inplace=True)
3. df.drop(df.columns[[0, 1, 3]], axis=1, inplace=True)  # Note: zero indexed
Note: methods that would modify the original array and return a new one usually take an optional inplace parameter. If set to
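The difference between the returning form and the inplace form can be checked on a toy frame. This is a sketch with made-up column names; it uses the columns= keyword, which is equivalent to passing axis=1.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]})

# Without inplace, drop() returns a new DataFrame and leaves df untouched.
without_a = df.drop(columns=["a"])

# Dropping by position: look the labels up through df.columns first.
without_first_and_last = df.drop(columns=df.columns[[0, 3]])

# With inplace=True, df itself is modified and the call returns None.
result = df.drop(columns=["d"], inplace=True)
print(df.columns.tolist())
```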
Today I wanted to deduplicate rows in pandas, and it took me a long time to find the relevant functions.
First, look at a small example:
from pandas import Series, DataFrame
data = DataFrame({'K': [1, 1, 2, 2]})
print(data)
isduplicated = data.duplicated()
print(isduplicated)
print(type(isduplicated))
data = data.drop_duplicates()
print(data)
The results of the execution are:
K
0
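Beyond the default behaviour, duplicated and drop_duplicates can be restricted to certain columns and told which occurrence to keep. A small sketch; the second column 'v' is added here purely for illustration.

```python
import pandas as pd

data = pd.DataFrame({"k": [1, 1, 2, 2], "v": [10, 11, 20, 20]})

# duplicated() flags rows that repeat an earlier row (all columns compared).
mask = data.duplicated()

# drop_duplicates() keeps the first occurrence by default.
deduped = data.drop_duplicates()

# subset= restricts the comparison to some columns; keep="last" keeps
# the final occurrence instead of the first.
last_per_k = data.drop_duplicates(subset=["k"], keep="last")
print(last_per_k)
```

The original row labels are preserved after dropping; call reset_index(drop=True) if a fresh 0..n-1 index is wanted.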
import numpy as np
from pandas import DataFrame
import pandas as pd
df = DataFrame(np.arange(12).reshape(3, 4), index=['one', 'two', 'thr'], columns=list('abcd'))
df['a']  # select column a
df[['a', 'b']]  # select columns a and b
# ix can index by position, and also by index and column labels (note: ix was removed in pandas 1.0)
df.ix[0]  # select row 0
df.ix[0:1]  # select row 0
df.ix['one':'two']  # select rows one and two
df.ix[0:2, 0]  # select rows 0 and 1, column 0
df.ix[0:1, 'a']  # select row 0,
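On current pandas versions the same selections are written with iloc (position-based) and loc (label-based), since ix is no longer available. A sketch of rough equivalents, assuming the same 3x4 frame as above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4),
                  index=["one", "two", "thr"], columns=list("abcd"))

row0 = df.iloc[0]                    # first row by position
first_two = df.iloc[0:2]             # positions 0 and 1 (end excluded)
by_label = df.loc["one":"two"]       # label slices include the end label
col0 = df.iloc[0:2, 0]               # rows 0 and 1, first column
mixed = df.loc[df.index[0:1], "a"]   # positional rows, labelled column
print(by_label)
```

The main trap is the slice end: iloc excludes it, loc includes it.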
I believe many people, like me, were greatly confused by data selection and modification in pandas while learning Python (perhaps under the influence of Matlab) ...
Today I finally figured it out completely ...
Let's start by creating a data frame manually.
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(0, 60, 2).reshape(10, 3), columns=list('abc'))
df looks like this
So what are the three
"Python for Data analysis" sort sort_index ()
To sort by row or column index:
In [1]: import pandas as pd
In [2]: from pandas import DataFrame, Series
In [3]: obj = Series(range(4), index=['d', 'a', 'b', 'c'])
In [4]: obj
Out[4]:
d    0
a    1
b    2
c    3
dtype: int64
In [5]: obj.sort_index()
Out[5]:
a    1
b    2
c    3
d    0
dtype: int64
In [6]: import numpy as np
In [8]: frame = DataFram
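For a DataFrame, sort_index can sort along either axis. The excerpt is cut off before the frame is defined, so the frame below is a hypothetical stand-in used only to illustrate the axis and ascending arguments.

```python
import numpy as np
import pandas as pd

obj = pd.Series(range(4), index=["d", "a", "b", "c"])
# Hypothetical frame: the original definition is truncated in the text above.
frame = pd.DataFrame(np.arange(8).reshape(2, 4),
                     index=["three", "one"], columns=["d", "a", "b", "c"])

sorted_obj = obj.sort_index()          # Series sorted by its index labels
by_rows = frame.sort_index()           # rows sorted by row labels
by_cols = frame.sort_index(axis=1)     # columns sorted by column labels
desc_cols = frame.sort_index(axis=1, ascending=False)
print(by_cols)
```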
Meituan shop review language processing and classification (NLP)
Part one: data analysis
Part two: visualization
This article is the third in the series: text classification
The main packages used are jieba, sklearn, and pandas. This post mainly uses the bag-of-words model, which represents text as numerical feature vectors (one feature vector is constructed per document, many of its entries are 0, the value ap
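The bag-of-words idea itself can be sketched in plain Python, without the sklearn vectorizer the post relies on. The token lists below are invented; with jieba they would typically come from a call such as jieba.lcut on each review.

```python
from collections import Counter

# Hypothetical, already-tokenised reviews (not data from the article).
docs = [["good", "food", "good", "service"], ["slow", "service"]]

# The vocabulary is the sorted set of all tokens seen in any document.
vocab = sorted({tok for doc in docs for tok in doc})

def to_vector(tokens):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]

vectors = [to_vector(doc) for doc in docs]
print(vocab)
print(vectors)
```

Each document becomes one vector with as many entries as there are vocabulary words, which is why most entries are 0 for a realistic vocabulary.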
Function prototype: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html#pandas.DataFrame.fillna
pad/ffill: fills the missing value with the previous non-missing value
backfill/bfill: fills the missing value with the next non-missing value
None: specify a value to replace the missing value
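The three filling strategies can be compared on a short Series. This sketch uses the dedicated ffill() and bfill() methods rather than fillna(method=...), because the method argument of fillna is deprecated in recent pandas releases; the sample values are made up.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, np.nan, 4.0])

forward = s.ffill()      # carry the previous non-missing value forward
backward = s.bfill()     # pull the next non-missing value backward
constant = s.fillna(0)   # replace missing values with a fixed value
print(pd.DataFrame({"ffill": forward, "bfill": backward, "fillna(0)": constant}))
```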
Original: Chapter 7
# usual opening
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# make the graphs a bit bigger and prettier
pd.set_option('display.mpl_style', 'default')
plt.rcParams['figure.figsize'] = (15, 5)
plt.rcParams['font.family'] = 'sans-serif'
# need to show a lot of columns in pandas 0.12
# in pandas
Abstract: This article mainly covers how to split strings in pandas. Let's consider the following scenario.
This is our dataset (data). You can see that one column (name) holds the industry categories, with the industries separated by the symbol '|'. We want to extract each piece delimited by '|'. pandas has a one-step way to do this, which is very convenient.
Import
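One way to do such a split is the str.split accessor. The sketch below uses invented data in a column called 'name', as in the description; whether the article uses exactly this call is an assumption, since the excerpt is cut off.

```python
import pandas as pd

# Hypothetical data: a 'name' column holding '|'-separated industries.
data = pd.DataFrame({"name": ["bank|insurance", "software|hardware|cloud", "retail"]})

# expand=True spreads the pieces over separate columns; rows with fewer
# pieces are padded with missing values.
parts = data["name"].str.split("|", expand=True)

# Alternatively, keep the pieces as lists and give each piece its own row.
long_form = data["name"].str.split("|").explode()
print(parts)
print(long_form)
```

The expanded form suits fixed-width categories, while the exploded form is easier to group and count.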
1. In a pandas DataFrame, we often need to select the rows that meet a condition on some attribute; the isin method is particularly effective for this.
import pandas as pd
df = pd.DataFrame([[1, 2, 3], [1, 3, 4], [2, 4, 3]], index=['one', 'two', 'three'], columns=['A', 'B', 'C'])
print(df)
#        A  B  C
# one    1  2  3
# two    1  3  4
# three  2  4  3
Let's say we choose a row w
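A short sketch of isin on the same frame, including negation and combining conditions; the particular values tested are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [1, 3, 4], [2, 4, 3]],
                  index=["one", "two", "three"], columns=["A", "B", "C"])

in_set = df[df["B"].isin([2, 4])]        # rows whose B is 2 or 4
not_in_set = df[~df["B"].isin([2, 4])]   # the complement, via ~
both = df[df["A"].isin([1]) & df["C"].isin([3])]  # conditions combined with &
print(in_set)
```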
Using Python for data analysis: pandas basics 2
The reindex method of Series
In [15]: obj = Series([3,2,5,7,6,9,0,1,4,8], index=['a','b','c','d','e','f','g','h','i','j'])
In [16]: obj1 = obj.reindex(['a','b','c','d','e','f','g','h','i','j','k'])
In [17]: obj1
Out[17]:
a    3.0
b    2.0
c    5.0
d    7.0
e    6.0
f    9.0
g    0.0
h    1.0
i    4.0
j    8.0
k    NaN
dtype: float64
If the current value of the new index is missing, interpolatio
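How reindex treats labels that are missing from the original index can be sketched with a shorter Series. The fill_value and method="ffill" arguments shown are standard reindex options; the data is illustrative.

```python
import pandas as pd

obj = pd.Series([3, 2, 5], index=["a", "b", "c"])

# 'd' does not exist in obj, so it becomes NaN (and the dtype becomes float).
with_nan = obj.reindex(["a", "b", "c", "d"])

# fill_value supplies a default for new labels instead of NaN.
with_zero = obj.reindex(["a", "b", "c", "d"], fill_value=0)

# For an ordered index, method="ffill" fills new labels from the previous one.
steps = pd.Series(["blue", "red"], index=[0, 2]).reindex(range(4), method="ffill")
print(with_nan)
print(steps)
```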