pandas offers a great many operations for processing time attributes. You can refer to the following links:
Pandas
This article explains one of the methods that people may be less familiar with. I will upload the rest later.
The application scenario is this: given a dataset that contains each user's account registration time (year-month-day), as sh
For ordered data such as time series, it may be necessary to do some interpolation when reindexing. The method option can achieve this purpose:
Method Parameter Introduction
Parameters
Description
Ff
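A minimal sketch of filling gaps while reindexing, assuming a small hypothetical series (the values and dates below are made up for illustration). It uses the `method` option of `reindex` with `ffill` and `bfill`:

```python
import pandas as pd

# Hypothetical series observed on three non-consecutive dates.
obs = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(["2017-01-01", "2017-01-03", "2017-01-05"]),
)

# Target index: every calendar day in the range.
full_range = pd.date_range("2017-01-01", "2017-01-05", freq="D")

no_fill = obs.reindex(full_range)                   # missing days become NaN
forward = obs.reindex(full_range, method="ffill")   # carry the last value forward
backward = obs.reindex(full_range, method="bfill")  # take the next available value

print(forward)
```

The `method` option only works when the original index is sorted, which is normally the case for time series.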
For example, we have a DataFrame like this:
                   SPY        AAPL         IBM        GOOG         GLD
2017-01-03  222.073914  114.311760  160.947433  786.140015  110.470001
2017-01-04  223.395081  114.183815  162.940125  786.900024  110.860001
2017-01-05  223.217606  114.764473  162.401047  794.020020  112.580002
2017-01-06  224.016220  116.043915  163.200043  806.150024  111.750000
2017-01-09  223.276779  117.106812  161.390244  806.650024  112.669998
...
Now we only want to get highli
values appear
df.boxplot(column='label 1', by='label 2')
plt.show()
The data under label 1 is then plotted as a numerical distribution grouped by label 2.
As indicated below, the data has been grouped by level of education, and conclusions such as the wage extremes at higher education levels can be drawn.
Note: When plotting, if entering the drawing instruction on its own does not display the figure, you need to enter plt.show() on another li
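A minimal sketch of a grouped box plot, assuming a hypothetical table with an `education` column and a `salary` column (both names and all values are invented for illustration). It uses a non-interactive backend so it runs without a display; in an interactive session you would call `plt.show()` instead of closing the figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: salary grouped by education level.
df = pd.DataFrame({
    "education": ["high school", "high school", "bachelor",
                  "bachelor", "master", "master"],
    "salary": [30, 35, 50, 60, 80, 95],
})

# One box per education level, showing the distribution of salary.
df.boxplot(column="salary", by="education")
plt.close("all")  # replace with plt.show() when working interactively

# The same grouping expressed numerically: the median of each box.
medians = df.groupby("education")["salary"].median()
print(medians)
```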
Problem description: Run the following program to generate the simulated hotel turnover data file Data.csv in the current folder, then complete the following tasks:
1) Use pandas to read the data in the file Data.csv, create a DataFrame object, and delete all of the missing values;
2) Use Matplotlib to generate a line chart that reflects the daily turnover of the hotel,
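A minimal sketch of task 1, assuming the file has a `date` column and an `amount` column (both column names are hypothetical, since the generating program is not shown here). An in-memory string stands in for the CSV file so the example is self-contained:

```python
import io

import pandas as pd

# Stand-in for the CSV file; the column names and values are assumptions.
csv_text = """date,amount
2017-01-01,1200
2017-01-02,
2017-01-03,1350
2017-01-04,980
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Delete every row that contains a missing value.
clean = df.dropna()

# Daily turnover indexed by date, ready to be drawn as a line chart.
daily = clean.set_index("date")["amount"]
print(daily)
```

With a real file, `pd.read_csv` would take the file path instead of the `StringIO` object, and `daily.plot()` would draw the line chart for task 2.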
Recently a friend set up a company that runs on part-time work and needs to pay its part-time staff. Because a single part-time wage is between 40 and 60, the company only allows a withdrawal once the balance exceeds 200. This rule is the same as for DiDi drivers, who also need more than 200 before they can withdraw. Then a problem came up: so that the large number of part-time staff can clearly see the time periods in which they earn the most money, we need to
Read the contents of the table, as in the following example:
import MySQLdb
import pandas as pd

try:
    conn = MySQLdb.connect(host='127.0.0.1', user='Root', passwd='Root', db='MyDB', port=3306)
    df = pd.read_sql('select * from test;', con=conn)
    conn.close()
    print("Finish Load DB")
except MySQLdb.Error as e:
    print(e.args[1])

Write the data to the table, as in the following example:
df = pd.DataFrame([[1, 'XXX'], [2, 'yyy']], columns=list('AB'))
try:
    conn = MySQLdb.connect(host='1
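A minimal sketch of the same write-then-read round trip. It swaps in the standard library's in-memory `sqlite3` database for MySQL so that it runs without a server; that substitution is an assumption for illustration and is not what the excerpt uses. The table name `test` and the sample rows follow the excerpt.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for the MySQL server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
try:
    # Write a DataFrame to the table "test".
    df = pd.DataFrame([[1, "XXX"], [2, "yyy"]], columns=list("AB"))
    df.to_sql("test", conn, index=False, if_exists="replace")

    # Read the table back into a new DataFrame.
    loaded = pd.read_sql("select * from test;", con=conn)
finally:
    conn.close()

print(loaded)
```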
pandas offers several common ways to get column data, but there are some things to note in the syntax. Here is a summary:
import pandas as pd
data1 = pd.DataFrame(...)  # initialize any DataFrame with 3 columns
data1.columns = ['A', 'B', 'C']
1. data1['B']  # takes the 2nd column (i.e. column B)
2. data1.B  # same effect as 1, takes the 2nd column (i.e. column B); here B is the column name, bu
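A minimal sketch of these column selections on a small hypothetical DataFrame (the values are made up). It also shows selection by position and the difference between getting a Series and getting a one-column DataFrame:

```python
import pandas as pd

data1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
data1.columns = ["A", "B", "C"]

by_label = data1["B"]           # Series for column B
by_attr = data1.B               # same result; only works for valid identifier names
by_position = data1.iloc[:, 1]  # 2nd column selected by integer position
as_frame = data1[["B"]]         # list of labels returns a DataFrame, not a Series

print(by_label)
```

Attribute access is case-sensitive and fails for column names that contain spaces or clash with DataFrame methods, so bracket access is the safer default.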
Objective
pandas is built on NumPy and provides more advanced data structures and tools. Just as the core of NumPy is the ndarray, pandas is centered around two core data structures, Series and DataFrame. Series and DataFrame correspond to a one-dimensional sequence and a two-dimensional table structure, respectively. Pandas's conventi
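A minimal sketch of the two core structures, using made-up values. It shows that a Series is one-dimensional, a DataFrame is two-dimensional, and that the underlying data can be obtained as a NumPy ndarray:

```python
import numpy as np
import pandas as pd

# One-dimensional labeled sequence.
s = pd.Series([4, 7, -5, 3], index=["a", "b", "c", "d"])

# Two-dimensional table with named columns.
frame = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})

print(s.ndim, frame.ndim)   # number of dimensions of each structure
print(type(s.to_numpy()))   # the values are available as an ndarray
```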
Getting started with Python for data analysis--pandas
Built on NumPy
from pandas import Series, DataFrame
import pandas as pd
I. Two kinds of data structure
1. Series
A
The main tasks of data preprocessing are:
First, data preprocessing
1. Data cleaning
2. Data integration
3. Data Conversion
4. Data reduction
1. Data cleaning
Real-world
Ming 6.0
Name: price, dtype: float64
Zhang San 1.2
Reese 1.0
Harry 2.3
Chen Jiu 5.0
Xiao Ming 6.0
Name: price, dtype: float64
In general, we often need to select values by row label or position. DataFrame provides loc and iloc for this, but there is a difference between the two.
print(frame2)
print(frame2.loc['Harry'])  # loc can use a string-type index, whereas iloc only accepts int positions
print(frame0.iloc[2])
Out[2]:
Color Object Price
Zhang San Blue ball 1.2
Reese Green
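A minimal sketch of the difference between `loc` and `iloc`, assuming a DataFrame with only the `price` column and the index labels that appear in the output above (the other columns are left out because their values are not fully shown):

```python
import pandas as pd

frame = pd.DataFrame(
    {"price": [1.2, 1.0, 2.3, 5.0, 6.0]},
    index=["Zhang San", "Reese", "Harry", "Chen Jiu", "Xiao Ming"],
)

row_by_label = frame.loc["Harry"]  # label-based lookup
row_by_position = frame.iloc[2]    # position-based lookup (third row)

# iloc rejects a string label.
try:
    frame.iloc["Harry"]
    iloc_accepts_label = True
except TypeError:
    iloc_accepts_label = False

print(row_by_label)
```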
1.1. Pandas Analysis steps
Loading data
Count the number of accesses for each hour of access_time. The SQL is similar to the following:
SELECT date_format(access_time, '%H'), COUNT(*) FROM log GROUP BY date_format(access_time, '%H');
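A minimal sketch of the same per-hour count in pandas, assuming a hypothetical `log` DataFrame with an `access_time` column (the timestamps are made up). It groups by the two-digit hour string, mirroring `date_format(access_time, '%H')`:

```python
import pandas as pd

# Hypothetical access log with one timestamp per request.
log = pd.DataFrame({
    "access_time": pd.to_datetime([
        "2017-06-01 08:15:00",
        "2017-06-01 08:47:10",
        "2017-06-01 09:05:00",
        "2017-06-02 08:30:00",
    ])
})

# Equivalent of GROUP BY date_format(access_time, '%H') with COUNT(*).
hour = log["access_time"].dt.strftime("%H")
hourly = log.groupby(hour).size()

print(hourly)
```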
1.2. Code
cat pd_ng_log_stat.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from ng_line_parser import NgLineParser
import
Operating system: Windows
Python: 3.5
The previous section described the libraries needed for data analysis and mining, the most important of which are pandas and matplotlib.
pandas: mainly for data analysis, calculation, and statistics, such as the mean and variance.
matplotlib: mainly used in combination
import os
import pandas as pd
import matplotlib.pyplot as plt

def test_run():
    start_date = '2017-01-01'
    end_data = '2017-12-15'
    dates = pd.date_range(start_date, end_data)
    # Create an empty data frame
    df = pd.DataFrame(index=dates)
    symbols = ['SPY', 'AAPL', 'IBM', 'GOOG', 'GLD']
    for symbol in symbols:
        temp = getadjcloseforsymbol(symbol)
        df = df.join(temp, how='inner')
    return df

def normalize_data(df):
    """ normalize stock prices using the first row of the datafr ame
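A minimal sketch of normalizing prices by the first row, which is what the truncated docstring describes. The function body is an assumption, since the original is cut off. The sample prices are taken from the SPY and AAPL columns of the table shown earlier on this page.

```python
import pandas as pd

prices = pd.DataFrame(
    {
        "SPY": [222.073914, 223.395081, 223.217606],
        "AAPL": [114.311760, 114.183815, 114.764473],
    },
    index=pd.to_datetime(["2017-01-03", "2017-01-04", "2017-01-05"]),
)

def normalize_data(df):
    """Divide every row by the first row so each column starts at 1.0."""
    return df / df.iloc[0]

normalized = normalize_data(prices)
print(normalized)
```

After normalization every symbol starts at 1.0, so the relative performance of stocks with very different price levels can be compared on one chart.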
formed by the list of recursive intercepts
right_list = merge_sort(list[mid:])
# create left and right cursors to record the index into each list
left_pointer, right_pointer = 0, 0
# create a new empty list
result = []
# loop to compare the values
# the loop exits when either cursor equals the length of its list
while left_pointer < len(left_list) and right_pointer < len(right_list):
    # compare the left and right values
    if left_list[left_pointer] < right_list[right_pointer]:
        result.
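The fragment above appears to come from a recursive merge sort. A complete minimal sketch under that assumption follows. The variable names match the fragment, but the parts that are cut off there (the base case, the append branches, and the handling of leftover elements) are filled in as one common way to write it, not as the author's original code.

```python
def merge_sort(items):
    """Return a new sorted list using recursive merge sort."""
    # Base case: an empty or single-element list is already sorted.
    if len(items) <= 1:
        return list(items)

    # Split in the middle and sort each half recursively.
    mid = len(items) // 2
    left_list = merge_sort(items[:mid])
    right_list = merge_sort(items[mid:])

    # Cursors into the two sorted halves.
    left_pointer, right_pointer = 0, 0
    result = []

    # Merge until one half is exhausted.
    while left_pointer < len(left_list) and right_pointer < len(right_list):
        if left_list[left_pointer] <= right_list[right_pointer]:
            result.append(left_list[left_pointer])
            left_pointer += 1
        else:
            result.append(right_list[right_pointer])
            right_pointer += 1

    # Append whatever remains in the other half.
    result.extend(left_list[left_pointer:])
    result.extend(right_list[right_pointer:])
    return result


print(merge_sort([5, 2, 9, 1, 5, 6]))
```

Using `<=` in the comparison keeps equal elements in their original order, which makes the sort stable.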
"Merge Sort" Here we use a recursive algorithm to keep splitting the list in two. The base case is a list with no elements or only one element, because such a sub-list is necessarily sorted. The two sorted sub-lists are then gradually merged into a new sorted list, until all the elements are sorted. This is a bottom-up process. Divides the list from the middle into two sub-lists until it