Let's create a data frame by hand.
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(0, 60, 2).reshape(10, 3), columns=list('abc'))
df looks like this.
So how do you choose among the three ways of selecting data?
First, when every column already has a name, df['a'] takes out a whole column of data. If you know the column names and the index, and both are easy to type, you can choose
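A minimal sketch of the column selection described above, using the same frame as the text. The snippet is cut off before it names the other selection methods, so the use of .loc here is an assumption about what the article goes on to describe.

```python
import numpy as np
import pandas as pd

# Same frame as in the text: 10 rows, 3 columns named a, b, c.
df = pd.DataFrame(np.arange(0, 60, 2).reshape(10, 3), columns=list("abc"))

# Bracket selection by column name returns the whole column as a Series.
col_a = df["a"]

# Assumption: .loc is one label-based alternative; it takes a row label
# and a column name together and returns a single value here.
value = df.loc[2, "b"]

print(col_a.tolist())
print(value)
```

Bracket selection is the shortest form when only a column is needed; .loc becomes useful once a row label is part of the lookup.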
I believe many people, like me, found selecting and modifying data in pandas very confusing while learning Python (perhaps because of habits carried over from MATLAB) ...
Only today did I finally figure it out completely ...
Let's start by creating a data frame by hand.
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(0, 60, 2).reshape(10, 3), columns=list('abc'))
df looks like this.
So what are the three
browsing data. The default value is 5.
df.sample(n): randomly browses n rows of data. The default is 1 row.
df.shape: the number of rows and columns, as a tuple.
df.describe(): computes summary statistics that describe the distribution of the data.
df.info(): memory usage and data types.
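The inspection calls listed above can be tried on a small frame, as in this rough sketch. The frame itself is made up for illustration, and df.head() is an assumption about the truncated first item in the list (the one whose default is 5).

```python
import numpy as np
import pandas as pd

# Illustrative frame: 10 rows, 3 columns.
df = pd.DataFrame(np.arange(0, 60, 2).reshape(10, 3), columns=list("abc"))

first_rows = df.head()          # first 5 rows by default
random_rows = df.sample(n=3)    # 3 randomly chosen rows
n_rows, n_cols = df.shape       # (rows, columns) tuple
stats = df.describe()           # count, mean, std, min, quartiles, max per column

print(first_rows)
print(random_rows)
print(n_rows, n_cols)
print(stats)
df.info()                       # prints dtypes and memory usage
```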
3. It is easy to add columns to a DataFrame. The following describes several methods.
Simple Method
Directly add a new column and assign a value
df['new_column'] = 1
Calculation Method
df['temp_diff'] = df['tem
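A short sketch of both methods on made-up data. The original calculation is cut off, so the columns high and low below are hypothetical names chosen only to give the subtraction something to work on.

```python
import pandas as pd

# Hypothetical temperature readings; column names are illustrative only.
df = pd.DataFrame({"high": [30, 28, 25], "low": [21, 20, 18]})

# Simple method: a scalar is broadcast to every row of the new column.
df["new_column"] = 1

# Calculation method: derive a new column from existing columns.
df["temp_diff"] = df["high"] - df["low"]

print(df)
```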
section "Getting Started with Data Structures (Intro to Data Structures)". Open this page next to your Jupyter notebook. As you read the documentation, type out (rather than copy) the code and execute it in the notebook. As you execute the code, explore these operations and try new ways of using them. Then move on to the section "Indexing and Selecting Data". Create a new Jupyter notebook, write and execute the code, and then explore the different operations you learned. T
The pandas Series is more powerful than the NumPy array in many ways.
First, the pandas Series has some extra methods. For example, the describe method returns summary statistics for a Series:
import pandas as pd
s = pd.Series([1, 2, 3, 4])
d = s.describe()
print(d)
count    4.000000
mean     2.500000
std      1.290994
min      1.000000
25%      1.750000
50%      2.500000
75%      3.250000
max      4.000000
dtype: float64
Second, the bigges
[26]:
Beijing     80000.0
Hangzhou    60000.0
Nanjing         NaN
Shanghai    70000.0
Suzhou          NaN
The index of a Series can be modified in place through assignment.
In [27]: obj.index = ['Bob', 'Steve', 'Jeff', 'Ryan']
In [28]: obj
Out[28]:
Bob      4
Steve    7
Jeff    -5
Ryan     3
DataFrame
Pandas reads files
In [29]: df = pd.read_table('pandas_test.txt', sep=' ', names=['name', 'age'])
In [30]: df
Out[30]:
    name  age
0    Bob   26
1   Loya   22
2  Denny   20
3   Mars   25
DataFrame column selection
df[name]
In [3
False
Hangzhou    False
Shanghai    False
Suzhou       True
An important feature of Series is that arithmetic between Series with different indexes automatically aligns the data by index label.
In [24]: obj3
Out[24]:
Beijing     40000
Hangzhou    30000
Nanjing     26000
Shanghai    35000
In [25]: obj4
Out[25]:
Beijing     40000.0
Hangzhou    30000.0
Shanghai    35000.0
Suzhou          NaN
In [26]: obj3 + obj4
Out[26]:
Beijing     80000.0
Hangzhou    60000.0
Nanjing         NaN
Shanghai    70000.0
Suzhou          NaN
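The alignment behaviour shown in the session above can be reproduced with a self-contained sketch. How obj3 and obj4 were built is not visible in the snippet, so constructing obj4 with reindex is an assumption made only to get the same values.

```python
import pandas as pd

obj3 = pd.Series(
    {"Beijing": 40000, "Hangzhou": 30000, "Nanjing": 26000, "Shanghai": 35000}
)

# Assumption: obj4 is rebuilt here via reindex; it lacks Nanjing and has
# Suzhou with no value, matching the output shown in the text.
obj4 = obj3.reindex(["Beijing", "Hangzhou", "Shanghai", "Suzhou"])

# Addition matches values by label, not by position. A label that is
# missing or NaN on either side produces NaN in the result.
total = obj3 + obj4
print(total)
```

Labels present in only one operand still appear in the result, which is why both Nanjing and Suzhou show up with NaN.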
) new_titanic_survival = titanic_survival.dropna(subset=['age', 'body', 'home.dest'])
Multi-row index
This is the original titanic_survival. After I delete the rows where the body column is NaN, the data becomes the following:
new_titanic_survival = titanic_survival.dropna(subset=["body"])
As you can see, in the new_titanic_survival table each row keeps the index label it had before; the index is not renumbered from 0. In the previous article, Pandas (I), you can know t
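The index behaviour described above can be checked on a tiny stand-in table. The data below is invented and only mimics two of the columns; reset_index(drop=True) is shown as one way to renumber from 0 and is not taken from the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for titanic_survival with made-up values.
titanic_survival = pd.DataFrame(
    {
        "age": [29.0, np.nan, 30.0, 25.0],
        "body": [np.nan, 135.0, np.nan, 22.0],
    }
)

# Drop rows whose body value is NaN; surviving rows keep their labels.
new_titanic_survival = titanic_survival.dropna(subset=["body"])
print(new_titanic_survival.index.tolist())

# Renumber from 0 and discard the old labels.
renumbered = new_titanic_survival.reset_index(drop=True)
print(renumbered.index.tolist())
```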
Pandas
Pandas is a popular open source Python project whose name comes from "panel data" and "Python data analysis".
Pandas has two important data structures: DataFrame and Series.
The DataFrame data structure of pandas
The pandas DataFrame is a labeled two-dimensional object that is very similar to an Excel spreadsheet or a relational database table.
You can create a DataFrame in the following ways:
1. Create a DataFrame from another DataFrame
2. Generate Da
Summary of methods for iterating over pandas data in Python
Preface
Pandas is a Python data analysis package that provides a large number of functions and methods for fast and convenient data processing. Pandas defines two data types, Series and DataFrame, which make data operations easier. Series is a one-di
This article introduces how to exclude specific rows from a pandas DataFrame in Python. It gives detailed example code that should be a useful reference for study and work; readers who need it can follow along below. When you use Python for data analysis, one of the most frequently used structures is the pandas DataFrame, about
-04-14 4 5
2013-04-15 1 2 18
2013-04-17 9 1
2013-04-18 7 17
Update: unless you have a special requirement, it is highly recommended to use loc and to keep the use of [] to a minimum. loc avoids chained-indexing problems when assigning to a DataFrame, whereas with chained [] pandas is likely to raise a SettingWithCopyWarning.
See the official documentation for details: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
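As a rough illustration of the recommendation above, the sketch below assigns through a single .loc call on a made-up frame. The column names and the condition are hypothetical.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# A single .loc call selects the rows and the column together, so the
# assignment is applied to df itself.
df.loc[df["a"] > 2, "b"] = 0
print(df)

# The chained form df[df["a"] > 2]["b"] = 0 performs two separate indexing
# steps, so the assignment may be applied to a temporary copy instead.
```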
Pandas basics
Pandas is a data analysis package built on NumPy that contains more advanced data structures and tools.
Just as NumPy is built around ndarray, pandas is centered on two core data structures, Series and DataFrame. Series and DataFrame correspond to one-dimensional sequences and
Complete guide to using pandas in Python
1. Generate a data table
1. Import the pandas library first. The numpy library is usually needed as well, so import both up front:
import numpy as np
import pandas as pd
2. Import CSV or xlsx files:
df = pd.read_csv('name.csv', header=1)
df = pd.read_excel('name.xlsx')
3. Create a da
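To make the effect of header=1 in the CSV import step concrete, here is a small sketch that reads from an in-memory string instead of name.csv. The file contents are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical contents standing in for name.csv: line 0 is a title line,
# line 1 holds the column names, and the data starts on line 2.
csv_text = "exported report\nname,age\nBob,26\nLoya,22\n"

# header=1 uses the second line (0-based index 1) as the column names and
# discards the line before it.
df = pd.read_csv(io.StringIO(csv_text), header=1)
print(df)
```

If the column names are on the very first line of the file, the default header=0 is the right choice instead.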
This article shows how to select rows and columns and how to slice with a pandas DataFrame, along with points to watch out for when slicing. A practical case follows; take a look.
SELECT in SQL chooses data by column name. Pandas is more flexible: not only can data be selected according to
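A brief sketch of row and column selection with slicing. The snippet is truncated before it names specific methods, so the use of .loc and .iloc below is an assumption, and the frame is made up.

```python
import numpy as np
import pandas as pd

# Illustrative frame with string row labels.
df = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    columns=["a", "b", "c"],
    index=["w", "x", "y", "z"],
)

cols = df[["a", "c"]]             # several columns by name
by_label = df.loc["x":"y", "b"]   # label slice: both endpoints included
by_pos = df.iloc[1:3, 0:2]        # position slice: end is excluded

print(cols)
print(by_label)
print(by_pos)
```

Note the asymmetry: label slices include the end label, while position slices follow the usual Python rule of excluding the end.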
Teach you how to use pandas pivot tables to process data (with learning materials)
Source: bole online-PyPer
This article explains the pandas pivot_table function and shows how to use it for data analysis.
Introduction
Most people may have experience using pivot tables in Excel. In fact, Pandas
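For readers who know Excel pivot tables, a minimal pivot_table sketch may help. The sales records and column names below are hypothetical and are not taken from the original article.

```python
import pandas as pd

# Hypothetical sales records for illustration.
sales = pd.DataFrame(
    {
        "region": ["East", "East", "West", "West", "West"],
        "product": ["A", "B", "A", "A", "B"],
        "amount": [100, 150, 200, 50, 300],
    }
)

# Rows are regions, columns are products, and each cell is the summed amount.
table = pd.pivot_table(
    sales, index="region", columns="product", values="amount", aggfunc="sum"
)
print(table)
```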
Pandas Quick Start (3)
This section introduces pandas data structures. Source for this article: https://www.dataquest.io/mission/146/pandas-internals-series
The data used in this article comes from: https://github.com/fivethirtyeight/data/tree/master/fandango
This data mainly describes
[Data cleansing] Clean "dirty" data in pandas (3)
Preview Data
This time we use Artworks.csv and select 100 rows of data for the walkthrough. Procedure:
DataFrame is the built-in tabular data structure of pandas, and it displays quickly. With a DataFrame, we can quickly preview and analyze data. The code is as follows:
import pandas
Pandas data analysis (data structures)
This article covers the two main pandas data structures, Series and DataFrame (which correspond to one-dimensional and two-dimensional arrays in NumPy).
1. First, we will introduce how to create a Series.
1) A Series can be created from an array.
For example
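The original example is cut off here, so the following is only a sketch of creating a Series from an array, with invented values and labels.

```python
import numpy as np
import pandas as pd

arr = np.array([4, 7, -5, 3])

# Without an explicit index, labels default to 0, 1, 2, ...
s = pd.Series(arr)

# An index of the same length attaches custom labels to the values.
labeled = pd.Series(arr, index=["a", "b", "c", "d"])

print(s)
print(labeled)
```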
Data analysis and presentation: pandas data feature analysis
Sorting data in pandas data feature analysis
Basic statistics (including sorting), distribution and cumulative statistics, and data features (correlation, periodicity, and so on) can be obtained through summarization (a lossy process of extracting data features) and data mining (forming knowledge).
The .sort_index() method so
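Assuming the truncated sentence goes on to describe sorting by index labels, a small sketch of .sort_index() on an invented Series looks like this.

```python
import pandas as pd

# Illustrative Series with unordered labels.
s = pd.Series([3, 1, 2], index=["c", "a", "b"])

asc = s.sort_index()                   # labels in ascending order
desc = s.sort_index(ascending=False)   # labels in descending order

print(asc)
print(desc)
```

Both calls return a new Series and leave s unchanged.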