Python functions
(1) Another way to define a data frame is to pass the data content (a nested list or multidimensional array) directly as data, and then define columns and index. (DataFrame.columns holds the column names and DataFrame.index holds the row labels. Both behave much like a tuple, so individual entries can be retrieved with [0], [1], ...)
import pandas as pd
df = pd.DataFrame(data=[[34, 'null', 'Mark'], [22, 'null', 'Mark'], [34, 'null', 'Mark']], columns=['id', 'temp', 'name'], index=[1, 2, 3])
print(df)
Results:
   id  temp  name
1  34  null  Mark
2  22  null  Mark
3  34  null  Mark
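The bracket access on columns and index mentioned above can be sketched as follows. This is a minimal example that rebuilds the data frame from this section (assuming the id values 34, 22, 34 that appear in the later results); the names first_column and second_label are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[34, 'null', 'Mark'], [22, 'null', 'Mark'], [34, 'null', 'Mark']],
    columns=['id', 'temp', 'name'],
    index=[1, 2, 3],
)

# columns and index are pandas Index objects that support positional access
first_column = df.columns[0]   # first column name
second_label = df.index[1]     # second row label

print(first_column, second_label)
print(list(df.columns))        # plain Python list of the column names
```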
(2) Operate on each column of the data frame
for c in df.columns:
    print('--%s--' % c)
    print(df[c].value_counts())
Results:
--id--
34    2
22    1
dtype: int64
--temp--
null    3
dtype: int64
--name--
Mark    3
dtype: int64
The same counts can be produced more concisely with apply and a lambda function, without writing a for loop:
df1 = df.apply(lambda x: x.value_counts()).T.stack()
print(df1)
Note: apply on an entire data frame operates on each column; T is the transpose, and stack (in the pandas version used here) drops the NaN entries. If the function passed to apply does not count the values, it operates on each value in each column instead.
Results:
id    22      1
      34      2
temp  null    3
name  Mark    3
dtype: float64
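The difference between a counting function and a per-value function inside apply can be sketched like this. Two assumptions go beyond the text: the frame is first converted with astype(str) so that all counted values are strings, and dropna() is called explicitly because whether stack drops NaN on its own depends on the pandas version.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[34, 'null', 'Mark'], [22, 'null', 'Mark'], [34, 'null', 'Mark']],
    columns=['id', 'temp', 'name'],
    index=[1, 2, 3],
)

# Counting function: apply returns one value_counts result per column
counts = df.astype(str).apply(lambda col: col.value_counts())

# Transpose, reshape to a long Series, and drop the empty (NaN) combinations
counts_long = counts.T.stack().dropna()
print(counts_long)

# Per-value function: the lambda is applied to the values of the column
plus_one = df[['id']].apply(lambda col: col + 1)
print(plus_one)
```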
You can also select a single column and operate on each value in that column.
df2 = df['id'].apply(lambda x: x + 1)
print(df2)
Results:
1 35
2 23
3 35
Name: id, dtype: int64
(3) Conversions between strings, numbers, and characters
Integer string to the corresponding integer: int('12')
Decimal string to the corresponding decimal: float('12.34')
Number to string: str(123.45)
ASCII code to the corresponding character: chr(97)
Character to the corresponding ASCII code: ord('a')
Note: the ASCII code is an integer.
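The conversions listed above can be tried together in a short sketch. The variable names are illustrative, and the final step (going through float() before int()) is an addition not covered in the notes.

```python
# String to number
n = int('12')
x = float('12.34')

# Number to string
s = str(123.45)

# Character <-> integer code
ch = chr(97)
code = ord('a')

# int() rejects a decimal string, so convert through float() first
try:
    int('12.34')
except ValueError as exc:
    print('int failed:', exc)
truncated = int(float('12.34'))

print(n, x, s, ch, code, truncated)
```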
(4) DataFrame type conversion
df.astype(int)
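Converting a whole frame with astype(int) only works when every column can be converted, which would not hold for a frame that contains text columns such as temp and name. The sketch below therefore uses a hypothetical frame nums made up of numeric strings, and also shows the dictionary form of astype for converting a single column.

```python
import pandas as pd

# Hypothetical frame whose values are all numeric strings
nums = pd.DataFrame({'a': ['1', '2'], 'b': ['3', '4']})

as_int = nums.astype(int)            # convert every column to integers
mixed = nums.astype({'a': float})    # convert only column 'a'

print(as_int.dtypes)
print(mixed.dtypes)
```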
(5) There are two ways to select a single column from the DataFrame:
df.id
df['id']
Both return the same result:
1    34
2    22
3    34
Name: id, dtype: int64
1    34
2    22
3    34
Name: id, dtype: int64
(6) is determines whether two names refer to the same object; a in b determines whether a is an element of b (both are built in and can be used directly, without importing a package)
a = [1, 2, 3]
b = [1, 2, 3]
c = a
print(a is b)  # Note: is compares memory addresses; it does not test whether the values are equal
print(a is c)
print(1 in a)
Results:
False
True
True
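To make the difference between identity and equality concrete, here is a small sketch that contrasts is with == and shows why sharing an object matters. The append step is an addition for illustration only.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

print(a == b, a is b)   # equal contents, but two separate objects
print(a is c)           # c is just another name for the same list

c.append(4)             # a change made through c is visible through a
print(a)
print(b)                # b is unaffected
print(4 in a, 5 in a)
```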
isin checks, element by element, whether each value in the data frame is in the given list; the result is True where it is and False otherwise.
df:
   id  temp  name
1  34  null  Mark
2  22  null  Mark
3  34  null  Mark
print(df.id.isin([3, 34]))
Results:
1 True
2 False
3 True
Name: id, dtype: bool
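A common use of the boolean result of isin is to filter rows. This goes one step beyond the notes above and is only a sketch; the names mask, selected, and not_selected are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[34, 'null', 'Mark'], [22, 'null', 'Mark'], [34, 'null', 'Mark']],
    columns=['id', 'temp', 'name'],
    index=[1, 2, 3],
)

mask = df['id'].isin([3, 34])   # one True/False per row
selected = df[mask]             # rows whose id is in the list
not_selected = df[~mask]        # ~ inverts the mask

print(selected)
print(not_selected)
```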
(7) To select multiple columns from the data frame by name, pass the column names together as a single list; otherwise an error is raised.
d = df[['id', 'temp']]
print(d)
Results:
   id  temp
1  34  null
2  22  null
3  34  null
(8) DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')
Returns a data frame.
Parameters:
level : int, str, tuple, or list, default None. Only remove the given levels from the index. Removes all levels by default.
drop : boolean, default False. Do not try to insert index into data frame columns. This resets the index to the default integer index.
inplace : boolean, default False. Modify the DataFrame in place (do not create a new object).
col_level : int or str, default 0. If the columns have multiple levels, determines which level the labels are inserted into. By default it is inserted into the first level.
col_fill : object, default ''. If the columns have multiple levels, determines how the other levels are named. If None then the index name is repeated.
Returns:
resetted : DataFrame
print(df.reset_index())  # Note: the index is added to the data frame as a column
Results:
   index  id  temp  name
0      1  34  null  Mark
1      2  22  null  Mark
2      3  34  null  Mark
print(df.reset_index(level=0))  # Note: the index is again added as a column. level has no visible effect here because this index has only one level; it matters only for a multi-level index.
Results:
   index  id  temp  name
0      1  34  null  Mark
1      2  22  null  Mark
2      3  34  null  Mark
print(df.reset_index(level=0, drop=True))  # Note: with drop=True the index is not added as a column
Results:
   id  temp  name
0  34  null  Mark
1  22  null  Mark
2  34  null  Mark
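The level parameter only makes a visible difference when the index has more than one level. The following sketch uses a hypothetical two-level index (the names group and num, and the frame multi, are invented for illustration) to show what level selects.

```python
import pandas as pd

# Hypothetical two-level index, purely for illustration
idx = pd.MultiIndex.from_tuples(
    [('a', 1), ('a', 2), ('b', 1)], names=['group', 'num']
)
multi = pd.DataFrame({'value': [10, 20, 30]}, index=idx)

only_group = multi.reset_index(level=0)     # move 'group' to a column, keep 'num' as the index
only_num = multi.reset_index(level='num')   # levels can also be given by name
both = multi.reset_index()                  # default: move every level to a column

print(only_group)
print(only_num)
print(both)
```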
(9) Some functions can be applied to a whole data frame, while others only work on individual columns
df1 = pd.DataFrame(data=[[2.987, 4, 6, 6], [5, 5, 7, 8], [7, 8, 3, 23], [6, 3, 66, 44], [32, 5, 6, 2]], columns=['x1', 'x2', 'x3', 'y'])
print(df1)
Results:
       x1  x2  x3   y
0   2.987   4   6   6
1   5.000   5   7   8
2   7.000   8   3  23
3   6.000   3  66  44
4  32.000   5   6   2
print(df1.x1.round(2))
Results:
0 2.99
1 5.00
2 7.00
3 6.00
4 32.00
Note: df1.round(2) raises an error (in the pandas version used here).
print(df1.sum())  # Note: sum must be followed by parentheses (); without them no error is raised, but what comes back is the method itself, not the sum of each column.
Results:
x1    52.987
x2    25.000
x3    88.000
y     83.000
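The effect of leaving out the parentheses can be checked directly, as in the sketch below. It also calls round on the whole frame, on the assumption that a recent pandas release is installed (where DataFrame.round exists); the names method, totals, and rounded are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    data=[[2.987, 4, 6, 6], [5, 5, 7, 8], [7, 8, 3, 23], [6, 3, 66, 44], [32, 5, 6, 2]],
    columns=['x1', 'x2', 'x3', 'y'],
)

method = df1.sum      # no parentheses: a reference to the method, nothing is computed
totals = df1.sum()    # with parentheses: a Series holding one sum per column

print(callable(method))
print(totals)

# Assumes a recent pandas version that provides DataFrame.round
rounded = df1.round(2)
print(rounded)
```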
(10) time.sleep(t) suspends execution of the current thread for t seconds, where t is the number of seconds to delay. It has no return value.
import time
from datetime import datetime
print('start:%s' % datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
print('start:%s' % time.ctime())
time.sleep(5)
print('end:%s' % datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
print('end:%s' % time.ctime())
Results:
start:2016-04-13 17:46:47
start:Wed Apr 13 17:46:47 2016
end:2016-04-13 17:46:52
end:Wed Apr 13 17:46:52 2016
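The delay and the missing return value can also be verified by measuring the elapsed time. This is a small sketch using time.perf_counter, which is not part of the notes above; a short delay of 0.2 seconds is assumed so that the example runs quickly.

```python
import time

start = time.perf_counter()
result = time.sleep(0.2)               # blocks the current thread for about 0.2 seconds
elapsed = time.perf_counter() - start

print(result)                          # sleep has no return value, so this prints None
print('elapsed: %.2f s' % elapsed)
```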
How to change a file name extension:
1. Open My Computer and look for the menu bar. If it is not visible, press the Alt key to show it. Select Tools → Folder Options to open the Folder Options dialog box.
2. Click the View tab and scroll down to the bottom of the list.
3. Find the option "Hide extensions for known file types" and uncheck it. Click OK; the file's extension can now be changed.
Python Learning 2016.4.13