) = 0,
$2X^{T}Xw - 2X^{T}y = 0$
$X^{T}Xw = X^{T}y$
If $X^{T}X$ has full rank, it is invertible, so both sides can be left-multiplied by $(X^{T}X)^{-1}$.
Therefore:
$w = (X^{T}X)^{-1}X^{T}y$, that is, the preceding result.
The following is our Python code:
# -*- coding: UTF-8 -*-
"""
Created on Tue Oct 10 23:10:00 2017
Version: python3.5.1
@author: Stone
"""
import pandas as pd
from numpy.linalg import inv
from numpy import dot

# Normal equation method
# Fitting linear model: Sepal.length
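As a minimal sketch of the normal equation $X^{T}Xw = X^{T}y$, the snippet below solves for w on made-up data. The function name `normal_equation`, the synthetic `X`, `y`, and `true_w` are assumptions for illustration and are not taken from the original article. It uses `np.linalg.solve` on the linear system instead of the explicit `inv` call imported above, which gives the same w when $X^{T}X$ is invertible.

```python
import numpy as np


def normal_equation(X, y):
    """Return w solving (X^T X) w = X^T y."""
    XtX = X.T @ X
    Xty = X.T @ y
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(XtX, Xty)


# Hypothetical data: an intercept column plus two random features.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
true_w = np.array([1.0, 2.0, -3.0])
y = X @ true_w  # noise-free targets, so the fit should recover true_w

w = normal_equation(X, y)
print(np.allclose(w, true_w))
```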
Selecting data in pandas: iloc and loc are not used the same way. iloc selects by integer position, whereas loc selects by row label.
>>> import pandas as pd
>>> import os
>>> os.chdir("d:\\")
>>> d = pd.read_csv("Gwas_water.qassoc", delimiter="\s+")
>>> d.loc[1:3]
   CHR SNP   BP  NMISS    BETA      SE       R2      T       P
1    1   .  447     44  0.1800  0.1783  0.02369  1.009  0.3185
2    1   .  449     44  0.2785  0.2473  0.02931  1.126  0.2665
3    1   .  452     44  0.1800  0.1783  0.02369  1.009  0.3185
>>> d.loc[0:3]
   CHR SNP BP
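One practical consequence of the loc/iloc difference is how slices behave. The sketch below uses a small hypothetical frame (the `BP` values are made up, not read from Gwas_water.qassoc) to show that a loc slice includes both endpoint labels while an iloc slice excludes the end position.

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..3.
df = pd.DataFrame({"BP": [445, 447, 449, 452]})

by_label = df.loc[1:3]      # label slice: rows 1, 2 and 3
by_position = df.iloc[1:3]  # position slice: rows 1 and 2 only

print(len(by_label), len(by_position))
```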
function value
def cost(theta, x, y):
    theta = np.matrix(theta)
    x = np.matrix(x)
    y = np.matrix(y)
    part1 = np.multiply(-y, np.log(sigmoid(x * theta.T)))
    part2 = np.multiply((1 - y), np.log(1 - sigmoid(x * theta.T)))
    return np.sum(part1 - part2) / len(x)

# Insert a column of all ones as the first column of the original matrix
data.insert(0, 'ones', 1)
cols = data.shape[1]
x = data.iloc[:, 0:cols-1]
y = data.iloc
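For reference, here is a self-contained sketch of the same logistic regression cost written with plain NumPy arrays instead of `np.matrix`. The `sigmoid` helper and the three-row data set are assumptions added for illustration; the excerpt above does not show how `sigmoid` was defined.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def cost(theta, x, y):
    """Mean cross-entropy loss for logistic regression."""
    h = sigmoid(x @ theta)
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))


# Hypothetical data: a leading column of ones for the intercept.
x = np.array([[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]])
y = np.array([1.0, 0.0, 1.0])
theta = np.zeros(2)

# With theta all zeros every prediction is 0.5, so the cost is log(2).
print(cost(theta, x, y))
```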
)
new_titanic_survival = titanic_survival.dropna(subset=['age', 'body', 'home.dest'])
Multi-row indexing
This is the original titanic_survival. After deleting the rows whose body column is NaN, the data becomes the following:
new_titanic_survival = titanic_survival.dropna(subset=["body"])
As can be seen, in the new_titanic_survival table each row keeps the same index as before; the index is not renumbered from 0. From the previous article, Pandas (1), we know that pandas uses loc[m] to index
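The index behavior described here can be checked on a tiny stand-in table. The four rows below are hypothetical, not the real Titanic data. `reset_index(drop=True)` is one way to renumber the rows from 0 if that is what you want.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the titanic_survival table.
titanic_survival = pd.DataFrame(
    {"age": [29.0, 2.0, 30.0, 25.0], "body": [np.nan, 135.0, np.nan, 22.0]}
)

new_titanic_survival = titanic_survival.dropna(subset=["body"])
print(list(new_titanic_survival.index))  # original labels are kept

renumbered = new_titanic_survival.reset_index(drop=True)
print(list(renumbered.index))            # labels restart from 0
```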
The following shares a pandas approach to selecting rows by specific index values. It should be a useful reference, and I hope it helps. Let's take a look.
As shown below:
>>> import numpy as np
>>> import pandas as pd
>>> index = np.array([2, 4, 6, 8, 10])
>>> data = np.array([3, 5, 7, 9, 11])
>>> data = pd.DataFrame({'num': data}, index=index)
>>> print(data)
    num
2     3
4     5
6     7
8     9
10   11
>>> select_index = index[index > 5]
>>> print(select_index)
[ 6  8 10]
>>> data['num'].loc[sel
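The session above is cut off at the final lookup. A runnable sketch of the same idea follows, assuming the truncated call passes `select_index` to `loc`; that completion is a guess, not something the excerpt confirms.

```python
import numpy as np
import pandas as pd

index = np.array([2, 4, 6, 8, 10])
data = pd.DataFrame({"num": np.array([3, 5, 7, 9, 11])}, index=index)

select_index = index[index > 5]           # index labels to keep
selected = data["num"].loc[select_index]  # label-based lookup (assumed completion)

print(selected)
```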
Ming 6.0
Name: price, dtype: float64
Zhang San 1.2
Reese 1.0
Harry 2.3
Chen Jiu 5.0
Xiao Ming 6.0
Name: price, dtype: float64
In general, we often need to select values by column. DataFrame provides loc and iloc for this, but the two differ.
print(frame2)
print(frame2.loc['Harry'])  # loc accepts string index labels, whereas iloc accepts only integer positions
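A small sketch of the label versus position distinction with a string index. The three-row `frame2` below is a hypothetical reduction of the table above, since the code that built the original frame is not shown.

```python
import pandas as pd

# Hypothetical prices keyed by name, standing in for frame2.
frame2 = pd.DataFrame(
    {"price": [1.2, 2.3, 5.0]}, index=["Zhang San", "Harry", "Chen Jiu"]
)

row_by_label = frame2.loc["Harry"]  # string label
row_by_position = frame2.iloc[1]    # integer position of the same row

print(row_by_label["price"], row_by_position["price"])
```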
([arr, arr], axis=1)  # concatenate two arr along the row direction
---------------Pandas-----------------------
ser = Series()
ser = Series([...], index=[...])  # one-dimensional array; dictionaries can be converted directly to Series
ser.values  ser.index  ser.reindex([...], fill_value=0)  # values of the array, index of the array, redefine the index
ser.isnull()  pd.isnull(ser)  pd.notnull(ser)  # detect missing data
ser.name=  ser.index.name=  # name of ser itself, name of ser's index
ser.drop('x')  # drop the value at index x
ser + ser  # arithmetic operations
ser.sort_index()  ser.order()  # Sort b
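The Series operations in the cheat sheet can be tried on a tiny example. The values below are made up for illustration. Note that `Series.order()` belongs to old pandas versions and was removed in later releases, so this sketch uses `sort_values()` in its place.

```python
import numpy as np
import pandas as pd

ser = pd.Series({"x": 3.0, "z": 1.0, "y": np.nan})  # a dict converts directly

missing = ser.isnull()                               # detect missing data
filled = ser.reindex(["x", "y", "w"], fill_value=0)  # fill_value applies to the new label "w"
dropped = ser.drop("x")                              # drop the value at index "x"
by_index = ser.sort_index()
by_value = ser.sort_values()                         # stands in for the old ser.order()

print(by_index)
print(by_value)
```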
, df.name dfname from v$tablespace ts, v$datafile df where ts.ts# = df.ts#;
TSNAME     DFNAME
---------- ------------------------------------------------------------
SYSTEM     /u01/app/oracle/oradata/orcl/system01.dbf
UNDOTBS1   /u01/app/oracle/oradata/orcl/undotbs01.dbf
SYSAUX     /u01/app/oracle/oradata/orcl/sysaux01.dbf
USERS      /u01/app/oracle/oradata/orcl/users01.dbf
EXAMPLE    /
member's situation (party: D stands for the Democratic Party, R stands for the Republican Party, and I stands for independents; the third column represents the vote on a certain bill: 1 stands for in favor, 0 stands for against, and 0.5 stands for abstention)
import pandas
votes = pandas.read_csv('114_congress.csv')
print(votes["party"].value_counts())
from sklearn.metrics.pairwise import euclidean_distances
print(euclidean_distances(votes.
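To make the two steps concrete without the CSV file, the sketch below builds a three-row hypothetical `votes` table (the member names and the `bill1`/`bill2` columns are invented) and computes the Euclidean distance between two members directly with NumPy. This is the same quantity that scikit-learn's `euclidean_distances` returns for a pair of rows.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the real data lives in 114_congress.csv.
votes = pd.DataFrame(
    {
        "name": ["A", "B", "C"],
        "party": ["R", "D", "R"],
        "bill1": [1.0, 0.0, 1.0],
        "bill2": [0.0, 0.5, 1.0],
    }
)

counts = votes["party"].value_counts()
print(counts)

# Euclidean distance between the vote vectors of the first two members.
bills = votes[["bill1", "bill2"]].to_numpy()
distance = np.sqrt(np.sum((bills[0] - bills[1]) ** 2))
print(distance)
```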
)
data = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(data)
# Print the rows where column A is greater than 0
print(data[data.A > 0])
# Print the values greater than 0; values less than or equal to 0 are replaced with NaN
print(data[data > 0])
# Copy data
data2 = data.copy()
print(data2)
tag = ['a'] * 2 + ['b'] * 2 + ['c'] * 2
# Add a tag column to data2, assigned from tag
data2['tag'] = tag
print(data2)
# Print the rows whose tag column is a or c
print(data2[data2.tag.isin(['a', 'c'])])
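Because the snippet above uses random numbers, its output changes on every run. The sketch below repeats the three selection styles on a small fixed frame (values chosen arbitrarily for illustration) so the results are easy to check by eye.

```python
import pandas as pd

data = pd.DataFrame({"A": [1.0, -2.0, 3.0], "B": [-1.0, 5.0, -6.0]})

rows_a_positive = data[data.A > 0]  # keeps only rows where column A is positive
masked = data[data > 0]             # same shape; non-positive entries become NaN

data2 = data.copy()
data2["tag"] = ["a", "b", "c"]
tagged = data2[data2.tag.isin(["a", "c"])]  # rows whose tag is a or c

print(rows_a_positive)
print(masked)
print(tagged)
```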
8. Some DataFrame operations (6)
import numpy as np
import pandas as pd
d
[Machine Learning] Data preprocessing: converting data of different types into numerical values
Before performing data analysis in Python, you must first preprocess the data.
Sometimes the data to be handled is non-numeric. What I want to talk about today is how to deal with such data.
Three methods are available:
1. Use LabelEncoder for fast conversion;
2. Use mapping to map a category to a value. However, this method has limited applicability;
3. Use t
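The second method, mapping a category to a value, can be sketched with pandas' `Series.map`. The `sizes` data and the `size_map` dictionary are hypothetical. The example also hints at why the method has limited applicability: any category missing from the dictionary (here "XL") comes back as NaN.

```python
import pandas as pd

sizes = pd.Series(["S", "M", "L", "M", "XL"])

# Hypothetical mapping; "XL" is deliberately left out.
size_map = {"S": 0, "M": 1, "L": 2}
encoded = sizes.map(size_map)

print(encoded.tolist())
```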
concil_set:
    if each in ans_attend_set:
        concil_attend_set.add(each)
    elif each in ans_notatt_set:
        concil_notatt_set.add(each)
    else:
        concil_notans_set.add(each)

# 3. Display result
def disp(ss, cap, num=True):
    # ss: list or set
    # cap: opening description
    print(cap, '({})'.format(len(ss)))
    for i in range(np.ceil(len(ss) / 5).astype(int)):
        pre = i * 5
        nex = (i + 1) * 5
        # Adjust the display format
        dd = ''
        for each in list(ss)[pre:nex]:
            if len(each) == 2:
                dd = dd + ' ' + each
            elif len(each) == 3:
                dd = dd + ' ' + eac
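The display routine is truncated, but its core idea, printing a collection five items per row under a caption with a count, can be sketched as follows. The helper name `rows_of_five` and the single-space separator are assumptions; the original padding rules for two- and three-character names are not recoverable from the excerpt.

```python
def rows_of_five(items):
    """Split items into display rows of at most five entries each."""
    items = list(items)
    return [" ".join(items[i:i + 5]) for i in range(0, len(items), 5)]


def disp(ss, cap):
    """Print a caption with the item count, then the items five per row."""
    print(cap, "({})".format(len(ss)))
    for row in rows_of_five(ss):
        print(row)


disp(["a", "b", "c", "d", "e", "f", "g"], "Attended")
```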
rate
names = ['Bob', 'Jessica', 'Mary', 'John', 'Mel']
births = [968, 155, 77, 578, 973]
Use the zip function to merge the two lists together.
# Check the zip function's help
zip?
BabyDataSet = list(zip(names, births))
BabyDataSet
[('Bob', 968), ('Jessica', 155), ('Mary', 77), ('John', 578), ('Mel', 973)]
We have completed the creation of a basic dataset. We now use Pandas to export this data to a CSV file.
df is a DataFrame obj
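A minimal sketch of the export step, assuming column names `Names` and `Births` (the excerpt is cut off before it names them). Calling `to_csv` without a path returns the CSV text, which keeps the example free of file-system side effects; passing a file name instead would write the file.

```python
import pandas as pd

names = ["Bob", "Jessica", "Mary", "John", "Mel"]
births = [968, 155, 77, 578, 973]

# Column names are assumed for illustration.
df = pd.DataFrame(list(zip(names, births)), columns=["Names", "Births"])

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```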
Generally, Unix administrators have a set of common tools, techniques, and systems for assisting process management. This article provides a variety of key utilities, command line chains, and scripts used to simplify each process. Some of these tools come from the operating system, and most of the skills come from long-term experience and requirements for reducing the workload of system administrators. This series of articles focuses on maximizing the use of tools available in a variety of UNIX enviro
Machine learning: Predicting Google stock using Scikit-learn's linear regression
This is the first article in the Machine Learning series. This article uses Python and scikit-learn's linear regression to predict Google's stock trend. Please do not expect this example to make you a stock master. Here's how to do it step by step.
Preparing data
The data used in this article comes from the www.quandl.com site. Using the corresponding Python quandl library, you can get the data we want with a few si
Data import:
Import CSV:
from pandas import read_csv
df = read_csv('D://pa//4.1//1.csv')
Import text (the file must first be converted to UTF-8 without BOM):
from pandas import read_table
df = read_table('D://pa//4.1//2.txt')
Import Excel:
from pandas import read_excel
df = read_excel('c:/pa/4.1/3.xlsx')
Data export:
from pandas import DataFrame
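The import calls can be tried without any files on disk by reading from in-memory text. The two small data sets below are made up for illustration. `read_csv` splits on commas by default and `read_table` splits on tabs by default.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for 1.csv and 2.txt.
csv_source = io.StringIO("name,score\nann,1\nbob,2\n")
tab_source = io.StringIO("name\tscore\nann\t1\nbob\t2\n")

df_csv = pd.read_csv(csv_source)    # comma-separated by default
df_txt = pd.read_table(tab_source)  # tab-separated by default

print(df_csv.equals(df_txt))
```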