Kaggle Practice 1: Titanic

Source: Internet
Author: User
Tags: rfc

I recently decided to work through classic Kaggle cases to build up my practical skills. Today I recorded the whole process of my Titanic practice.

Background information:

The Python code is as follows:

    # -*- coding: utf-8 -*-
    """
    Created on Fri Mar 12:00:46 2017

    @author: Zch
    """
    import pandas as pd
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.ensemble import RandomForestClassifier
    from xgboost import XGBClassifier
    # sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
    from sklearn.model_selection import cross_val_score

    # Read the training and test datasets
    train = pd.read_csv('e://python/data/titanic/train.csv')
    test = pd.read_csv('e://python/data/titanic/test.csv')

    selected_features = ['Pclass', 'Sex', 'Age', 'Embarked', 'SibSp', 'Parch', 'Fare']
    # Copy the slices so the fills below modify independent frames
    X_train = train[selected_features].copy()
    X_test = test[selected_features].copy()
    y_train = train['Survived']

    # Fill missing Embarked values
    X_train['Embarked'] = X_train['Embarked'].fillna('S')
    X_test['Embarked'] = X_test['Embarked'].fillna('S')

    # Fill missing Age values (and the missing Fare in the test set) with the column mean
    X_train['Age'] = X_train['Age'].fillna(X_train['Age'].mean())
    X_test['Age'] = X_test['Age'].fillna(X_test['Age'].mean())
    X_test['Fare'] = X_test['Fare'].fillna(X_test['Fare'].mean())

    # Vectorize the features with DictVectorizer
    dict_vec = DictVectorizer(sparse=False)
    X_train = dict_vec.fit_transform(X_train.to_dict(orient='records'))
    print(dict_vec.feature_names_)
    X_test = dict_vec.transform(X_test.to_dict(orient='records'))

    rfc = RandomForestClassifier()
    # Initialize XGBClassifier with the default configuration
    xgbc = XGBClassifier()

    # Use 5-fold cross-validation on the training set to evaluate rfc and xgbc,
    # and get the average classification accuracy.
    print(cross_val_score(rfc, X_train, y_train, cv=5).mean())
    print(cross_val_score(xgbc, X_train, y_train, cv=5).mean())

    # Predict with rfc
    rfc.fit(X_train, y_train)
    rfc_y_predict = rfc.predict(X_test)
    rfc_submission = pd.DataFrame({'PassengerId': test['PassengerId'], 'Survived': rfc_y_predict})
    # Save the predictions to rfc_sub.csv
    rfc_submission.to_csv('E:\\python\\data\\titanic\\rfc_sub.csv', index=False)

    # Predict with xgbc
    xgbc.fit(X_train, y_train)
    xgbc_y_predict = xgbc.predict(X_test)
    xgbc_submission = pd.DataFrame({'PassengerId': test['PassengerId'], 'Survived': xgbc_y_predict})
    # Save the predictions to xgbc_sub.csv
    xgbc_submission.to_csv('E:\\python\\data\\titanic\\xgbc_sub.csv', index=False)
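To make the fill and vectorization steps easier to follow, here is a minimal sketch on a tiny made-up table. The three records and the name `toy` are assumptions for illustration only and are not taken from the Titanic data. It fills the gaps the same way as the script above (a constant for Embarked, the column mean for Age) and then shows how DictVectorizer treats the result: string columns are expanded into one 0/1 column per category, while numeric columns pass through unchanged.

```python
import pandas as pd
from sklearn.feature_extraction import DictVectorizer

# Hypothetical toy records for illustration; not taken from the Titanic data
toy = pd.DataFrame({
    "Sex": ["male", "female", "female"],
    "Embarked": ["S", None, "C"],
    "Age": [22.0, None, 30.0],
})

# Fill missing values by assignment: a constant for Embarked, the mean for Age
toy["Embarked"] = toy["Embarked"].fillna("S")
toy["Age"] = toy["Age"].fillna(toy["Age"].mean())

# One dict per row; DictVectorizer one-hot encodes the string-valued fields
vec = DictVectorizer(sparse=False)
matrix = vec.fit_transform(toy.to_dict(orient="records"))
feature_names = list(vec.feature_names_)

print(feature_names)
print(matrix)
```

With these toy records the feature names come out as Age, Embarked=C, Embarked=S, Sex=female, and Sex=male, so the three input columns become five numeric ones. This is why the script prints `feature_names_` after fitting: the printed names show which matrix column corresponds to which category.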
