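Tags: sklearn roc auc python
Preface: "Bioinformatics" has brought up the AUC and ROC metrics several times recently, and the project I am working on requires plotting ROC curves. sklearn has functions for this, so I am taking the chance to learn them.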
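AUC: the area under the ROC curve. A value of 1.0 means the classifier ranks every positive above every negative, and 0.5 is what random guessing gives.
ROC: the receiver operating characteristic curve, which plots the true positive rate against the false positive rate as the decision threshold varies.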
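To make the two terms above concrete, here is a minimal from-scratch sketch using only NumPy. The helper names `roc_points` and `trapezoid_area` and the four-sample toy data are my own, made up for illustration; they are not sklearn functions.

```python
import numpy as np

def roc_points(y_true, scores):
    """Sweep the threshold over every distinct score and return (fpr, tpr)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    # Threshold above every score: nothing is predicted positive yet
    fpr, tpr = [0.0], [0.0]
    for t in np.sort(np.unique(scores))[::-1]:
        pred = scores >= t  # predict positive when the score reaches the threshold
        tpr.append(np.sum(pred & (y_true == 1)) / n_pos)
        fpr.append(np.sum(pred & (y_true == 0)) / n_neg)
    return np.array(fpr), np.array(tpr)

def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x, y)."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Hypothetical toy data: two negatives, two positives
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

fpr, tpr = roc_points(y_true, scores)
area = trapezoid_area(fpr, tpr)
print("FPR:", fpr)
print("TPR:", tpr)
print("AUC:", area)
```

Three of the four (positive, negative) pairs are ranked correctly here, which is why the area comes out as 0.75.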
For usage details, see the sklearn documentation:
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html
http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc_crossval.html#example-model-selection-plot-roc-crossval-py
http://www.tuicool.com/articles/b22eYz (a fellow blogger's post)
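Before the full script, the smallest possible call may help. This sketch assumes a reasonably recent scikit-learn and uses made-up labels and scores; only `roc_curve()` and `auc()` from `sklearn.metrics` are involved.

```python
from sklearn.metrics import roc_curve, auc

# Hypothetical toy data: true labels and the classifier's scores for the positive class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# roc_curve() returns one (fpr, tpr) point per threshold it considers
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# auc() integrates the curve with the trapezoidal rule
roc_auc = auc(fpr, tpr)

print("FPR:", fpr)
print("TPR:", tpr)
print("AUC: %0.2f" % roc_auc)
```

`fpr` and `tpr` can be passed straight to `plt.plot(fpr, tpr)`, which is what the script below does for each fold.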
# coding: utf-8
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import StratifiedKFold

###############################################################################
# Data IO and generation: load the iris data and prepare it
# import some data to play with
iris = datasets.load_iris()
X = iris.data
y = iris.target
X, y = X[y != 2], y[y != 2]  # keep classes 0 and 1 only, so the problem is binary
n_samples, n_features = X.shape

# Add noisy features
random_state = np.random.RandomState(0)
X = np.c_[X, random_state.randn(n_samples, 200 * n_features)]

###############################################################################
# Classification and ROC analysis
# Run classifier with 6-fold cross-validation and plot ROC curves
cv = StratifiedKFold(n_splits=6)
classifier = svm.SVC(kernel='linear', probability=True,
                     random_state=random_state)

mean_tpr = 0.0
mean_fpr = np.linspace(0, 1, 100)

for i, (train, test) in enumerate(cv.split(X, y)):
    # Fit a linear-kernel SVM on the training fold, then get the predicted
    # probabilities for the test fold
    probas_ = classifier.fit(X[train], y[train]).predict_proba(X[test])
    # Compute the ROC curve: roc_curve() returns fpr, tpr and the thresholds
    fpr, tpr, thresholds = roc_curve(y[test], probas_[:, 1])
    # Interpolate this fold's TPR at the mean_fpr grid points with np.interp()
    mean_tpr += np.interp(mean_fpr, fpr, tpr)
    mean_tpr[0] = 0.0  # the curve starts at 0
    roc_auc = auc(fpr, tpr)
    # plt.plot(fpr, tpr) is all that is needed to draw the curve; roc_auc only
    # records the AUC value computed by auc() for the legend
    plt.plot(fpr, tpr, lw=1, label='ROC fold %d (area = %0.2f)' % (i, roc_auc))

# Draw the diagonal
plt.plot([0, 1], [0, 1], '--', color=(0.6, 0.6, 0.6), label='Luck')

# At each of the 100 mean_fpr points, average the interpolated TPR over the folds
mean_tpr /= cv.get_n_splits()
mean_tpr[-1] = 1.0  # the last point is (1, 1)
mean_auc = auc(mean_fpr, mean_tpr)  # mean AUC

# Plot the mean ROC curve
# print(mean_fpr, len(mean_fpr))
# print(mean_tpr)
plt.plot(mean_fpr, mean_tpr, 'k--',
         label='Mean ROC (area = %0.2f)' % mean_auc, lw=2)

plt.xlim([-0.05, 1.05])
plt.ylim([-0.05, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc="lower right")
plt.show()
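The averaging step in the script is the least obvious part, so here is a small sketch of it on its own. Each fold's ROC curve has its points at different FPR values, so the curves are first resampled onto a shared FPR grid and only then averaged. The two per-fold curves below are invented for illustration, and the grid has 5 points instead of 100 to keep the numbers readable.

```python
import numpy as np

# Two hypothetical per-fold ROC curves, each as (fpr, tpr), sampled at different FPR values
fold_curves = [
    (np.array([0.0, 0.0, 0.5, 1.0]), np.array([0.0, 0.5, 1.0, 1.0])),
    (np.array([0.0, 0.25, 1.0]), np.array([0.0, 1.0, 1.0])),
]

# Shared FPR grid: 0, 0.25, 0.5, 0.75, 1
mean_fpr = np.linspace(0, 1, 5)
mean_tpr = np.zeros_like(mean_fpr)

for fpr, tpr in fold_curves:
    # Linear interpolation of this fold's TPR at the grid points
    mean_tpr += np.interp(mean_fpr, fpr, tpr)

mean_tpr /= len(fold_curves)  # average over folds at each grid point
mean_tpr[0] = 0.0             # pin the start of the curve to (0, 0)
mean_tpr[-1] = 1.0            # pin the end of the curve to (1, 1)

print(mean_fpr)
print(mean_tpr)
```

At FPR 0.25 the first curve interpolates to 0.75 and the second to 1.0, so the mean there is 0.875. A finer grid, like the 100 points in the script, just gives a smoother mean curve.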
Plotting ROC curves with Python and sklearn