Preface: Bioinformatics papers have been mentioning the AUC and ROC metrics a lot recently, and the project I am working on requires drawing an ROC curve. sklearn has functions for exactly this, so I took the chance to learn them.
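AUC: the area under the ROC curve. A value of 1.0 means the classifier ranks every positive above every negative, and 0.5 is no better than chance.
ROC: the receiver operating characteristic curve, which plots the true positive rate (TPR) against the false positive rate (FPR) as the decision threshold varies.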
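To make the two terms concrete, here is a minimal sketch that computes the ROC points and the AUC by hand with NumPy only. The helper names `roc_points` and `trapezoid_auc` and the four-sample toy data are assumptions made up for illustration; they are not part of sklearn.

```python
import numpy as np


def roc_points(y_true, scores):
    """Return (fpr, tpr) arrays obtained by sweeping a threshold over the scores."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    # Candidate thresholds: every distinct score, highest first
    thresholds = np.unique(scores)[::-1]
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    fpr, tpr = [0.0], [0.0]  # start at (0, 0): nothing is predicted positive
    for t in thresholds:
        predicted_pos = scores >= t
        tpr.append(np.sum(predicted_pos & (y_true == 1)) / n_pos)
        fpr.append(np.sum(predicted_pos & (y_true == 0)) / n_neg)
    return np.array(fpr), np.array(tpr)


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear curve through the (fpr, tpr) points."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Hypothetical toy data: two negatives, two positives
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(y_true, scores)
area = trapezoid_auc(fpr, tpr)
print(fpr, tpr, area)
```

The loop is quadratic in the number of samples, which is fine for a handful of points; for real data the library function is the better choice.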
For usage details, see the sklearn references:
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html
http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc_crossval.html#example-model-selection-plot-roc-crossval-py
http://www.tuicool.com/articles/b22eYz (a fellow blogger's post)
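As a quick orientation before the full script, the basic call pattern of `roc_curve` and `auc` looks roughly like this. The four-sample labels and scores are toy values chosen for illustration, not data from the project.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Toy labels and classifier scores (assumed values for illustration)
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])

# roc_curve returns the false positive rates, true positive rates
# and the thresholds at which they were computed
fpr, tpr, thresholds = roc_curve(y_true, scores)

# auc integrates the curve with the trapezoidal rule
roc_auc = auc(fpr, tpr)
print(fpr, tpr, roc_auc)
```

Note that `roc_curve` expects scores or probabilities for the positive class, not hard 0/1 predictions.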
# -*- coding: utf-8 -*-
"""
Created on Sun Apr 08:57:13 2015

@author: Shifeng
"""
print(__doc__)

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.metrics import roc_curve, auc
# sklearn.cross_validation was removed in scikit-learn 0.20;
# StratifiedKFold now lives in sklearn.model_selection
from sklearn.model_selection import StratifiedKFold

###############################################################################
# Data IO and generation: import the iris data and prepare it
# import some data to play with
iris = datasets.load_iris()
X = iris.data
y = iris.target
X, y = X[y != 2], y[y != 2]  # drop label 2 so that only two classes remain
n_samples, n_features = X.shape

# Add noisy features
random_state = np.random.RandomState(0)
X = np.c_[X, random_state.randn(n_samples, n_features)]

###############################################################################
# Classification and ROC analysis
# Run classifier with 6-fold cross-validation and plot ROC curves
cv = StratifiedKFold(n_splits=6)
# Note: probability=True is required, otherwise predict_proba() raises an error.
# The RBF kernel also gives somewhat better results.
classifier = svm.SVC(kernel='linear', probability=True,
                     random_state=random_state)

mean_tpr = 0.0
mean_fpr = np.linspace(0, 1, 100)
all_tpr = []

for i, (train, test) in enumerate(cv.split(X, y)):
    # Fit the linear-kernel SVM on the training fold, then score the test fold
    probas_ = classifier.fit(X[train], y[train]).predict_proba(X[test])
    # print(set(y[train]))  # {0, 1}: the labels have two classes
    # print(len(X[train]), len(X[test]))  # about 84 training and 16 test samples
    # print("++", probas_)
    # predict_proba() returns the confidence for each class on the test set;
    # a sample is assigned to the class with the higher confidence

    # Compute ROC curve and area under the curve
    # roc_curve() returns the FPR, the TPR and the thresholds
    fpr, tpr, thresholds = roc_curve(y[test], probas_[:, 1])
    # Interpolate this fold's TPR at the mean_fpr points and accumulate it
    # (np.interp replaces the deprecated scipy.interp alias)
    mean_tpr += np.interp(mean_fpr, fpr, tpr)
    mean_tpr[0] = 0.0  # the curve starts at 0
    roc_auc = auc(fpr, tpr)
    # Plotting only needs plt.plot(fpr, tpr); roc_auc just records the
    # AUC value computed by auc()
    plt.plot(fpr, tpr, lw=1, label='ROC fold %d (area = %0.2f)' % (i, roc_auc))

# Draw the diagonal
plt.plot([0, 1], [0, 1], '--', color=(0.6, 0.6, 0.6), label='Luck')

# Average the accumulated TPR over the folds at each of the 100 mean_fpr points
mean_tpr /= cv.get_n_splits()
mean_tpr[-1] = 1.0  # the last point is (1, 1)
mean_auc = auc(mean_fpr, mean_tpr)  # mean AUC

# Draw the mean ROC curve
# print(mean_fpr, len(mean_fpr))
# print(mean_tpr)
plt.plot(mean_fpr, mean_tpr, 'k--',
         label='Mean ROC (area = %0.2f)' % mean_auc, lw=2)

plt.xlim([-0.05, 1.05])
plt.ylim([-0.05, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc="lower right")
plt.show()
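The averaging step in the script is the least obvious part, so here it is pulled out into a small stand-alone sketch. The helper name `mean_roc` and the two hand-made fold curves are assumptions for illustration only; the logic mirrors what the script does: resample each fold's curve onto a shared FPR grid, average, and pin the end points.

```python
import numpy as np


def mean_roc(fold_curves, n_points=100):
    """Average per-fold ROC curves on a shared grid of FPR values.

    fold_curves is a list of (fpr, tpr) array pairs, one per fold.
    """
    mean_fpr = np.linspace(0, 1, n_points)
    mean_tpr = np.zeros_like(mean_fpr)
    for fpr, tpr in fold_curves:
        # Resample this fold's TPR at the shared FPR grid points
        fold_tpr = np.interp(mean_fpr, fpr, tpr)
        fold_tpr[0] = 0.0  # every ROC curve starts at (0, 0)
        mean_tpr += fold_tpr
    mean_tpr /= len(fold_curves)
    mean_tpr[-1] = 1.0  # every ROC curve ends at (1, 1)
    return mean_fpr, mean_tpr


# Two hypothetical fold curves: a fairly good one and a chance-level one
fold_a = (np.array([0.0, 0.5, 1.0]), np.array([0.0, 1.0, 1.0]))
fold_b = (np.array([0.0, 1.0]), np.array([0.0, 1.0]))

mean_fpr, mean_tpr = mean_roc([fold_a, fold_b])
# Trapezoidal area under the averaged curve
mean_area = float(np.sum(np.diff(mean_fpr) * (mean_tpr[1:] + mean_tpr[:-1]) / 2.0))
print(mean_area)
```

Interpolating onto a common grid is needed because each fold returns a different number of (FPR, TPR) points, so the raw arrays cannot be averaged element by element.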