Machine Learning Notes (4): Classification Algorithms - Logistic Regression

Source: Internet
Author: User
Tags: spark, mllib

Resources

[1] Spark MLlib Machine Learning Practice

[2] Statistical Learning Methods

1. Logistic distribution

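Let X be a continuous random variable. X follows a logistic distribution if it has the following distribution function and density function:

$$F(x) = P(X \le x) = \frac{1}{1 + e^{-(x-\mu)/\gamma}}$$

$$f(x) = F'(x) = \frac{e^{-(x-\mu)/\gamma}}{\gamma \left(1 + e^{-(x-\mu)/\gamma}\right)^{2}}$$

where μ is the location parameter and γ > 0 is the shape parameter.

The graph of the distribution function is an S-shaped curve that is point-symmetric about (μ, 1/2), that is, it satisfies:

$$F(-x + \mu) - \frac{1}{2} = -F(x + \mu) + \frac{1}{2}$$

The curve rises fastest near its center and slowly in the two tails. The smaller the shape parameter γ, the faster the curve rises near the center.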
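The properties above can be checked numerically. The following is a minimal sketch, assuming the standard form of the logistic distribution function, F(x) = 1 / (1 + exp(-(x - mu) / gamma)); the object name `LogisticDistribution` and the parameter names `mu` (location) and `gamma` (shape) are chosen here for illustration.

```scala
object LogisticDistribution {
  /** Distribution function F(x) = 1 / (1 + exp(-(x - mu) / gamma)). */
  def cdf(x: Double, mu: Double, gamma: Double): Double =
    1.0 / (1.0 + math.exp(-(x - mu) / gamma))

  /** Density f(x) = e / (gamma * (1 + e)^2), with e = exp(-(x - mu) / gamma). */
  def pdf(x: Double, mu: Double, gamma: Double): Double = {
    val e = math.exp(-(x - mu) / gamma)
    e / (gamma * (1.0 + e) * (1.0 + e))
  }

  def main(args: Array[String]): Unit = {
    val mu = 1.0

    // The curve passes through its center of symmetry (mu, 1/2).
    println(cdf(mu, mu, 1.0)) // 0.5

    // Point symmetry: F(mu - x) - 1/2 and -(F(mu + x) - 1/2) agree
    // up to floating-point rounding.
    val x = 0.7
    println(cdf(mu - x, mu, 1.0) - 0.5)
    println(-(cdf(mu + x, mu, 1.0) - 0.5))

    // The slope at the center is f(mu) = 1 / (4 * gamma),
    // so a smaller gamma gives a steeper rise.
    println(pdf(mu, mu, 0.5)) // 0.5
    println(pdf(mu, mu, 2.0)) // 0.125
  }
}
```

Because the slope at the center equals 1 / (4γ), halving γ doubles the rate at which the curve rises there.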

2. Logistic regression model

The binomial logistic regression model is a classification model represented by the conditional probability distribution P(Y | x), where the random variable x takes real values and Y takes the value 0 or 1. It is defined as:

$$P(Y=1 \mid x) = \frac{\exp(w \cdot x + b)}{1 + \exp(w \cdot x + b)}$$

and

$$P(Y=0 \mid x) = \frac{1}{1 + \exp(w \cdot x + b)}$$

where w is the weight vector and b is the bias.

Logistic regression compares the two conditional probabilities and assigns x to the class with the larger one. Essentially, it converts the output of the linear function w · x + b into a conditional probability.
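The comparison of the two conditional probabilities can be sketched in a few lines of plain Scala. This is an illustrative sketch, not Spark code: the object name `BinomialLogistic` and its method names are hypothetical, and the weight in `main` is the value reported by the Spark example in section 3 (with no bias term).

```scala
object BinomialLogistic {
  /** Linear score w . x + b. */
  def score(w: Array[Double], x: Array[Double], b: Double): Double =
    w.zip(x).map { case (wi, xi) => wi * xi }.sum + b

  /** P(Y = 1 | x) = exp(s) / (1 + exp(s)), written as 1 / (1 + exp(-s)),
    * which is algebraically the same and avoids Infinity / Infinity for large s. */
  def probOne(w: Array[Double], x: Array[Double], b: Double): Double =
    1.0 / (1.0 + math.exp(-score(w, x, b)))

  /** P(Y = 0 | x) = 1 / (1 + exp(s)). */
  def probZero(w: Array[Double], x: Array[Double], b: Double): Double =
    1.0 / (1.0 + math.exp(score(w, x, b)))

  /** Assign x to the class with the larger conditional probability. */
  def classify(w: Array[Double], x: Array[Double], b: Double): Int =
    if (probOne(w, x, b) > probZero(w, x, b)) 1 else 0

  def main(args: Array[String]): Unit = {
    val w = Array(0.20670127500478114) // weight taken from the Spark example
    val b = 0.0                        // assumption: no bias term
    val x = Array(-2.0)
    println(probOne(w, x, b))  // approximately 0.398
    println(probZero(w, x, b)) // approximately 0.602
    println(classify(w, x, b)) // 0
  }
}
```

Since the two probabilities sum to 1, comparing them is equivalent to checking whether the score w · x + b is positive.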

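The multinomial logistic regression model extends the binomial model to multi-class problems. With Y taking values in {1, 2, ..., K} and the bias absorbed into each weight vector w_k, the model is:

$$P(Y=k \mid x) = \frac{\exp(w_k \cdot x)}{1 + \sum_{j=1}^{K-1} \exp(w_j \cdot x)}, \quad k = 1, 2, \dots, K-1$$

$$P(Y=K \mid x) = \frac{1}{1 + \sum_{j=1}^{K-1} \exp(w_j \cdot x)}$$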
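As a rough illustration of the multi-class case, the sketch below assumes the common formulation with K - 1 weight vectors, in which class K serves as the reference class with score 0. The object name `MultinomialLogistic`, its methods, and the weights in `main` are all hypothetical.

```scala
object MultinomialLogistic {
  /** Returns P(Y = k | x) for k = 1..K, given K - 1 weight vectors.
    * Entry k - 1 holds the probability of class k; the last entry is class K. */
  def probabilities(weights: Seq[Array[Double]], x: Array[Double]): Seq[Double] = {
    // exp(w_k . x) for each of the K - 1 non-reference classes
    val exps = weights.map { w =>
      math.exp(w.zip(x).map { case (wi, xi) => wi * xi }.sum)
    }
    val denom = 1.0 + exps.sum
    exps.map(_ / denom) :+ (1.0 / denom)
  }

  /** Returns the 1-based class with the largest conditional probability. */
  def classify(weights: Seq[Array[Double]], x: Array[Double]): Int = {
    val probs = probabilities(weights, x)
    probs.indexOf(probs.max) + 1
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical weights for a 3-class problem with 2 features.
    val weights = Seq(Array(1.0, 0.0), Array(0.0, 1.0))
    println(probabilities(weights, Array(2.0, 0.5))) // three probabilities summing to 1
    println(classify(weights, Array(2.0, 0.5)))      // 1
    println(classify(weights, Array(-3.0, -3.0)))    // 3
  }
}
```

With K = 2 this reduces to the binomial model, where the single weight vector gives one class and the reference class takes the complementary probability.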

3. Logistic regression Spark MLlib example

    package com.fredric.spark.logistic

    import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.{SparkContext, SparkConf}

    /*-
     * Logistic regression
     * Fredric
     */
    object Logistic {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local").setAppName("Logistic")
        val sc = new SparkContext(conf)

        val array = new Array[LabeledPoint](10)
        // Construct the training data: a made-up classification with 5 as the boundary,
        // for single-feature, binomial logistic regression.
        for (i <- 0 to 9) {
          if (i >= 5) {
            array(i) = new LabeledPoint(1, Vectors.dense(i))
          } else {
            array(i) = new LabeledPoint(0, Vectors.dense(i))
          }
        }
        val data = sc.makeRDD(array)

        // Train for 50 iterations.
        val model = LogisticRegressionWithSGD.train(data, 50)
        // model.weights output: [0.20670127500478114]
        println(model.weights)

        var test = -2
        // predict returns the class label, not a probability:
        // when the input is -1 it returns 0.0; when the input is 11 it returns 1.0.
        val result = model.predict(Vectors.dense(test))
        println(result)

        // Verify the method by hand.
        // Calculate P(Y=1 | x), the conditional probability that input x belongs to class 1.
        val res1 = math.exp(model.weights(0) * test) / (1 + math.exp(model.weights(0) * test))
        // Calculate P(Y=0 | x), the conditional probability that input x belongs to class 0.
        val res0 = 1 / (1 + math.exp(model.weights(0) * test))

        // Output: for target: -2 probability for 1 is: 0.3980965348017618 probability for 0 is: 0.6019034651982381
        // Comparing the two conditional probabilities, -2 belongs to class 0.
        println("for target: " + test + " probability for 1 is: " + res1 + " probability for 0 is: " + res0)
      }
    }
