Logistic Regression vs Decision Trees vs SVM: Part I


Classification is one of the major problems we solve while working on business problems across industries. In this article we'll discuss three of the many techniques used for it: Logistic Regression, Decision Trees and Support Vector Machines [SVM].

All of the algorithms listed above are used for classification [SVM and Decision Trees are also used for regression, but we are not discussing that today!]. Time and again I have seen people asking which one to choose for their particular problem. The classical, most correct, but least satisfying response to that question is "it depends!". It's downright annoying, I agree. So I decided to shed some light on what it depends on.

This is a very simplified 2-D explanation, and the responsibility of extrapolating this understanding to higher dimensional data lies, painfully, in the reader's hands.

I'll start by discussing the most important question: what exactly are we trying to do in classification? Well, we are trying to classify. [Is that even a serious question? Really?] Let me rephrase that response. In order to classify, we try to get a decision boundary or a curve [not necessarily straight] which separates the classes in our feature space.

Feature space sounds like a very fancy word, and it is confusing to many who haven't encountered it before. Let me show you an example which will clarify it. I have sample data with 3 variables: X1, X2 and Target. Target takes the values 0 and 1, depending on the values taken by the predictor variables X1 and X2. Let me plot this data for you.
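Since the plot itself isn't reproduced here, below is a minimal sketch of how such data could be generated and visualized. The circular class layout and every constant in it are assumptions, chosen only to match the circular boundary the article describes later.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data: Target = 1 inside a circle, 0 outside,
# mimicking the layout the article's plot appears to show.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 2))               # predictors X1, X2
y = (X[:, 0]**2 + X[:, 1]**2 < 1.5).astype(int)     # Target: 1 inside the circle

plt.scatter(X[:, 0], X[:, 1], c=y, cmap="bwr", edgecolor="k")
plt.xlabel("X1")
plt.ylabel("X2")
plt.title("Feature space: two classes marked by color")
plt.show()
```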

This is your feature space: where your observations lie. In this case, since we have only two predictors/features, the feature space is 2-D. Here you can see both classes of your target marked by different colors. We would like our algorithm to give us a line/curve which can separate these two classes.

We can see visually that an ideal decision boundary [or separating curve] here would be circular. The shape of the decision boundary produced is where the difference between Logistic Regression, Decision Trees and SVM lies.

Let's start with logistic regression. Many of us are confused about the shape of the decision boundary given by logistic regression. This confusion mainly arises from looking at the famous S-shaped curve too many times in the context of logistic regression.

That blue curve is not a decision boundary. It is simply the binary response which we model with logistic regression, shown on a transformed scale. The decision boundary of logistic regression is always a line [or a plane, or a hyper-plane for higher dimensions]. The best way to convince you is to show the famous logistic regression equation that you are all too familiar with [p here is the probability that Target = 1]:

log(p / (1 - p)) = β0 + β1*X1 + β2*X2

Let's assume, for simplification, that F is nothing but this linear combination of all the predictors:

F = β0 + β1*X1 + β2*X2, so that log(p / (1 - p)) = F

The above equation can also be written as:

p = e^F / (1 + e^F)

Now, to predict with logistic regression, you decide on a particular cutoff for the probabilities, above which your prediction will be 1, and 0 otherwise. Let's say that cutoff is c. So your decision process goes like this:

Y = 1 if p > c, otherwise Y = 0. Since p increases monotonically with F, p > c is the same as F > log(c / (1 - c)), which eventually gives the decision boundary F > constant.

F > constant is nothing but a linear decision boundary. The result of logistic regression for our sample data looks like this.
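Here is a sketch of what producing that result could look like on the hypothetical data from above. The use of scikit-learn is my assumption; the article doesn't say what software it used.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

# Same hypothetical circular data as in the earlier sketch.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 2))
y = (X[:, 0]**2 + X[:, 1]**2 < 1.5).astype(int)

clf = LogisticRegression().fit(X, y)

# Color a grid of points by predicted class to reveal the boundary.
xx, yy = np.meshgrid(np.linspace(-2, 2, 200), np.linspace(-2, 2, 200))
zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, zz, alpha=0.3, cmap="bwr")
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="bwr", edgecolor="k", s=10)
plt.title("Logistic regression: the boundary is a straight line")
plt.show()
```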

You can see that it doesn't do a very good job, because whatever you do, the decision boundary produced by logistic regression will always be linear, and it cannot emulate the circular decision boundary which is required here. So, logistic regression will work for classification problems where the classes are approximately linearly separable. [Although you can make classes linearly separable in some cases through variable transformation, we'll leave that discussion for some other day.]

Now let's see how decision trees handle these problems. We know that decision trees are made of hierarchical one-variable rules. An example for our data is given below.

If you think carefully, these decision rules [x2 >/< const OR x1 >/< const] do nothing but partition the feature space with lines parallel to each feature axis, like the diagram given below; a small sketch of such rules follows.
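To make that rule structure concrete, here is a tiny hypothetical tree written as nested conditionals. The variables, thresholds and leaf labels are all invented for illustration:

```python
def tiny_tree_predict(x1: float, x2: float) -> int:
    """A made-up two-split decision tree: each rule tests a single
    variable against a constant, so the splits carve the feature
    space into axis-parallel rectangles."""
    if x2 < 1.2:          # first split: horizontal line x2 = 1.2
        if x1 < -1.1:     # second split: vertical line x1 = -1.1
            return 0
        return 1
    return 0
```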

We can make our tree more complex by increasing its size, which results in more and more partitions trying to emulate the circular boundary.

Ha! Not a circle, but it tried; that much is due. If you keep increasing the size of the tree, you'd notice that the decision boundary tries to emulate the circle as well as it can with parallel lines. So, if the boundary is non-linear and can be approximated by cutting the feature space into rectangles [or cuboids or hyper-cuboids for higher dimensions], then decision trees are a better choice than logistic regression.
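A sketch of that effect on the hypothetical data, comparing a shallow and a deep tree [again assuming scikit-learn; the depths 2 and 8 are arbitrary choices]:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 2))
y = (X[:, 0]**2 + X[:, 1]**2 < 1.5).astype(int)

xx, yy = np.meshgrid(np.linspace(-2, 2, 200), np.linspace(-2, 2, 200))
grid = np.c_[xx.ravel(), yy.ravel()]

for depth in (2, 8):  # shallow tree vs deeper tree
    tree = DecisionTreeClassifier(max_depth=depth).fit(X, y)
    zz = tree.predict(grid).reshape(xx.shape)
    plt.contourf(xx, yy, zz, alpha=0.3, cmap="bwr")
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap="bwr", edgecolor="k", s=10)
    plt.title(f"Decision tree, max_depth={depth}: axis-parallel partitions")
    plt.show()
```

The deeper tree's boundary comes out as a staircase of axis-parallel segments hugging the circle, which is exactly the emulation described above.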

Next, we'll look at the results of SVM. SVM works by projecting your feature space into a kernel space to make the classes linearly separable. An easier explanation of this process: SVM adds an extra dimension to your feature space in a way that makes the classes linearly separable. A planar decision boundary in that space, when projected back to the original feature space, emulates a non-linear decision boundary. Here a picture might explain it better than I can.

You can see that once a third dimension is added to the data in a special manner, we can separate the two classes with a plane [a linear separator], which, once projected back onto the original 2-D feature space, becomes a circular boundary.
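That picture isn't reproduced here, but the idea can be sketched with an explicitly hand-picked third dimension. The choice z = X1^2 + X2^2 is my assumption, a common textbook mapping for circularly separated classes:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 2))
y = (X[:, 0]**2 + X[:, 1]**2 < 1.5).astype(int)

# Hand-picked third dimension: squared distance from the origin.
z = X[:, 0]**2 + X[:, 1]**2

# In (X1, X2, z) space the classes are separated by the flat plane
# z = 1.5; projected back onto the 2-D feature space, that plane
# becomes the circle X1^2 + X2^2 = 1.5.
plane_prediction = (z < 1.5).astype(int)
print((plane_prediction == y).mean())  # 1.0: perfectly separated
```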

See how well SVM performs on our sample data:
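A sketch of fitting an actual SVM on the hypothetical data [the RBF kernel is my guess; the article doesn't name the kernel it used]:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 2))
y = (X[:, 0]**2 + X[:, 1]**2 < 1.5).astype(int)

svm = SVC(kernel="rbf").fit(X, y)  # kernel choice is an assumption

xx, yy = np.meshgrid(np.linspace(-2, 2, 200), np.linspace(-2, 2, 200))
zz = svm.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, zz, alpha=0.3, cmap="bwr")
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="bwr", edgecolor="k", s=10)
plt.title("SVM with RBF kernel: near-circular boundary")
plt.show()
```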

Note: the decision boundary will not be such a well-rounded circle, but rather a very good approximation [a polygon] to it. We have drawn a simple circle to avoid the hassle of drawing a tedious polygon in our software.

OK, so the differences make sense, but one question still remains: when do you choose which algorithm when dealing with multi-dimensional data? This is a very important question, because you won't have such a convenient method of visualizing the data when there are more than 3 predictors to consider. We'll discuss that in the 2nd part of this post, stay tuned!
