"Python Data Mining" regression model and its application

Source: Internet
Author: User

Linear Regression

Definition: In supervised learning, the training set is D = {(x(i), y(i)); i = 1, ..., M}. The target y(i) is a continuous variable, and we need to learn a mapping f: x → y, under the assumption that the output y depends linearly on the input x.

Consider a set of data in which each x is a two-dimensional real vector. For example, xi1 is the living area of the i-th house and xi2 is the number of rooms in that house.

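To perform supervised learning, we need to decide how to represent the function (hypothesis) h in the computer. As a first approximation, we can use a linear function of x:

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$$

In matrix form, with the convention x0 = 1 and n features:

$$h_\theta(x) = \sum_{j=0}^{n} \theta_j x_j = \theta^T x$$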
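As a minimal sketch, the linear hypothesis hθ(x) = θ0 + θ1·x1 + θ2·x2 can be written in plain Python as follows. The function name `predict`, the parameter values, and the sample house are hypothetical and chosen only for illustration.

```python
def predict(theta, x):
    """Linear hypothesis: theta_0 + theta_1 * x_1 + ... + theta_n * x_n."""
    # Prepend the constant feature x_0 = 1 so the intercept theta_0
    # is handled like any other weight (the "matrix form" theta^T x).
    features = [1.0] + list(x)
    return sum(t * f for t, f in zip(theta, features))


# Hypothetical parameters and one hypothetical house:
# living area 100 (first feature) and 3 rooms (second feature).
theta = [50.0, 2.0, 10.0]
house = [100.0, 3.0]
price = predict(theta, house)
print(price)
```

Treating the intercept as the weight of a constant feature is a common design choice because it lets one dot product cover every term.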

Now, given the training data, how do we choose the parameters θ? A reasonable approach is to make h(x) as close to y as possible, at least on the training examples.

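Thus, we define a loss function (also called a cost function). A common choice is the squared error:

$$J(\theta) = \frac{1}{2M} \sum_{i=1}^{M} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Here the mapping f from x to y is written as hθ(x), a function parameterized by θ.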

There are many kinds of loss functions; which one to use depends on the problem.

We then minimize the loss function. For linear regression with a squared-error loss, the function being optimized is convex, so any local minimum is also the global minimum and there is little need to worry about the algorithm converging to a merely local optimum.
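To make the loss concrete, here is a small sketch that assumes the squared-error loss averaged over the M training samples, with the conventional factor 1/2. The function name and the toy data are hypothetical.

```python
def squared_error_loss(theta, xs, ys):
    """J(theta) = 1/(2M) * sum over samples of (h_theta(x) - y)^2."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        # theta[0] is the intercept; theta[1:] pairs with the features.
        h = theta[0] + sum(t * xj for t, xj in zip(theta[1:], x))
        total += (h - y) ** 2
    return total / (2 * m)


# Hypothetical toy data generated by y = 2x.
xs = [[1.0], [2.0], [3.0]]
ys = [2.0, 4.0, 6.0]

perfect = squared_error_loss([0.0, 2.0], xs, ys)  # the true parameters
worse = squared_error_loss([0.0, 1.0], xs, ys)    # slope too small
print(perfect, worse)
```

The loss is zero for the parameters that generated the data and grows as the predictions move away from the targets.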

Gradient Descent

To minimize the loss function as quickly as possible, think of finding the fastest way down a hill: each step should follow the steepest downward slope. That direction is the negative of the gradient, that is, of the partial derivatives of the loss function with respect to each parameter.

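So the update rule for each iteration is:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$$

Suppose there are n features, that is, variables xj (j = 1, ..., n). The update is then applied to every θj at the same time.

Here α is the learning rate of the algorithm. The larger α is, the larger each step and the faster the descent, but a value that is too large can make the iterates oscillate back and forth, so the algorithm fails to settle on an accurate solution.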
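The iteration can be sketched as batch gradient descent for linear regression. This assumes a squared-error loss averaged over the samples; the function name, the toy data, and the two learning rates are hypothetical choices for illustration.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Batch gradient descent for linear regression (squared-error loss)."""
    m, n = len(xs), len(xs[0])
    theta = [0.0] * (n + 1)  # theta[0] is the intercept
    for _ in range(iterations):
        grad = [0.0] * (n + 1)
        for x, y in zip(xs, ys):
            features = [1.0] + list(x)
            error = sum(t * f for t, f in zip(theta, features)) - y
            for j in range(n + 1):
                grad[j] += error * features[j] / m
        # Update all parameters simultaneously using the old theta.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical toy data generated by y = 1 + 2x.
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]

theta = gradient_descent(xs, ys)                            # moderate step size
diverged = gradient_descent(xs, ys, alpha=0.6, iterations=50)  # step too large
print(theta, diverged)
```

With the moderate step the parameters approach (1, 2); with the larger step each update overshoots the minimum by more than the previous one, which is the oscillation described above.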

Underfitting and Overfitting

Underfitting problem: there are too few features, so the model is too simple to fit the data adequately.

Overfitting problem: there are many features and the model is very complex. The hypothesis curve fits the original data very well but loses generality, so predictions on new samples are poor.
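The contrast can be illustrated with three models of increasing complexity fitted to the same small data set. This is only a sketch: the data (y = 2x with alternating noise of +1 and -1), the test points, and all function names are hypothetical, and the interpolating polynomial stands in for "a very complex model".

```python
def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_constant(xs, ys):
    """Too simple: always predicts the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Least-squares straight line (closed form for one feature)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolating_polynomial(xs, ys):
    """Too complex: Lagrange polynomial passing through every training point."""
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total
    return model


# Hypothetical training data: y = 2x plus noise +1, -1, +1, -1, +1, -1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [1.0, 1.0, 5.0, 5.0, 9.0, 9.0]
# Hypothetical new samples taken from the noise-free relationship y = 2x.
test_x = [0.5, 4.5]
test_y = [1.0, 9.0]

models = {
    "constant": fit_constant(train_x, train_y),
    "line": fit_line(train_x, train_y),
    "interpolating polynomial": fit_interpolating_polynomial(train_x, train_y),
}
train_err = {k: mean_squared_error(m, train_x, train_y) for k, m in models.items()}
test_err = {k: mean_squared_error(m, test_x, test_y) for k, m in models.items()}
for name in models:
    print(name, train_err[name], test_err[name])
```

The constant model has a large error everywhere (underfitting), while the polynomial reproduces the training data exactly yet does worse than the straight line on the new samples (overfitting).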

Regularization

The magnitude of the parameters is controlled by a regularization term added to the loss.

Regularization terms come in several forms; the most commonly used are:

L1 regularization: $\sum_j |\theta_j|$

L2 regularization: $\sum_j \theta_j^2$
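A short sketch of the two penalties follows. The weighting factor `lam` and the choice to leave the intercept θ0 unpenalized are assumptions (both are common conventions, not something stated in the text), and the parameter values are hypothetical.

```python
def l1_penalty(theta, lam):
    """lam * sum of |theta_j|, skipping the intercept theta[0] (assumed convention)."""
    return lam * sum(abs(t) for t in theta[1:])


def l2_penalty(theta, lam):
    """lam * sum of theta_j squared, skipping the intercept theta[0] (assumed convention)."""
    return lam * sum(t * t for t in theta[1:])


theta = [5.0, -3.0, 0.5]  # hypothetical parameters; theta[0] is the intercept
l1 = l1_penalty(theta, lam=2.0)
l2 = l2_penalty(theta, lam=2.0)
print(l1, l2)
```

Either value is added to the data loss, so larger parameter magnitudes make the total loss larger and are discouraged during minimization.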

Logistic Regression

When linear regression is used to solve a classification problem, the model is not robust: noisy samples can seriously distort the result.

We can make appropriate modifications to the linear regression algorithm to get the function we want.

Introduce the sigmoid function:

$$g(z) = \frac{1}{1 + e^{-z}}$$

The original function hθ(x) is rewritten as:

$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$

Looking at the graph of the sigmoid function y = g(x): when x is greater than 0, y is greater than 0.5, and when x is less than 0, y is less than 0.5. In this way the predicted value of the linear regression is compressed into the range 0 to 1.
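A minimal sketch of the sigmoid function g(z) = 1 / (1 + e^(-z)) is shown below. Splitting on the sign of z is an implementation choice made here to avoid overflow in `math.exp` for large negative inputs.

```python
import math


def sigmoid(z):
    """g(z) = 1 / (1 + exp(-z)), computed without overflowing for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the algebraically equivalent form exp(z) / (1 + exp(z)).
    e = math.exp(z)
    return e / (1.0 + e)


for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(z, sigmoid(z))
```

The output is exactly 0.5 at z = 0, above 0.5 for positive z, and below 0.5 for negative z.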

1. Linear decision boundary:

Suppose the linear function is:

$$\theta^T x = \theta_0 + \theta_1 x_1 + \theta_2 x_2$$

When θᵀx > 0, the value of g(θᵀx) is greater than 0.5;

When θᵀx < 0, the value of g(θᵀx) is less than 0.5.

The decision boundary is therefore the straight line θᵀx = 0.

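2. Non-linear decision boundary:

Suppose the function is:

$$h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2)$$

When θ0 = -1, θ1 = 0, θ2 = 0, θ3 = 1, θ4 = 1, we get the function g(x1² + x2² - 1). The boundary is the unit circle: for points inside the circle, x1² + x2² - 1 is less than 0, so the output is less than 0.5.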
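The circular boundary can be checked with a few lines of Python. The sketch assumes θ0 = -1, which is the value needed to obtain the stated expression x1² + x2² - 1; the function names and test points are hypothetical.

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Assumed parameters (theta0..theta4) giving the score x1^2 + x2^2 - 1.
THETA = (-1.0, 0.0, 0.0, 1.0, 1.0)


def boundary_score(x1, x2):
    """theta0 + theta1*x1 + theta2*x2 + theta3*x1^2 + theta4*x2^2."""
    t0, t1, t2, t3, t4 = THETA
    return t0 + t1 * x1 + t2 * x2 + t3 * x1 ** 2 + t4 * x2 ** 2


def predict(x1, x2):
    """Class 1 when g(score) >= 0.5, i.e. on or outside the unit circle."""
    return 1 if sigmoid(boundary_score(x1, x2)) >= 0.5 else 0


print(predict(0.0, 0.0))  # centre of the circle
print(predict(2.0, 0.0))  # well outside the circle
```

Although the boundary is a circle in the (x1, x2) plane, the model is still linear in the parameters θ, because the squared features are simply treated as extra inputs.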

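Suppose we define the loss function as the squared error used for linear regression:

$$J(\theta) = \frac{1}{2M} \sum_{i=1}^{M} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

With the sigmoid inside hθ(x), this function is non-convex in θ and has local minima, so a different loss function should be chosen.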

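Define the loss function as:

$$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$$

From the shape of this function we can see: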

For a positive sample (y = 1), when hθ(x) is close to 1 (0.99...9) we want the cost to be small, and when the predicted value is close to 0 (0.00...1) we want the cost to be large;

For a negative sample (y = 0), when hθ(x) is close to 0 (0.00...1) we want the cost to be small, and when the predicted value is close to 1 (0.99...9) we want the cost to be large.
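The behaviour described for positive and negative samples can be checked numerically with the logarithmic cost, -log(h) for y = 1 and -log(1 - h) for y = 0. This is a small sketch; the function name and the probe values 0.99 and 0.01 are chosen only for illustration.

```python
import math


def sample_cost(h, y):
    """Per-sample logistic cost for a predicted probability h in (0, 1)."""
    if y == 1:
        return -math.log(h)
    return -math.log(1.0 - h)


for y in (1, 0):
    for h in (0.99, 0.01):
        print("y =", y, "h =", h, "cost =", sample_cost(h, y))
```

A confident correct prediction costs about 0.01, while a confident wrong prediction costs about 4.6, and the cost grows without bound as the wrong prediction approaches certainty.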

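The loss function can be rewritten as a single expression:

$$J(\theta) = -\frac{1}{M} \sum_{i=1}^{M} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$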

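Adding the regularization term (shown here in its L2 form):

$$J(\theta) = -\frac{1}{M} \sum_{i=1}^{M} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right] + \frac{\lambda}{2M} \sum_{j=1}^{n} \theta_j^2$$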
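Putting the pieces together, the following sketch trains a logistic regression with gradient descent on the regularized loss. It assumes an L2 penalty of the form λ/(2M)·Σθj² that leaves the intercept unpenalized; the function names, the toy data, and the values of `lam`, `alpha`, and `iterations` are hypothetical.

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hypothesis(theta, x):
    return sigmoid(theta[0] + sum(t * xj for t, xj in zip(theta[1:], x)))


def regularized_loss(theta, xs, ys, lam):
    """Average log loss plus an (assumed) L2 penalty lam/(2M) * sum(theta_j^2)."""
    m = len(xs)
    data_term = 0.0
    for x, y in zip(xs, ys):
        h = hypothesis(theta, x)
        data_term += -(y * math.log(h) + (1 - y) * math.log(1.0 - h))
    penalty = lam / (2 * m) * sum(t * t for t in theta[1:])
    return data_term / m + penalty


def train(xs, ys, lam=0.1, alpha=0.5, iterations=500):
    m, n = len(xs), len(xs[0])
    theta = [0.0] * (n + 1)
    for _ in range(iterations):
        grad = [0.0] * (n + 1)
        for x, y in zip(xs, ys):
            error = hypothesis(theta, x) - y
            grad[0] += error / m
            for j in range(n):
                grad[j + 1] += error * x[j] / m
        # Gradient of the penalty; the intercept theta[0] is not penalized.
        for j in range(1, n + 1):
            grad[j] += lam / m * theta[j]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical one-feature data: negative x is class 0, positive x is class 1.
xs = [[-2.0], [-1.0], [1.0], [2.0]]
ys = [0, 0, 1, 1]

theta = train(xs, ys)
initial_loss = regularized_loss([0.0, 0.0], xs, ys, lam=0.1)
final_loss = regularized_loss(theta, xs, ys, lam=0.1)
predictions = [1 if hypothesis(theta, x) >= 0.5 else 0 for x in xs]
print(theta, initial_loss, final_loss, predictions)
```

Without the penalty, the weight on perfectly separable data like this would keep growing; the regularization term is what keeps it bounded.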

Binary Classification and Multi-class Classification

One vs One

One vs Rest

Method One (One vs One):

1. First classify the triangles against the crosses to get classifier C1 and the probability values PC1(x) and 1-PC1(x).

2. Then classify the triangles against the squares to get classifier C2 and the probability values PC2(x) and 1-PC2(x).

3. Finally classify the squares against the crosses to get classifier C3 and the probability values PC3(x) and 1-PC3(x).

This gives 3 classifiers and 6 probability values; the class with the largest probability value is taken as the prediction.
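The decision step of Method One can be sketched as below. The three pairwise classifiers are not trained here: their parameters are hypothetical stand-ins, chosen as if triangles lay near (0, 0), squares near (4, 0), and crosses near (0, 4).

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def one_vs_one_predict(x1, x2):
    # Hypothetical, hand-picked pairwise classifiers (not learned from data).
    p_c1 = sigmoid(2.0 - x2)  # C1: triangle vs cross, probability of triangle
    p_c2 = sigmoid(2.0 - x1)  # C2: triangle vs square, probability of triangle
    p_c3 = sigmoid(x1 - x2)   # C3: square vs cross, probability of square
    # Three classifiers give six probability values.
    candidates = [
        (p_c1, "triangle"), (1.0 - p_c1, "cross"),
        (p_c2, "triangle"), (1.0 - p_c2, "square"),
        (p_c3, "square"), (1.0 - p_c3, "cross"),
    ]
    # Pick the class attached to the largest of the six values.
    best_probability, best_label = max(candidates)
    return best_label


print(one_vs_one_predict(0.0, 0.0))
print(one_vs_one_predict(4.0, 0.0))
print(one_vs_one_predict(0.0, 4.0))
```

Taking the single largest value follows the rule stated above; majority voting across the pairwise classifiers is another common way to combine them.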

Method Two (One vs Rest):

1. First classify triangle against everything else, that is, decide whether a sample is a triangle, to get classifier C1 and the probability value PC1(x).

2. Then classify square against everything else, that is, decide whether a sample is a square, to get classifier C2 and the probability value PC2(x).

3. Finally classify cross against everything else, that is, decide whether a sample is a cross, to get classifier C3 and the probability value PC3(x).

This gives 3 classifiers and 3 probability values; the class with the largest probability value is taken as the prediction.
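Method Two reduces to an argmax over three "is it this class?" probabilities. In the sketch below the three classifiers are again hypothetical and hand-picked rather than trained, assuming triangles near (0, 0), squares near (4, 0), and crosses near (0, 4).

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def one_vs_rest_predict(x1, x2):
    # Hypothetical, hand-picked classifiers (not learned from data).
    probabilities = {
        "triangle": sigmoid(2.0 - x1 - x2),  # C1: triangle or not
        "square": sigmoid(x1 - 2.0),         # C2: square or not
        "cross": sigmoid(x2 - 2.0),          # C3: cross or not
    }
    # The class whose classifier is most confident wins.
    return max(probabilities, key=probabilities.get)


print(one_vs_rest_predict(0.0, 0.0))
print(one_vs_rest_predict(4.0, 0.0))
print(one_vs_rest_predict(0.0, 4.0))
```

For K classes this approach needs K classifiers, whereas the pairwise approach needs K(K-1)/2, which is the main practical difference between the two methods.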

Application:

Cond.....