Introduction to machine learning one-dimensional linear regression


Notes from finishing Week 1 of Andrew Ng's Machine Learning course.

Directory:

    • What is machine learning
    • Supervised learning
    • Unsupervised learning
    • One-dimensional linear regression
      • Model representation
      • Loss function
      • Gradient Descent algorithm

1. What is machine learning

Arthur Samuel was not a master checkers player, but he wrote a program and played checkers against it every day. Eventually the program became strong enough to beat very good players. Samuel therefore gave machine learning an older, less formal definition:

"The field of study that gives the computer the ability to learn without being explicitly programmed"

A modern, more formal definition is:

" A computer program was said to learn from experience E with respect to some class of tasks T and performance measure P, if its perfermance at the tasks in T as measured by P, improves with experience E "

That is, the program learns from the experience E gained by performing the tasks T, with P measuring how well those tasks are done; the goal of learning is to use the experience E to perform the tasks T better, as judged by the standard P.

In Arthur Samuel's checkers example above:

E: the experience gained from the program playing many games of checkers against Arthur Samuel;

T: playing checkers;

P: the probability that the program wins the next game.

Machine learning problems can be divided into two categories: "Supervised learning" and "unsupervised learning".

2. Supervised learning

"Given data set and already know what we correct output should look like"

For the relationship between input and output we've got almost one idea.

"Regression" and "Classification"

Regression: The result is sequential, map input to some continuous function (e.g., forecast rate)

Classification: Results are discrete, map input to some discrete function (e.g., predicting whether a house price is greater than a certain value)

3. Unsupervised learning

"Approach problems with little or no ideal, what we result should look like"

For the relationship between input and output, we don't have a concept

"Clustering" and "non-clustering"

Clustering: given 1,000,000 different genes, automatically group together the genes that are related by variables such as lifespan, height, and so on.

Non-clustering: the "cocktail party" algorithm, which finds structure in a chaotic environment (e.g., identifying an individual voice or the background music within the mixed sounds of a cocktail party).

4. One-dimensional linear regression

Model representation

$x^{(i)}$: the input variable;

$y^{(i)}$: the output variable;

$(x^{(i)}, y^{(i)})$: one training example;

$(x^{(i)}, y^{(i)});\ i = 1, \dots, m$: the training set;

$X = Y = \mathbb{R}$: the input space and the output space, which here are the same (the set of real numbers);

$h_\theta(x) = \theta_0 + \theta_1 x$: the hypothesis.

For a supervised learning problem: given the training set of pairs $(x, y)$, we learn a function $h: X \rightarrow Y$ so that $h(x)$ is a good predictor of the corresponding value of $y$.
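
As a minimal sketch (not part of the original post), the hypothesis above can be written as a small Python function; the names `h`, `theta0`, and `theta1` are purely illustrative:

```python
import numpy as np

def h(theta0, theta1, x):
    """Univariate linear hypothesis: h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * np.asarray(x, dtype=float)

# With theta0 = 0 and theta1 = 1, the prediction simply equals the input.
print(h(0.0, 1.0, [1.0, 2.0, 3.0]))  # -> [1. 2. 3.]
```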

Loss function

To measure the accuracy of $h(x)$, we use the average of the squared differences between $h(x^{(i)})$ and $y^{(i)}$:

$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

This function is called the squared error loss function (mean squared error). It is the loss function most commonly used for regression problems, and it can also appear in some non-regression problems.

Here $\sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$ is the sum of the squared errors, and the factor $\frac{1}{2}$ is included to make the later derivative cleaner.

Our goal is to find the $\theta_0$ and $\theta_1$ that minimize the loss function.
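
A minimal Python sketch of this loss function, using the notation above (the helper name `cost_J` is hypothetical, not from the course):

```python
import numpy as np

def cost_J(theta0, theta1, xs, ys):
    """Squared error loss: J = 1/(2m) * sum_i (h_theta(x_i) - y_i)^2."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    m = len(xs)
    errors = (theta0 + theta1 * xs) - ys  # h_theta(x^(i)) - y^(i)
    return np.sum(errors ** 2) / (2 * m)
```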

Loss function visualization 1

To visualize the loss function below, for convenience we fix $\theta_0 = 0$, so that $J$ depends only on $\theta_1$.

When $\theta_1 = 1$, $J(\theta_1) = 0$: the green cross in the right-hand figure;

When $\theta_1 = 0.5$, $J(\theta_1)$ is a small positive value: roughly at the blue cross in the right-hand figure;

When $\theta_1 = 0$, $J(\theta_1)$ is a little above 2: roughly at the black cross on the vertical axis of the right-hand figure;

From these three points we can see the rough shape of $J(\theta_1)$ in the right-hand figure: it reaches its minimum at $\theta_1 = 1$, decreasing to the left of that point and increasing to the right.
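
The three values above are consistent with the course's toy training set $(1,1), (2,2), (3,3)$; assuming that data set (an assumption, since the figure is not reproduced here), they can be checked numerically:

```python
import numpy as np

# Assumed toy training set from the course figure: (1,1), (2,2), (3,3).
xs = np.array([1.0, 2.0, 3.0])
ys = np.array([1.0, 2.0, 3.0])
m = len(xs)

for theta1 in (1.0, 0.5, 0.0):
    J = np.sum((theta1 * xs - ys) ** 2) / (2 * m)  # theta_0 is fixed at 0
    print(f"theta1 = {theta1}: J = {J:.2f}")
# theta1 = 1.0 -> J = 0.00
# theta1 = 0.5 -> J = 0.58
# theta1 = 0.0 -> J = 2.33
```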

Loss function visualization 2

For the simple loss function above (a single parameter), we can draw it as a two-dimensional plot, which is easy to understand. But as the number of parameters grows, the picture becomes hard to draw; with two parameters the loss surface is already three-dimensional.

In this case, a contour plot is usually used to represent the loss function:

For the training data above, when $\theta_0 = 0$ and $\theta_1 = 360$, $J(\theta_0, \theta_1)$ sits at the red cross in the contour plot;

When $\theta_0$ and $\theta_1$ are as shown on the left, $J(\theta_0, \theta_1)$ sits at the green cross in the contour plot;

When $\theta_0$ and $\theta_1$ are as shown on the left, $J(\theta_0, \theta_1)$ sits at the blue cross in the contour plot, i.e., near the optimal solution, roughly at the center of the contour lines.
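
As a rough illustration (not from the original post), a contour plot of $J(\theta_0, \theta_1)$ can be drawn with matplotlib; the training data below is made up purely for the sketch:

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up training data, just to give the contour plot a shape.
xs = np.array([1.0, 2.0, 3.0, 4.0])
ys = np.array([1.5, 2.5, 3.5, 4.5])
m = len(xs)

# Evaluate J(theta0, theta1) on a grid of parameter values.
theta0_vals = np.linspace(-2.0, 4.0, 100)
theta1_vals = np.linspace(-1.0, 3.0, 100)
T0, T1 = np.meshgrid(theta0_vals, theta1_vals)
J = np.zeros_like(T0)
for x, y in zip(xs, ys):
    J += ((T0 + T1 * x) - y) ** 2
J /= 2 * m

plt.contour(T0, T1, J, levels=30)
plt.xlabel(r"$\theta_0$")
plt.ylabel(r"$\theta_1$")
plt.title(r"Contours of $J(\theta_0, \theta_1)$")
plt.show()
```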

Gradient Descent algorithm

So how do we find the optimal solution? The gradient descent algorithm is one method; see the earlier post on gradient descent.
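
The details are deferred to that earlier post; as a hedged sketch, the standard batch gradient descent update for this loss function looks like the following (the learning rate `alpha` and iteration count are illustrative defaults, not values from the course):

```python
import numpy as np

def gradient_descent(xs, ys, alpha=0.01, iters=5000):
    """Batch gradient descent for J(theta0, theta1) of univariate linear regression."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        errors = (theta0 + theta1 * xs) - ys      # h_theta(x^(i)) - y^(i)
        grad0 = errors.sum() / m                  # dJ/dtheta0
        grad1 = (errors * xs).sum() / m           # dJ/dtheta1
        # Update both parameters simultaneously.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

print(gradient_descent([1, 2, 3], [1, 2, 3]))  # converges towards (0.0, 1.0)
```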

