First, you must understand what linear regression is:
Linear: linearity means that Y changes in proportion to X, i.e., the relationship between them is a straight line.
Regression: the study of the relationship between several variables, especially the case where the dependent variable and the independent variables are linearly related; it is a special kind of linear model. The simplest case is one independent variable and one dependent variable whose relationship is roughly linear. This is called simple linear regression, that is, the model is y = a + bx + ε. Here x is the independent variable, y is the dependent variable, and ε is a random error. It is generally assumed that the random error has mean 0 and variance σ² (σ² > 0), and that σ² does not depend on the value of x. If we further assume that the random error follows a normal distribution, the model is called a normal linear model.
Linear regression: therefore, we can think of linear regression as being given a series of points and fitting the line h(x) = θ₀ + θ₁x. (Linear and nonlinear regression actually mean the same thing: both look for appropriate parameters that fit the patterns in the existing data. The fitted equation (model) is generally used for interpolation, i.e., prediction within a small error range.)
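As a concrete illustration, here is a minimal sketch (using NumPy, with synthetic data made up purely for this example) that fits h(x) = θ₀ + θ₁x to noisy points by least squares:

```python
import numpy as np

# A minimal sketch: fit h(x) = theta0 + theta1 * x to noisy points
# via the closed-form least-squares solution. The data are synthetic.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0, 1.0, size=x.shape)  # y = a + b*x + noise

X = np.column_stack([np.ones_like(x), x])        # design matrix [1, x]
theta, *_ = np.linalg.lstsq(X, y, rcond=None)    # solves min ||X theta - y||^2
print("theta0 (intercept), theta1 (slope):", theta)
```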
Likelihood function: I understand it this way. Suppose we know the form of the probability density function of X, but it contains an unknown parameter (θ) that we want to estimate. We can then take many observed values and multiply their probability density values together; this product is the likelihood function.
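In symbols: for independent observations x₁, …, xₙ drawn from a density f(x; θ) with unknown parameter θ, the likelihood is the product of the densities evaluated at the data:

$$L(\theta) = \prod_{i=1}^{n} f(x_i;\theta)$$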
Maximum likelihood estimation: after writing down the likelihood function, we need to find the unknown parameter. We choose the parameter value that maximizes the likelihood function, that is, the value under which the observed data are most probable.
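A minimal sketch of this idea, assuming the samples come from a unit-variance normal N(μ, 1) and using a crude grid search over the unknown μ (the data and grid are made up for illustration):

```python
import numpy as np

# Maximum likelihood sketch: find the mu that maximizes the log-likelihood.
# Maximizing the product of densities = maximizing the sum of log-densities.
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=1.0, size=200)   # synthetic data, true mu = 5

def log_likelihood(mu, x):
    # sum of log N(x_i; mu, 1) densities, dropping the additive constant
    return -0.5 * np.sum((x - mu) ** 2)

candidates = np.linspace(0, 10, 1001)             # crude grid search over mu
best_mu = candidates[np.argmax([log_likelihood(m, data) for m in candidates])]
print("MLE of mu:", best_mu, "sample mean:", data.mean())  # these agree
```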
Analysis process for Linear Regression Problems:
A function model is proposed. The model has several unknown parameters, and we can substitute in many observations. However, the resulting system of equations is hard to solve exactly, so we look for an approximate solution by converting the problem into one of minimizing the error. After writing down the error term, we use gradient descent or Newton's method to find its minimum and thereby determine the unknown parameters.
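Here is a minimal gradient-descent sketch for the linear regression case; the learning rate and iteration count are illustrative choices, not tuned values:

```python
import numpy as np

# Minimize J(theta) = (1/2n) * ||X theta - y||^2 by gradient descent
# instead of solving the equations exactly. Synthetic data for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 100)
y = 1.0 + 2.0 * x + rng.normal(0, 0.3, size=x.shape)

X = np.column_stack([np.ones_like(x), x])
theta = np.zeros(2)
lr = 0.01
for _ in range(5000):
    grad = X.T @ (X @ theta - y) / len(y)   # gradient of J(theta)
    theta -= lr * grad
print("theta after gradient descent:", theta)
```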
Logistic/sigmoid regression model:
By applying a specific function, the sigmoid g(z) = 1 / (1 + e^(-z)), the linear regression problem is transformed into a classification problem: this function squashes the value of y into the range 0 to 1, which can be read as a class probability.
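A minimal sketch of this squashing step (the θ values are made up purely for illustration):

```python
import numpy as np

# The sigmoid turns a linear score into a value in (0, 1),
# which can then be read as the probability of the positive class.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta0, theta1 = -4.0, 2.0            # illustrative parameters, not fitted
for x in [0.0, 1.0, 2.0, 3.0, 4.0]:
    p = sigmoid(theta0 + theta1 * x)  # linear score, then squash into (0, 1)
    label = 1 if p >= 0.5 else 0
    print(f"x={x}: P(y=1)={p:.3f} -> class {label}")
```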
Expected risk (true risk): it can be understood as the average loss over the whole data distribution, i.e., the "average" degree of error when the model function is fixed. The expected risk depends on the loss function and on the probability distribution.
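In symbols, with loss function L and joint distribution P(x, y), the expected risk of a fixed model f is

$$R_{\exp}(f) = E_{(X,Y)\sim P}\big[L(Y, f(X))\big] = \int L(y, f(x))\,dP(x, y)$$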
The expected risk cannot be computed from the samples alone, because the true probability distribution is unknown.
Therefore, the empirical risk is used to estimate the expected risk, and the learning algorithm is designed to minimize it. This is empirical risk minimization (ERM), which is evaluated and computed through the loss function.
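In symbols, the empirical risk is the average loss over the n training samples, and ERM picks the model that minimizes it:

$$R_{\mathrm{emp}}(f) = \frac{1}{n}\sum_{i=1}^{n} L(y_i, f(x_i)), \qquad \hat{f} = \arg\min_{f} R_{\mathrm{emp}}(f)$$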
For classification problems, the empirical risk is the error rate on the training sample.
For function approximation and fitting problems, the empirical risk is the mean squared training error.
For probability density estimation, ERM reduces to the maximum likelihood estimation method. (All three cases are illustrated in the sketch below.)
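The three cases differ only in the loss plugged into the empirical risk. A minimal sketch, with all arrays made up as toy data for illustration:

```python
import numpy as np

# Empirical risk = average loss over the training sample; only the loss
# function changes between the three cases above.

# 1) Classification: 0-1 loss -> empirical risk is the training error rate.
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])
error_rate = np.mean(y_true != y_pred)

# 2) Fitting / function approximation: squared loss -> mean squared error.
t_true = np.array([1.0, 2.0, 3.0])
t_pred = np.array([1.1, 1.8, 3.2])
mse = np.mean((t_true - t_pred) ** 2)

# 3) Density estimation: negative log-likelihood loss -> minimizing the
#    empirical risk is exactly maximum likelihood estimation.
densities = np.array([0.40, 0.35, 0.25])   # model density at each sample
nll = -np.mean(np.log(densities))

print(f"error rate={error_rate:.2f}, MSE={mse:.3f}, NLL={nll:.3f}")
```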