Introduction
The regularization term can take different forms. In a regression problem, for example, the loss function is the squared loss, and the regularization term can be the L2 norm of the parameter vector:

$$L(w) = \frac{1}{N}\sum_{i=1}^{N}\left(f(x_i; w) - y_i\right)^2 + \frac{\lambda}{2}\|w\|_2^2$$

Here $\|w\|_2$ denotes the L2 norm of the parameter vector $w$.
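As a concrete illustration (a minimal sketch, not part of the original text; the function name and the placement of the $\lambda/2$ factor are my own choices, and conventions for the penalty's scaling vary), the L2-regularized squared loss can be computed as:

```python
import numpy as np

def l2_regularized_loss(X, y, w, lam):
    """Squared loss plus an L2 (ridge-style) penalty (lam / 2) * ||w||_2^2."""
    residual = X @ w - y
    return np.sum(residual ** 2) + 0.5 * lam * np.sum(w ** 2)

X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 2.0])
w = np.array([1.0, 2.0])
# The residual X @ w - y is zero here, so the loss is purely the
# penalty term: 0.5 * 0.1 * (1^2 + 2^2) = 0.25
print(l2_regularized_loss(X, y, w, 0.1))
```

The same skeleton applies to the L1 case below: only the penalty term changes.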
The regularization term can also be the L1 norm of the parameter vector:

$$L(w) = \frac{1}{N}\sum_{i=1}^{N}\left(f(x_i; w) - y_i\right)^2 + \lambda\|w\|_1$$

Here $\|w\|_1$ denotes the L1 norm of the parameter vector $w$.

Definitions of L1 and L2
The L1 norm is the sum of the absolute values of the parameters, and the L2 norm is the square root of the sum of their squares. At a deeper level, L1 regularization pursues sparsity: it drives many parameters to exactly zero, which can be understood as selecting a subset of variables. L2 regularization is mainly used to combat overfitting by keeping every weight parameter small. L2 regularization can also speed up training.
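The contrast between L1 sparsity and L2 shrinkage shows up clearly in the one-dimensional closed-form solutions of the penalized problems (this sketch is my own illustration, not from the original text; the function names are hypothetical). Minimizing $\frac{1}{2}(w-a)^2 + \lambda|w|$ gives the soft-threshold operator, which snaps small values to exactly zero, while minimizing $\frac{1}{2}(w-a)^2 + \lambda w^2$ only shrinks every value uniformly:

```python
import numpy as np

def prox_l1(a, lam):
    """Minimizer of 0.5*(w - a)^2 + lam*|w|: the soft-threshold operator."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def prox_l2(a, lam):
    """Minimizer of 0.5*(w - a)^2 + lam*w^2: uniform shrinkage."""
    return a / (1.0 + 2.0 * lam)

a = np.array([0.05, -0.3, 1.0])
# L1: the entry with |a| <= lam becomes exactly 0 -> a sparse solution.
print(prox_l1(a, 0.1))
# L2: every entry shrinks toward 0, but none becomes exactly 0.
print(prox_l2(a, 0.1))
```

This is why L1 can be read as "choosing which variables to keep" while L2 simply keeps all weights small.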
Quoted:
The L0 norm of a vector is the number of its non-zero elements. If we regularize a parameter matrix W with the L0 norm, we are hoping that most elements of W are zero. This is intuitive and explicit: in other words, we want the parameters w to be sparse. Seeing the word "sparse", readers will likely think of "compressed sensing" and "sparse coding" — the sparsity there is achieved through exactly this idea.
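Counting the non-zero elements of a parameter vector is a one-liner (a minimal illustration of the L0 "norm" definition above, not from the original text):

```python
import numpy as np

# The L0 "norm" of a vector is simply its number of non-zero entries.
w = np.array([0.0, 3.0, 0.0, -1.5, 0.0])
print(np.count_nonzero(w))  # 2: only two parameters are active; the rest are exactly zero
```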