"Original" Deep Neural Networks (DNN)

Source: Internet
Author: User
Tags: dnn

A linear model expresses the relationship between the result and the feature set through a linear combination of features. Because a linear model has limited expressive power, in practice it can only be improved by making the feature computation more complex. For example, in ad CTR estimation, besides simple primitive features such as title length, description length, ad position, ad ID, and cookie, there are many combination features (for example, "position-cookie" captures a user's preference for particular ad positions). In fact, many search engine advertising systems today use the (linear) logistic regression model, and one of the model team's most important tasks is feature engineering.
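
To make the "combination feature" idea concrete, here is a minimal Python sketch of a logistic regression scorer with one crossed feature. The field names (`title_length`, `position`, `cookie`), the scaling of the title length, and the helper names are all hypothetical and chosen only for illustration; a production CTR system would look quite different.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def cross(name_a, value_a, name_b, value_b):
    """Build the key of a combination feature from two primitive values."""
    return f"{name_a}={value_a}&{name_b}={value_b}"


def featurize(ad):
    """Map one ad impression (a dict, hypothetical schema) to sparse features."""
    feats = {
        "bias": 1.0,
        "title_length": ad["title_length"] / 100.0,  # assumed scaling
        "position=" + str(ad["position"]): 1.0,       # one-hot primitive feature
    }
    # Combination feature: this particular user (cookie) at this ad position.
    feats[cross("position", ad["position"], "cookie", ad["cookie"])] = 1.0
    return feats


def predict_ctr(weights, feats):
    """Linear score over the features, squashed by the logistic function."""
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return sigmoid(z)
```

The model itself stays linear in the features; any non-linearity (here, the interaction between user and position) has to be supplied by hand through `cross`, which is exactly the feature engineering effort described above.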

The linear-model approach is "simple model + complex features": this combination is used to describe complex nonlinear scenarios. Because the model structure is simple, training and prediction are relatively cheap, but feature selection is labor-intensive and requires the people involved to understand the business deeply.

Another approach to modeling is "complex model + simple features": reduce the reliance on feature engineering and use a complex nonlinear model to learn the relationships between features, which increases expressive power. A deep neural network is such a nonlinear model.

Consider a deep neural network with an input layer, an output layer, and two hidden layers. The model has 9 nodes.
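
A forward pass through such a network can be sketched in a few lines of Python. The text fixes only the number of layers and the total of 9 nodes, so the layout below is an assumption: nodes 0 to 3 are inputs, nodes 4 and 5 form the first hidden layer, nodes 6 and 7 the second, and node 8 is the output. Sigmoid activations and the absence of bias terms are also assumptions, and `forward` and `init_weights` are hypothetical helper names.

```python
import math

# Assumed node numbering: 4 inputs, two hidden layers of 2 nodes, 1 output.
LAYERS = [[0, 1, 2, 3], [4, 5], [6, 7], [8]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(value=0.1):
    """One weight w[(i, j)] for every edge between consecutive layers."""
    return {
        (i, j): value
        for prev, layer in zip(LAYERS, LAYERS[1:])
        for i in prev
        for j in layer
    }


def forward(weights, inputs):
    """Return the output o[j] of every node, keyed by node index."""
    out = dict(zip(LAYERS[0], inputs))
    for prev, layer in zip(LAYERS, LAYERS[1:]):
        for j in layer:
            net = sum(weights[(i, j)] * out[i] for i in prev)
            out[j] = sigmoid(net)
    return out
```

With this numbering, a weight such as `weights[(0, 4)]` connects input node 0 to node 4 in the first hidden layer.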

Neural networks are introduced in detail in much of the literature. Here, using the network above as an example, we focus on deriving the backpropagation algorithm.

Backpropagation is very similar to the gradient method: in essence, it computes the partial derivative with respect to each parameter and then moves to the next search point along the negative direction of that partial derivative. Taking ${w_{04}}$ as an example:
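
As a hedged sketch of the chain-rule steps, under assumptions the text does not state: nodes 0 to 3 are inputs, nodes 4 and 5 form the first hidden layer, nodes 6 and 7 the second, and node 8 is the output; each non-input node computes $net_j = \sum_i w_{ij} o_i$ and $o_j = 1/(1 + e^{-net_j})$; and the error for a target $t$ is $E = \frac{1}{2}(t - o_8)^2$. Writing $\delta_j = \partial E / \partial net_j$, the derivative with respect to ${w_{04}}$ factors as

$$\frac{\partial E}{\partial w_{04}} = \frac{\partial E}{\partial net_4} \cdot \frac{\partial net_4}{\partial w_{04}} = \delta_4 \, o_0$$

and the $\delta$ terms are obtained from the output layer backwards:

$$\delta_8 = -(t - o_8)\, o_8 (1 - o_8)$$

$$\delta_6 = o_6 (1 - o_6)\, w_{68}\, \delta_8, \qquad \delta_7 = o_7 (1 - o_7)\, w_{78}\, \delta_8$$

$$\delta_4 = o_4 (1 - o_4)\, (w_{46}\, \delta_6 + w_{47}\, \delta_7)$$

With a different topology, activation, or loss, the same pattern applies but the individual factors change.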

Combining the steps of the derivation above gives the gradient direction for ${w_{04}}$:
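
A minimal Python sketch of what the combined gradient could look like, checked against a finite-difference estimate. Everything specific here is an assumption rather than something the text states: the node layout (inputs 0 to 3, hidden layers 4-5 and 6-7, output 8), sigmoid activations without bias terms, the squared-error loss, and all function names.

```python
import math

# Assumed node numbering: 4 inputs, two hidden layers of 2 nodes, 1 output.
LAYERS = [[0, 1, 2, 3], [4, 5], [6, 7], [8]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def example_weights():
    """Deterministic, arbitrary weights used only for the demonstration."""
    return {
        (i, j): 0.1 * (i + 1) - 0.05 * j
        for prev, layer in zip(LAYERS, LAYERS[1:])
        for i in prev
        for j in layer
    }


def forward(w, x):
    """Output o[j] of every node for input vector x."""
    o = dict(zip(LAYERS[0], x))
    for prev, layer in zip(LAYERS, LAYERS[1:]):
        for j in layer:
            o[j] = sigmoid(sum(w[(i, j)] * o[i] for i in prev))
    return o


def loss(w, x, t):
    """Assumed squared error E = 1/2 * (t - o8)^2."""
    return 0.5 * (t - forward(w, x)[8]) ** 2


def grad_w04(w, x, t):
    """dE/dw04 = o0 * delta4, with the deltas propagated back from node 8."""
    o = forward(w, x)
    d8 = -(t - o[8]) * o[8] * (1 - o[8])
    d6 = o[6] * (1 - o[6]) * w[(6, 8)] * d8
    d7 = o[7] * (1 - o[7]) * w[(7, 8)] * d8
    d4 = o[4] * (1 - o[4]) * (w[(4, 6)] * d6 + w[(4, 7)] * d7)
    return o[0] * d4


def numeric_grad_w04(w, x, t, eps=1e-6):
    """Central finite difference, used only to sanity-check grad_w04."""
    plus, minus = dict(w), dict(w)
    plus[(0, 4)] += eps
    minus[(0, 4)] -= eps
    return (loss(plus, x, t) - loss(minus, x, t)) / (2 * eps)
```

Comparing `grad_w04` with `numeric_grad_w04` on a few inputs is a cheap way to catch sign or indexing mistakes in a hand-derived gradient.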

The rest of the iteration differs little from ordinary gradient descent.
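
For reference, the plain gradient descent loop can be sketched as follows. The function name, the fixed learning rate, and the fixed step count are illustrative assumptions; practical training usually adds mini-batches, stopping criteria, and learning-rate schedules.

```python
def gradient_descent(grad_fn, params, learning_rate=0.1, steps=100):
    """Repeatedly step each parameter against its partial derivative.

    grad_fn takes the current parameter list and returns a list of partial
    derivatives in the same order.
    """
    params = list(params)
    for _ in range(steps):
        grads = grad_fn(params)
        params = [p - learning_rate * g for p, g in zip(params, grads)]
    return params
```

In a network, `grad_fn` would return the backpropagated partial derivative for every weight, and the update rule for each weight is the same subtraction.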

It is worth noting that although a DNN places relatively low demands on feature engineering, its training time complexity is higher and its weights are very hard to interpret, which makes it difficult to debug. For a new application, therefore, a better approach is to start with a linear model such as logistic regression and to try a DNN only once iteration on that model has matured.
