Neural Networks for Machine Learning by Geoffrey Hinton


Problems that machine learning solves well

    • Pattern recognition
    • Anomaly detection
    • Prediction

How the brain works

The human brain has on the order of 10^11 neurons, each with roughly 10^4 weights, and that massive parallelism gives it much better bandwidth than a workstation.


Different types of neurons

Linear neurons
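The output is simply the bias plus the weighted sum of the inputs:

\[
y = b + \sum_{i=1}^{n} x_i w_i
\]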



Binary threshold neurons

\[
z = b + \sum_{i=1}^{n} x_i w_i, \qquad
y = \begin{cases} 1 & \text{if } z \ge 0 \\ 0 & \text{otherwise} \end{cases}, \qquad
\theta = -b
\]


ReLU (rectified linear unit) neurons

\[
z = b + \sum_{i=1}^{n} x_i w_i, \qquad
y = \begin{cases} z & \text{if } z > 0 \\ 0 & \text{otherwise} \end{cases}
\]




Sigmoid neurons

\[
z = b + \sum_{i=1}^{n} x_i w_i, \qquad
y = \frac{1}{1 + e^{-z}}
\]


Stochastic binary neurons

\[
z = b + \sum_{i=1}^{n} x_i w_i, \qquad
p(s = 1) = \frac{1}{1 + e^{-z}}
\]

These use the same equations as sigmoid neurons, but treat the output as the probability of emitting a 1 (a spike) rather than as a real-valued output.
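To make the five rules concrete, here is a minimal NumPy sketch (my own illustration; the function and variable names are not from the course):

    import numpy as np

    def weighted_input(x, w, b):
        # z = b + sum_i x_i * w_i, common to all the neuron types above
        return b + x @ w

    def linear(x, w, b):
        # Output is the weighted input itself.
        return weighted_input(x, w, b)

    def binary_threshold(x, w, b):
        # Fires (outputs 1) iff z >= 0, i.e. iff the weighted sum reaches theta = -b.
        return 1.0 if weighted_input(x, w, b) >= 0 else 0.0

    def relu(x, w, b):
        # Linear above zero, silent below.
        return max(weighted_input(x, w, b), 0.0)

    def sigmoid(x, w, b):
        # Smoothly squashes z into (0, 1).
        return 1.0 / (1.0 + np.exp(-weighted_input(x, w, b)))

    def stochastic_binary(x, w, b, rng=np.random.default_rng()):
        # Same z as the sigmoid neuron, but the sigmoid value is used as the
        # probability of emitting a 1.
        return float(rng.random() < sigmoid(x, w, b))

    x, w, b = np.array([1.0, 0.5]), np.array([0.3, -0.2]), 0.1
    print(linear(x, w, b), binary_threshold(x, w, b), relu(x, w, b), sigmoid(x, w, b))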


Different types of learning tasks

Supervised learning

Given an input vector, learn to predict an output vector.

Examples: regression (real-valued targets) and classification (class-label targets).
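For regression, the usual measure of prediction error to minimize is the squared difference between the target t and the prediction y:

\[
E = \tfrac{1}{2}\,(t - y)^2
\]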


Reinforcement learning

Learn to select actions so as to maximize payoff.

The output is an action, or a sequence of actions. The only supervisory signal is an occasional scalar reward.

The difficulty is that the reward is typically delayed, and a scalar carries only a very limited amount of information.
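One standard way to make "maximize payoff" precise (a common formulation, not spelled out above) is the expected discounted sum of future rewards, with discount factor \(\gamma < 1\):

\[
R_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k}
\]

Discounting keeps the sum finite and makes rewards that arrive sooner count for more.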


Unsupervised learning

Discover a good internal representation of the input.

It can provide a compact, low-dimensional representation of the input.

It can also provide an economical high-dimensional representation of the input in terms of learned features.

Clustering is an extreme case of sparse coding: exactly one feature is non-zero per input (for example, an input assigned to cluster 3 of 5 is coded as (0, 0, 1, 0, 0)).



Different types of neural networks

Feed-forward neural networks

A network with more than one layer of hidden units is called a deep neural network.
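A minimal sketch of one forward pass through such a network, assuming a single sigmoid hidden layer (the layer sizes and names are my own choice):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # One hidden layer of 4 units between a 3-d input and a 2-d output.
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
    W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

    def forward(x):
        h = sigmoid(W1 @ x + b1)   # hidden activities: learned features of the input
        return W2 @ h + b2         # linear output layer

    print(forward(np.array([0.5, -1.0, 2.0])))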


Recurrent neural networks

These have directed cycles in their connection graph and are more biologically plausible.

Using an RNN to model a sequence:

It is equivalent to a very deep feed-forward network with one hidden layer per time slice, except that the same weights are reused at every time slice.

The hidden state gives the network the ability to remember information for a long time.
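A sketch of that unrolling, assuming a simple tanh recurrent unit (the names and sizes are my own): the same weight matrices are applied at every time step, and the hidden state h is the memory.

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hid = 3, 5
    W_xh = rng.normal(size=(n_hid, n_in))    # input  -> hidden, shared across time
    W_hh = rng.normal(size=(n_hid, n_hid))   # hidden -> hidden, shared across time
    b = np.zeros(n_hid)

    def run(sequence):
        h = np.zeros(n_hid)                  # hidden state carried between time slices
        for x in sequence:                   # one "layer" of the deep net per time slice
            h = np.tanh(W_xh @ x + W_hh @ h + b)
        return h

    print(run(rng.normal(size=(10, n_in)))) # final hidden state after 10 steps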




Perceptrons from a geometrical point of view

Weight space

Each weight corresponds to one dimension of the space.

Each point in the space corresponds to one particular setting of all the weights.

Ignoring the bias term, each training case can be viewed as a hyperplane through the origin: to get that case right, the weight vector must lie on the correct side of the hyperplane.



Taking all the training cases into account, the feasible weight vectors (if any exist) lie inside a convex cone.
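This geometric picture is what the perceptron learning procedure exploits: on every mistake, the weight vector is moved toward the correct side of the violated hyperplane. A sketch, using AND as a stand-in linearly separable task (my own choice of data):

    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    t = np.array([0, 0, 0, 1], dtype=float)      # AND is linearly separable
    X = np.hstack([X, np.ones((4, 1))])          # absorb the bias as a constant input

    w = np.zeros(3)
    for _ in range(100):                         # enough passes to converge here
        for x, target in zip(X, t):
            y = 1.0 if x @ w >= 0 else 0.0
            w += (target - y) * x                # no update when the case is right

    print(w, [1.0 if x @ w >= 0 else 0.0 for x in X])   # learned weights, predictions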




What binary threshold neurons cannot do


XNOR (outputting 1 exactly when the two binary inputs are the same)
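The standard impossibility argument: a binary threshold unit with weights \(w_1, w_2\) and threshold \(\theta\) would have to satisfy all four training cases:

\[
\begin{aligned}
(1,1) \to 1 &: \quad w_1 + w_2 \ge \theta \\
(0,0) \to 1 &: \quad 0 \ge \theta \\
(1,0) \to 0 &: \quad w_1 < \theta \\
(0,1) \to 0 &: \quad w_2 < \theta
\end{aligned}
\]

Adding the first two inequalities gives \(w_1 + w_2 \ge 2\theta\); adding the last two gives \(w_1 + w_2 < 2\theta\). The constraints are contradictory, so no such weights exist.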


Discriminating simple patterns under translation with wrap-around

Suppose a binary decision unit must separate pattern A from pattern B, where each pattern (4 on-pixels on a ring) may appear at any cyclically translated position, and the training set contains all translations of both patterns.

Whether the pattern is A or B, summing over the whole training set, the unit receives 4 times the total of all its weights: with wrap-around, each pattern activates every pixel in exactly 4 of its translations.

Since the two totals are identical, no choice of weights can put every case of A above threshold and every case of B below it; the perceptron cannot tell the patterns apart. (Patterns without wrap-around can be discriminated.)
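A small numerical check of this argument (my own illustration, not from the lecture): on a ring of 16 pixels, any pattern with 4 on-pixels delivers exactly 4 times the sum of the weights when totalled over all of its cyclic translations, whatever the weights are.

    import numpy as np

    rng = np.random.default_rng(0)
    n_pixels = 16
    w = rng.normal(size=n_pixels)                # arbitrary weight vector

    pattern_a = np.zeros(n_pixels); pattern_a[[0, 1, 2, 3]] = 1   # 4 adjacent on-pixels
    pattern_b = np.zeros(n_pixels); pattern_b[[0, 2, 5, 9]] = 1   # 4 scattered on-pixels

    def total_input(pattern):
        # Sum of w . x over all cyclic translations (wrap-around) of the pattern.
        return sum(np.roll(pattern, k) @ w for k in range(n_pixels))

    print(total_input(pattern_a), total_input(pattern_b), 4 * w.sum())  # all equal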


Using hidden neurons

Networks of linear neurons are still linear overall, so adding layers of them gives the network no extra learning power (see the sketch below).

Fixed output non-linearities are not enough on their own.

Learning the weights going into hidden units is equivalent to learning features.
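A two-line check of the first claim, with arbitrary layer sizes of my own choosing: composing two linear layers collapses into a single linear layer.

    import numpy as np

    rng = np.random.default_rng(1)
    W1 = rng.normal(size=(5, 3))       # "hidden" linear layer
    W2 = rng.normal(size=(2, 5))       # linear output layer
    x = rng.normal(size=3)

    deep = W2 @ (W1 @ x)               # two-layer linear network
    shallow = (W2 @ W1) @ x            # equivalent single linear layer
    print(np.allclose(deep, shallow))  # True: depth added nothing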

