No, machine learning is not just glorified statistics

Source: Internet
Author: User
Tags: keras

This meme has been all over social media lately, producing appreciative chuckles across the internet as the hype around deep learning begins to subside. The sentiment that deep learning is really nothing to get excited about, or that it's just a redressing of age-old statistical techniques, is growing increasingly ubiquitous; the trouble is that it isn't true.

I get it: it's not fashionable to be part of the overly enthusiastic, hype-drunk crowd of deep learning evangelists. ML experts who once preached deep learning from the rooftops now use the term only with a hint of chagrin, preferring instead to downplay the power of modern neural networks lest they be associated with the scores of people who still seem to think that import keras is the leap for every hurdle, and that, in knowing it, they have some tremendous advantage over their competition.

Machine learning = representation + evaluation + optimization

Machine learning is a class of computational algorithms that iteratively "learn" an approximation to some function. Pedro Domingos, a professor of computer science at the University of Washington, laid out the three components that make up a machine learning algorithm: representation, evaluation, and optimization.

Representation involves the transformation of inputs from one space to another, more useful space which can be more easily interpreted. Think of this in the context of a convolutional neural network. Raw pixels are not useful for distinguishing a dog from a cat, so we transform them to a more useful representation (e.g., logits from a softmax output) which can be interpreted and evaluated.
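
As a concrete (if toy) sketch of a representation function, here is a minimal Keras convnet that maps raw pixel arrays to two-class probabilities. The input size, layer widths, and the dog-vs-cat framing are illustrative assumptions, not anything the article prescribes:

    # A toy representation function: raw pixels in, class probabilities out.
    # Input size and layer widths are illustrative assumptions.
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=(64, 64, 3)),           # raw pixel space
        layers.Conv2D(32, 3, activation="relu"),   # learned feature maps
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(2, activation="softmax"),     # dog-vs-cat probabilities
    ])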

Evaluation is essentially the loss function. How effectively did your algorithm transform your data to a more useful space? How closely did your softmax output resemble your one-hot encoded labels (classification)? Did you correctly predict the next word in the unrolled text sequence (text RNN)? How far did your latent distribution diverge from a unit Gaussian (VAE)? These questions tell you how well your representation function is working; more importantly, they define what it will learn to do.
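
To make "evaluation is essentially the loss function" concrete, here is a small sketch that scores made-up softmax outputs against one-hot labels with categorical cross-entropy; the numbers are invented purely for illustration:

    # Evaluation as a loss function; the arrays are made-up examples.
    import numpy as np
    from tensorflow.keras.losses import CategoricalCrossentropy

    y_true = np.array([[0.0, 1.0], [1.0, 0.0]])   # one-hot labels
    y_pred = np.array([[0.1, 0.9], [0.7, 0.3]])   # softmax outputs
    loss = CategoricalCrossentropy()(y_true, y_pred)
    print(float(loss))  # lower loss = the representation is working better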

Optimization is the last piece of the puzzle. Once you have the evaluation component, you can optimize the representation function in order to improve your evaluation metric. In neural networks, this usually means using some variant of stochastic gradient descent to update the weights and biases of your network according to some defined loss function. And voila! You have the world's best image classifier (at least, if you're Geoffrey Hinton in 2012).
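
A minimal sketch of that loop in Keras, reusing the toy model from above: compile wires the loss (evaluation) to an SGD optimizer, and fit runs the gradient-descent weight updates. The learning rate and the random batch are assumptions for illustration only:

    # Optimization: SGD updates weights and biases to reduce the loss.
    # Learning rate and the random data are illustrative assumptions.
    import numpy as np
    from tensorflow.keras.optimizers import SGD

    model.compile(optimizer=SGD(learning_rate=0.01),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

    x = np.random.rand(32, 64, 64, 3).astype("float32")  # fake pixel batch
    y = np.eye(2)[np.random.randint(0, 2, size=32)]      # fake one-hot labels
    model.fit(x, y, epochs=1)  # forward pass, loss, gradient step, repeat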

Regression over 100 million variables: no problem?

Let me also point out the difference between deep nets and traditional statistical models by their scale. Deep neural networks are huge. The VGG-16 convnet architecture, for example, has approximately 138 million parameters. How do you think your average academic advisor would respond to a student wanting to perform a multiple regression over 100 million variables? The idea is ludicrous. That's because training VGG-16 is not multiple regression; it's machine learning.
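
The 138-million figure is easy to check yourself, since Keras ships VGG-16 in keras.applications; this sketch builds the architecture without downloading the pretrained weights and counts its parameters:

    # Verify VGG-16's parameter count using Keras's built-in architecture.
    from tensorflow.keras.applications import VGG16

    vgg = VGG16(weights=None)   # architecture only, no pretrained weights
    print(vgg.count_params())   # ~138,357,544 parameters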
