Deep learning: Start with the code

Last Update:2018-07-26 Source: Internet

Author: User



Preface

Deep learning is currently attracting a great deal of attention. Major companies are investing in it, a wave of start-ups has sprung up, and countless public accounts popularize it or analyze it in depth. Every day a flood of technical articles, paper interpretations, and blog posts comes into view. With so much information, how to separate the true from the false and absorb the knowledge that is actually useful is a new problem we have to face.

More and more friends and classmates are looking to change careers into this area. They feel that the technology in their own line of development is becoming outdated and that they must embrace deep learning as the current technology trend. Embracing and pursuing change is good, but blindly chasing trends and technical hotspots without looking at them critically will eventually hold back your own development.

"Algorithm engineer" or "algorithm application engineer"

Many of the friends around me started out in server-side development, and most of the problems they solve day to day are engineering problems. The people I know who work at BAT (Baidu, Alibaba, Tencent) or abroad may hold an "algorithm engineer" title, but the vast majority of them directly use the computing platforms and algorithms that already exist in their company or in the industry. Only a small fraction make substantial changes to existing algorithms. So most of them are really algorithm application engineers, who spend most of their time processing data and tuning parameters.

Why do I stress this? Because in a real work environment, very few people publish papers or make groundbreaking changes; most of the work is combining existing techniques. For example: putting a logistic regression after an existing deep model, swapping it for an SVM, or changing an original 5-layer neural network to an LSTM. The only goal is to solve the problem and improve the results.

So I plan to write a series of articles that discuss deep learning from a programmer's point of view, mainly by walking through existing code from Git repositories. There are three goals. The first is to help server-side developers recognize how much "technique" has in common across fields: algorithm engineering can never be separated from engineering, and valuable engineering experience is a great advantage when learning and using algorithmic tools. The second is to communicate with more people and practice simplifying problems, in the hope of expressing them clearly through code. The third is to build up a written record of experience. I have been writing fewer and fewer blog posts, and I hope to spend more time writing from now on so that my writing becomes smoother and my logic clearer.

Engineering that cannot be bypassed

I remember being in school in 2014, just back from a company internship. Having been through Double 11 and the baptism of two big sales promotions, I had become very familiar with many middleware products and had gained some understanding of how to handle high concurrency and improve program availability. So when solving practical problems I would more or less think along those lines. At the time, our lab was working on a parallel deep learning platform, that is, building our own distributed deep learning system. We were still using Caffe; TensorFlow had just come out but did not yet support distributed training. Our approach was the traditional PS (parameter server) mode, but we ran into many engineering problems while building it. For example, the students who set it up initially hard-coded the parameter server settings and did not consider what happens when the parameter server goes down. During parameter transfer, the data volume was too large, and serialization was a problem as well. Take the following two issues as examples:

1. How to ensure high availability of the parameter server

If you did not know ZK (ZooKeeper), I really do not know how complicated a solution you would end up with for this problem.
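To make the idea concrete, here is a minimal sketch of the failover pattern that a coordination service like ZooKeeper enables. It is not the ZooKeeper API: `Registry` is a hypothetical in-memory stand-in for ephemeral sequential nodes, and the server addresses are made up. The point is only that workers resolve the active parameter server at run time instead of hard-coding it, so a standby can take over when the primary dies.

```python
# In-memory stand-in for ZooKeeper-style ephemeral sequential nodes.
# Hypothetical illustration only; a real system would talk to ZooKeeper.
import itertools


class Registry:
    def __init__(self):
        self._seq = itertools.count()
        self._nodes = {}  # sequence number -> server address

    def register(self, address):
        """Register a live server and return its sequence number."""
        seq = next(self._seq)
        self._nodes[seq] = address
        return seq

    def expire(self, seq):
        """Simulate the registration vanishing when a server crashes."""
        self._nodes.pop(seq, None)

    def leader(self):
        """The live server with the lowest sequence number is the active one."""
        if not self._nodes:
            raise RuntimeError("no parameter server is alive")
        return self._nodes[min(self._nodes)]


registry = Registry()
primary = registry.register("ps-0:9000")   # made-up addresses
standby = registry.register("ps-1:9000")

before = registry.leader()   # workers look the address up, nothing is hard-coded
registry.expire(primary)     # the primary goes down
after = registry.leader()    # the standby is now the active server
print(before, after)
```

In a real deployment the registration would disappear automatically when the server's session ends, and workers would be notified of the change rather than polling.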

2. Compression and transmission issues

The interaction between the workers and the parameter server involves synchronizing a large amount of parameter data: each worker pushes its batch updates to the parameter server, and the parameter server sends the synchronized result back to the workers. There are many engineering optimization points here. For example, if the raw values are passed in both directions, the data volume is huge, so which compression algorithm or which data structure to use (sending only the differences is one option) has to be tried out or decided dynamically. How to design cached data to speed up data access is likewise an engineering problem to solve.
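As a rough illustration of the "send only the differences" option, the sketch below packs just the changed entries as (index, difference) pairs and compresses them with zlib. The function names `encode_delta` and `apply_delta`, the wire layout, and the 32-bit float precision are all assumptions made for this example, not what our lab system actually used.

```python
import struct
import zlib


def encode_delta(old, new, threshold=0.0):
    """Pack only entries that changed by more than threshold, then compress."""
    changed = [(i, n - o) for i, (o, n) in enumerate(zip(old, new))
               if abs(n - o) > threshold]
    payload = struct.pack("<I", len(changed))          # number of entries
    for index, diff in changed:
        payload += struct.pack("<If", index, diff)     # uint32 index, float32 diff
    return zlib.compress(payload)


def apply_delta(params, blob):
    """Decompress a delta and add it onto a copy of the given parameters."""
    payload = zlib.decompress(blob)
    (count,) = struct.unpack_from("<I", payload, 0)
    updated = list(params)
    for k in range(count):
        index, diff = struct.unpack_from("<If", payload, 4 + 8 * k)
        updated[index] += diff
    return updated


last_synced = [0.0] * 1000            # what the server already holds
local = list(last_synced)
local[3], local[42] = 0.5, -0.25      # a batch update touched two weights

blob = encode_delta(last_synced, local)
raw = struct.pack("<1000f", *local)   # baseline: send every raw value
server = apply_delta(last_synced, blob)
print(len(raw), len(blob), server == local)
```

The delta only pays off when updates are sparse; for dense updates, compressing the raw values or lowering the precision may work better, which is why the choice may need to be made dynamically.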

Likewise, when dealing with natural language problems, many engineering techniques are essential, and they often solve far more of the problem than the algorithm itself does. Whether at school or at work, such engineering tools are the first choice for solving problems. At school, for example, in some knowledge-base question answering work, we used a lot of search technology (such as QP, query rewriting, and search suggestions) and used a rule engine to handle high-frequency questions. So engineering techniques are necessary, and they are exactly what server-side developers are good at; do not abandon your own strengths in order to fully embrace something else. In "technology", what matters more is the "technique": "skill" only differs in degree of proficiency, whereas "technique" has much in common across fields.

Content involved

In the coming days I will set aside some time to explain deep learning code and examples from Git repositories, which I hope will help developers who want to get started in this direction. I will also summarize some topics along the way, mainly LSTM, CNN, autoencoders, seq2seq, the main computer vision networks (AlexNet, ResNet, VGGNet, etc.), some commonly used TensorFlow functions, and common concepts (such as convolution, pooling, gates, dropout, fully connected layers, and activation functions).

Also, many of the formulas in deep learning papers are complex enough to daunt developers who are interested in the field, but code is simple and pure. Starting from the code is a good way to understand and master deep learning.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
