Some basic concepts in Keras

Tags: theano, keras, keras model
Symbolic Computation

Keras's underlying library is either Theano or TensorFlow, and these are also called Keras's backends. Whether Theano or TensorFlow, both are symbolic libraries.
As for what "symbolic" means, it can be summarized like this: symbolic computation first defines various variables, then builds a "computation graph" that specifies the computational relationships between those variables. Once built, the computation graph has to be compiled to determine its internal details, but at this point the graph is still an "empty shell" with no actual data inside; only when you feed in the inputs required by the computation does data flow through the whole model and produce output values.
A Keras model works in exactly this way: after you finish building a Keras model, it is an empty shell, and only when you create a callable function (K.function) and feed in data does a real data flow form.
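As a minimal sketch (assuming the classic keras package with a Theano or TensorFlow backend; the layer sizes are made up for illustration), building the model only defines the graph, and K.function gives you a callable that actually pushes data through it:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras import backend as K

# Building the model only defines the computation graph; no data flows yet
model = Sequential()
model.add(Dense(8, activation='relu', input_dim=4))
model.add(Dense(1, activation='sigmoid'))

# Compile a callable mapping the model's input tensor to its output tensor
f = K.function([model.input], [model.output])

# Only now, when actual data is fed in, does the graph really compute
x = np.random.rand(2, 4).astype('float32')
print(f([x])[0].shape)  # (2, 1)
```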
Languages that work with computation graphs, such as Theano, are known to be hard to debug, and when debugging Keras drops down to the Theano level it is often a headache; inexperienced developers find it hard to get an intuitive feel for what the computation graph is really doing. Despite the headaches, most deep learning frameworks still use symbolic computation, because it provides key benefits such as computational optimization and automatic differentiation.

tensor

This term was already mentioned and explained in the previous concept. The word itself spans many disciplines, but its use here is relatively simple.
"Tensor" appears throughout this document, so it is worth a brief explanation.
The point of the word is uniformity: a tensor can be seen as a natural generalization of vectors and matrices, and we use tensors to represent a wide range of data types.
The smallest tensor is a 0-order tensor, i.e. a scalar, which is just a number.
When we arrange some numbers in order, we get a 1-order tensor, i.e. a vector.
If we go on and arrange a set of vectors in order, we get a 2-order tensor, i.e. a matrix.
Stack matrices on top of each other and we get a 3-order tensor, which we can call a cube; a color picture with 3 color channels is exactly such a cube.
Stack the cubes up again, and this time we really have no alias for it: it is a 4-order tensor. Don't try to imagine what a 4-order tensor looks like; it is just a mathematical concept.
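A quick numpy illustration (the shapes are chosen only as examples) of the progression from 0-order to 4-order tensors:

```python
import numpy as np

scalar = np.array(3.0)                 # 0-order tensor: a single number
vector = np.array([1.0, 2.0, 3.0])     # 1-order tensor: numbers arranged in order
matrix = np.ones((2, 3))               # 2-order tensor: vectors arranged in order
cube   = np.ones((3, 16, 32))          # 3-order tensor: e.g. a 3-channel color image
batch  = np.ones((100, 3, 16, 32))     # 4-order tensor: a stack of such cubes

for t in (scalar, vector, matrix, cube, batch):
    print(t.ndim, t.shape)
```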

'th' and 'tf'

The 'th' mode, also known as the Theano mode, represents 100 RGB three-channel color images of size 16x32 (16 high by 32 wide) as a tensor of shape (100, 3, 16, 32); Caffe uses this layout too. Dimension 0 is the sample dimension and gives the number of samples, dimension 1 is the channel dimension and gives the number of color channels, and the last two dimensions are height and width.
TensorFlow's way of expressing it, the 'tf' mode, is (100, 16, 32, 3), i.e. the channel dimension is put last. The two layouts are essentially no different from each other.
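As a small illustration (plain numpy, not a Keras API), the same batch of images can be converted from one layout to the other with a transpose:

```python
import numpy as np

# 100 RGB images, 16 high by 32 wide, in 'th' (channels-first) layout
images_th = np.zeros((100, 3, 16, 32))

# Move the channel axis to the end to get the 'tf' (channels-last) layout
images_tf = np.transpose(images_th, (0, 2, 3, 1))
print(images_tf.shape)  # (100, 16, 32, 3)
```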

Generic Model

In the original version of Keras there were actually two kinds of models. One was called Sequential, the sequential model: single input, single output, one path from start to finish, with only adjacent layers connected and no cross-layer connections. This kind of model compiles quickly and is also relatively simple to use. The other was called Graph, the graph model: it supports multiple inputs and multiple outputs, and layers can be connected to each other in arbitrary ways, but it compiles slowly. As you can see, Sequential is actually a special case of Graph.
In the current version of Keras, the graph model has been removed and a functional model API has been added, which emphasizes that Sequential is a special case. The general model is simply called Model, and if you want to use the simple sequential style, OK, there is still the Sequential shortcut.
Since the functional model API expresses the idea of a "generic model", we translate it as the generic model: as long as something takes one or more tensors as input and then outputs one or more tensors, whatever it actually is, it is called a "model".
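A minimal sketch of the functional API (using Keras 2-style argument names; the inputs and layer sizes are invented for illustration, and older versions spell some arguments slightly differently):

```python
from keras.layers import Input, Dense, concatenate
from keras.models import Model

# Two inputs and one output: something Sequential cannot express
a = Input(shape=(16,))
b = Input(shape=(8,))
h = concatenate([Dense(32, activation='relu')(a),
                 Dense(32, activation='relu')(b)])
out = Dense(1, activation='sigmoid')(h)

# Anything that maps input tensors to output tensors is a Model
model = Model(inputs=[a, b], outputs=out)
model.compile(optimizer='sgd', loss='binary_crossentropy')
```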

Batch

To be honest, I had seen this parameter before but never really understood it, so today I am sorting it out.
The word concerns how the training process is optimized, which means talking about deep learning optimization algorithms; put plainly, gradient descent. There are two basic ways to update the parameters each time.
The first: traverse the whole dataset to compute the loss function once, then compute the gradient of the loss with respect to each parameter and update the parameters. With this method every parameter update has to look at all samples in the dataset, so it is computationally expensive and slow, and it does not support online learning. This is called batch gradient descent.
The other: compute the loss for one data point at a time and immediately update the parameters with its gradient. This is called stochastic gradient descent. It is faster, but the convergence behavior is not as good: it may wander around the optimum without ever hitting it, and two successive parameter updates may even cancel each other out, making the objective function oscillate more violently.
To overcome the shortcomings of both methods, what is generally used now is a compromise, mini-batch gradient descent: the data is split into several batches, and the parameters are updated batch by batch. On the one hand, a group of data in a batch jointly determines the direction of the gradient, so the update is less likely to go astray and randomness is reduced; on the other hand, because the number of samples in a batch is much smaller than the whole dataset, the amount of computation per update is not very large.
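A toy sketch of mini-batch gradient descent for a one-parameter linear fit (the learning rate, batch size, and data are all made up for illustration):

```python
import numpy as np

# Toy data: y is roughly 2 * x
X = np.random.rand(1000)
y = 2 * X + 0.01 * np.random.randn(1000)

w, lr, batch_size = 0.0, 0.1, 32
for epoch in range(10):
    idx = np.random.permutation(len(X))           # shuffle once per epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]     # one mini-batch of samples
        err = w * X[batch] - y[batch]
        grad = np.mean(err * X[batch])            # gradient of the squared error w.r.t. w
        w -= lr * grad                            # one parameter update per batch
print(w)  # approaches 2
```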
Basically all gradient descent used now is based on mini-batches, so the batch_size that keeps appearing in Keras modules refers to exactly this.
By the way, the optimizer SGD used in Keras is an abbreviation for stochastic gradient descent, but that does not mean one update per single sample; it is also based on mini-batches.
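A short, hedged example (the data and layer sizes are invented; argument names such as epochs follow Keras 2 and may differ in older versions) showing where batch_size comes in:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=(1000, 1))

model = Sequential()
model.add(Dense(16, activation='relu', input_dim=20))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer=SGD(lr=0.01), loss='binary_crossentropy')

# Despite the name SGD, the weights are updated once per mini-batch of 32 samples
model.fit(X, y, batch_size=32, epochs=5)
```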
