Deep Learning: Introduction to Keras (1): Basics

Source: Internet
Author: User
Tags: shuffle, theano, keras
Http://www.cnblogs.com/lc1217/p/7132364.html

1. About Keras

1) Introduction

Keras is a deep learning framework written in pure Python that runs on top of Theano or TensorFlow.

Keras is a high-level neural network API that supports fast experimentation: it lets you turn an idea into a result quickly. Consider Keras if you have the following requirements:

a) Simple and rapid prototyping (Keras is highly modular, minimalist, and extensible)

b) Support for CNNs and RNNs, or a combination of the two

c) Seamless switching between CPU and GPU

2) Design principles

a) User friendliness: Keras is an API designed for human beings, not machines. User experience is always the primary and central concern. Keras follows best practices for reducing cognitive load: it provides a consistent and concise API that greatly reduces the user's workload in common use cases, and it gives clear and actionable error feedback.

b) Modularity: a model is understood as a sequence of layers or a graph of operations on data, and fully configurable modules can be freely combined at minimal cost. Specifically, network layers, loss functions, optimizers, initialization schemes, activation functions, and regularization methods are all independent modules that you can combine to build your own models.

c) Extensibility: adding new modules is easy; just write new classes or functions following the same pattern as existing ones. The ease of creating new modules makes Keras well suited to advanced research work.

d) Work with Python: Keras has no separate model configuration file format (in contrast to Caffe). Models are described in Python code, which is more compact, easier to debug, and easier to extend.

2. Keras Module Structure

3. Use Keras to build a neural network

4. Key Concepts

1) Symbolic computation

The underlying library of Keras is Theano or TensorFlow, also known as the Keras backend. Whether it is Theano or TensorFlow, both are "symbolic" libraries. Symbolic computation first defines various variables and then builds a "computation graph" that specifies the computational relationships among those variables.

Symbolic computation is also called a data flow graph. The process is shown below (a GIF does not display well here, so a static diagram is used; the data flows along the black lines with arrowheads in the picture):
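To make the idea concrete, here is a minimal sketch of symbolic computation using the Keras backend API. This snippet is not part of the original article; it assumes Keras running on a graph-mode TensorFlow or Theano backend, and the names x, y, z, and f are made up for the example.

from keras import backend as K

# Define symbolic variables first; no concrete numbers are involved yet.
x = K.placeholder(shape=(2,))
y = K.placeholder(shape=(2,))

# Build the computation graph: z is defined only by its relationship to x and y.
z = x * y + 2

# Computation happens only when a concrete function is compiled and fed real data.
f = K.function([x, y], [z])
print(f([[1.0, 2.0], [3.0, 4.0]]))  # -> [array([ 5., 10.], dtype=float32)]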

2) Tensor

A tensor can be regarded as a natural generalization of vectors and matrices and is used to represent a wide range of data. The order of a tensor is also called its dimension.

A 0th-order tensor, i.e. a scalar, is a single number.

A 1st-order tensor, i.e. a vector, is an ordered set of numbers.

A 2nd-order tensor, i.e. a matrix, is a set of vectors arranged in order.

A 3rd-order tensor, i.e. a cube of numbers, is a set of matrices stacked on top of one another.

A 4th-order tensor, and so on by analogy.

Key point: understanding dimensions

Consider a list of 10 numbers. Read horizontally, it contains 10 numbers and can be called 10-dimensional; read vertically, each position holds only 1 number, so it can be called 1-dimensional. Keeping this distinction in mind helps when dimension questions arise in Keras or neural network computations.
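As an illustration (not from the original article), the tensor orders above can be checked with numpy, which Keras uses for its data arrays; the variable names are arbitrary:

import numpy

scalar = numpy.array(5.0)        # 0th-order tensor: a single number, shape ()
vector = numpy.array([1, 2, 3])  # 1st-order tensor: a vector, shape (3,)
matrix = numpy.ones((2, 3))      # 2nd-order tensor: a matrix, shape (2, 3)
cube = numpy.zeros((4, 2, 3))    # 3rd-order tensor: a stack of matrices, shape (4, 2, 3)

# .ndim gives the order (dimension) of each tensor
print(scalar.ndim, vector.ndim, matrix.ndim, cube.ndim)  # 0 1 2 3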

3) Data format (data_format)

A tensor can be represented in two main ways:
a) th mode, or channels_first mode; Theano and Caffe use this layout.
b) tf mode, or channels_last mode; TensorFlow uses this layout.


The following example illustrates the difference between the two modes.
For 100 RGB color images (3 channels) of size 16x32 (height 16, width 32):
th representation: (100, 3, 16, 32)
tf representation: (100, 16, 32, 3)
The only difference is the position of the channel count, 3.
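For reference, here is a small sketch (not from the original article) showing how to check which convention the current Keras backend configuration uses, and what the two layouts look like as array shapes; the variable names are illustrative:

from keras import backend as K
import numpy

print(K.image_data_format())  # usually 'channels_last' when TensorFlow is the backend

# 100 RGB images with height 16 and width 32, in the two layouts
batch_th = numpy.zeros((100, 3, 16, 32))  # th / channels_first: (samples, channels, height, width)
batch_tf = numpy.zeros((100, 16, 32, 3))  # tf / channels_last:  (samples, height, width, channels)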

4) Model

There are two types of Keras models: the sequential model (Sequential) and the functional model (Model). The functional model is the more general one, and the sequential model is a special case of it.
a) Sequential model (Sequential): single input and single output, one path from start to finish; each layer is connected only to its neighbors, with no cross-layer connections. This kind of model compiles quickly and is simple to work with.
b) Functional model (Model): multiple inputs and multiple outputs, with arbitrary connections between layers. This kind of model compiles more slowly.
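As a small sketch of the two styles (added for illustration; the layer sizes and variable names here are arbitrary, not from the original article), the same two-layer network can be written either way:

from keras.models import Sequential, Model
from keras.layers import Dense, Input

# Sequential model: layers are simply stacked one after another.
seq_model = Sequential()
seq_model.add(Dense(32, activation='relu', input_shape=(784,)))
seq_model.add(Dense(10, activation='softmax'))

# Functional model: the same network written as a graph of layer calls;
# this style also allows multiple inputs/outputs and cross-layer connections.
inputs = Input(shape=(784,))
hidden = Dense(32, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(hidden)
func_model = Model(inputs=inputs, outputs=outputs)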

5. The first example

This uses a common introductory example for neural networks: recognizing handwritten digits.

Before writing the code, a few concepts are introduced in the context of this example to make it easier to follow.

PS: Possibly because of version differences, the parameters shown on the official website do not match those in this example; of the parameters the official website lists, some are supported and some are not. This example therefore drops the unsupported arguments and describes only the parameters that are actually used.

1) Dense(500, input_shape=(784,))

a) The Dense layer is one of the common (core) layers among the network layers.

b) 500 specifies the output dimension; the complete output shape is (*, 500), i.e. any number of 500-dimensional outputs. Only the dimension needs to be written as a parameter; how many outputs there actually are is determined by the input. In other words, the output of Dense is in fact an N x 500 matrix; the short sketch after point 3) below illustrates the resulting shape.

c) input_shape=(784,) specifies that the input dimension is 784 (28x28, explained later); the complete input shape is (*, 784), i.e. N inputs of 784 dimensions each.

2) Activation('tanh')

a) Activation: the activation layer

b) 'tanh': the activation function

3) Dropout(0.5)

During training, each time the parameters are updated, the given fraction (rate) of input neurons is randomly disconnected, which helps prevent overfitting.
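Putting points 1) to 3) together, here is a minimal sketch (illustrative only, mirroring the first few layers of the sample code below) of how these three layers stack and what the resulting output shape is:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation

model = Sequential()
model.add(Dense(500, input_shape=(784,)))  # output shape (*, 500); * is the number of samples
model.add(Activation('tanh'))              # tanh is applied element-wise, shape unchanged
model.add(Dropout(0.5))                    # during training, randomly drop 50% of the inputs

print(model.output_shape)  # (None, 500)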

4) Data Set

The dataset consists of 60,000 training images of size 28x28 and 10,000 test images of size 28x28, together with their corresponding target digits. Written out fully in the data format described above, with TensorFlow as the backend the shape would be (60000, 28, 28, 1), since the MNIST images are grayscale and have a single channel. Because mnist.load_data() is used in this example to fetch the dataset, the data arrives with shape (60000, 28, 28); it might then seem that input_shape=(784,) should instead be input_shape=(28, 28), but writing it that way is wrong in this example. The data needs to be converted to (60000, 784). Why is this conversion needed?

As shown above, the training set with shape (60000, 28, 28) used as input is effectively a cube of data, while the input layer, from the current point of view, is a plane; the cube cannot flow into the planar input layer for computation as it is. So the transformation shown by the yellow arrows has to be performed before the data enters the input layer for the subsequent computation. As for how the input layer handles the data after the transformation from 28*28 to 784, we do not need to care about that here (interested readers can look at the source code).

In addition, Keras input takes the form (nb_samples, input_dim): that is, (number of samples, input dimension).
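A minimal sketch of the conversion described above (added for illustration; the variable names match those used in the sample code below):

from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()  # downloads the data the first time
print(X_train.shape)  # (60000, 28, 28)

# Flatten each 28x28 image into a single 784-dimensional vector
X_train = X_train.reshape(X_train.shape[0], 28 * 28)
print(X_train.shape)  # (60000, 784)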

5) Sample Code

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
from keras.datasets import mnist
import numpy

'''
Step 1: select the model
'''
model = Sequential()

'''
Step 2: build the network layers
'''
model.add(Dense(500, input_shape=(784,)))  # input layer, 28*28=784
model.add(Activation('tanh'))              # the activation function is tanh
model.add(Dropout(0.5))                    # use 50% dropout

model.add(Dense(500))                      # hidden layer with 500 nodes
model.add(Activation('tanh'))
model.add(Dropout(0.5))

model.add(Dense(10))                       # the output has 10 classes, so the dimension is 10
model.add(Activation('softmax'))           # the last layer uses softmax as the activation function

'''
Step 3: compile
'''
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)    # optimizer; set the learning rate (lr) and other parameters
model.compile(loss='categorical_crossentropy', optimizer=sgd)  # use cross-entropy as the loss function

'''
Step 4: train
Some parameters of .fit:
batch_size: the total samples are split into groups; the number of samples per group
epochs: number of training passes
shuffle: whether to shuffle the data randomly before training
validation_split: the fraction of data held out for cross-validation
verbose: display mode. 0: no output; 1: output progress; 2: output the result of each epoch
'''
(X_train, y_train), (X_test, y_test) = mnist.load_data()  # read the data with Keras's mnist tool (requires a network connection the first time)
# Because the mnist input data has shape (num, 28, 28), the trailing dimensions are flattened into 784 dimensions
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1] * X_train.shape[2])
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1] * X_test.shape[2])
Y_train = (numpy.arange(10) == y_train[:, None]).astype(int)  # one-hot encode the labels
Y_test = (numpy.arange(10) == y_test[:, None]).astype(int)

model.fit(X_train, Y_train, batch_size=200, epochs=50, shuffle=True, verbose=0, validation_split=0.3)
model.evaluate(X_test, Y_test, batch_size=200, verbose=0)

'''
Step 5: output
'''
print("test set")
scores = model.evaluate(X_test, Y_test, batch_size=200, verbose=0)
print("")
print("The test loss is %f" % scores)
result = model.predict(X_test, batch_size=200, verbose=0)

result_max = numpy.argmax(result, axis=1)
test_max = numpy.argmax(Y_test, axis=1)

result_bool = numpy.equal(result_max, test_max)
true_num = numpy.sum(result_bool)
print("")
print("The accuracy of the model is %f" % (true_num / len(result_bool)))

 
