This tutorial uses Lasagne, a Theano-based library, to quickly build neural networks. It covers:
1. Building several neural network architectures
2. Data augmentation methods
3. The importance of momentum during learning
4. Pre-training
All of these techniques will help improve our results.
This tutorial assumes some familiarity with neural networks, so it does not go over how neural networks work. Here are some good references:
1. An online deep learning book
2. Alec Radford's video tutorial on deep learning with Python's Theano library
3. Andrej Karpathy's collection of exciting deep learning examples
Contents:
1. Environment requirements
2. Data introduction
3. First model: a traditional neural network with one hidden layer
4. Testing the network
5. Second model: convolutional neural networks
6. Data augmentation
7. Changing the learning rate and momentum over time
8. Dropout
9. Training specialist networks
10. Supervised pre-training
Part 1: Environment requirements
Note: You don't actually need to run the code here if you just want to read the tutorial, but if you have a CUDA-capable GPU on hand and want to follow along, here are a few guidelines:
1. We assume your CUDA-capable GPU is already set up, and that Python 2.7.x, numpy, pandas, matplotlib, and scikit-learn are already installed.
2. Install virtualenv and activate a virtual environment.
3. Install Lasagne directly from GitHub; run the following command to install Lasagne and its dependencies:
pip install -r https://raw.githubusercontent.com/dnouri/kfkd-tutorial/master/requirements.txt
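To confirm that the installation worked, a quick import check like the following can be run inside the virtual environment (this snippet is not part of the tutorial; your version numbers will differ):

# Sanity check: make sure the main dependencies import correctly.
import lasagne
import theano
import numpy
import pandas
import sklearn

print("theano", theano.__version__)
print("numpy", numpy.__version__)
print("pandas", pandas.__version__)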
Now that the environment is set up, you can run the MNIST recognition example from the src/lasagne/examples/ directory of the virtual environment:
cd src/lasagne/examples/
python mnist.py
It takes a while before anything is printed, because Theano is a compiler written in Python that compiles matrix operations into GPU code; Lasagne calls Theano to perform these compilation steps (which generate C code) before the model is trained. Once training begins, the following information appears:
Epoch 1 of 500
  training loss:        1.352731
  validation loss:      0.466565
  validation accuracy:  87.70 %
Epoch 2 of 500
  training loss:        0.591704
  validation loss:      0.326680
  validation accuracy:  90.64 %
Epoch 3 of 500
  training loss:        0.464022
  validation loss:      0.275699
  validation accuracy:  91.98 %
...
If you have a GPU and want Theano to use it, create a ~/.theanorc file in your home directory and put the following configuration in it:
[global]
floatX = float32
device = gpu0

[nvcc]
fastmath = True
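A quick way to confirm that Theano has picked up these settings (this check is not part of the tutorial itself):

import theano
print(theano.config.device)   # should print 'gpu0'
print(theano.config.floatX)   # should print 'float32'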
If any of the steps above fail, report the bug here.
Part 2: Data introduction
For the Facial Keypoint Detection contest, the training set consists of 96x96 grayscale images:
The 15 feature points are:
Left and right eye centers (2)
Left and right eye outer corners (2)
Left and right eye inner corners (2)
Left and right eyebrow outer ends (2)
Left and right eyebrow inner ends (2)
Left and right mouth corners (2)
Upper and lower lip centers (2)
Nose tip (1)
An interesting twist is that some feature points have around 7,000 labeled training samples, while others have only around 2,000. Let's read in the data:
# file kfkd.py
import os

import numpy as np
from pandas.io.parsers import read_csv
from sklearn.utils import shuffle


FTRAIN = '~/data/kaggle-facial-keypoint-detection/training.csv'
FTEST = '~/data/kaggle-facial-keypoint-detection/test.csv'


def load(test=False, cols=None):
    """Loads data from FTEST if *test* is True, otherwise from FTRAIN.
    Pass a list of *cols* if you're only interested in a subset of the
    target columns.
    """
    fname = FTEST if test else FTRAIN
    df = read_csv(os.path.expanduser(fname))  # load pandas dataframe

    # The Image column has pixel values separated by space; convert
    # the values to numpy arrays:
    df['Image'] = df['Image'].apply(lambda im: np.fromstring(im, sep=' '))

    if cols:  # get a subset of columns
        df = df[list(cols) + ['Image']]

    print(df.count())  # prints the number of values for each column
    df = df.dropna()  # drop all rows that have missing values in them

    X = np.vstack(df['Image'].values) / 255.  # scale pixel values to [0, 1]
    X = X.astype(np.float32)

    if not test:  # only FTRAIN has any target columns
        y = df[df.columns[:-1]].values
        y = (y - 48) / 48  # scale target coordinates to [-1, 1]
        X, y = shuffle(X, y, random_state=42)  # shuffle train data
        y = y.astype(np.float32)
    else:
        y = None

    return X, y


X, y = load()
print("X.shape == {}; X.min == {:.3f}; X.max == {:.3f}".format(
    X.shape, X.min(), X.max()))
print("y.shape == {}; y.min == {:.3f}; y.max == {:.3f}".format(
    y.shape, y.min(), y.max()))
Running this script gives the following output:
$ python kfkd.py
left_eye_center_x            7034
left_eye_center_y            7034
right_eye_center_x           7032
right_eye_center_y           7032
left_eye_inner_corner_x      2266
left_eye_inner_corner_y      2266
left_eye_outer_corner_x      2263
left_eye_outer_corner_y      2263
right_eye_inner_corner_x     2264
right_eye_inner_corner_y     2264
...
mouth_right_corner_x         2267
mouth_right_corner_y         2267
mouth_center_top_lip_x       2272
mouth_center_top_lip_y       2272
mouth_center_bottom_lip_x    7014
mouth_center_bottom_lip_y    7014
Image                        7044
dtype: int64
X.shape == (2140, 9216); X.min == 0.000; X.max == 1.000
y.shape == (2140, 30); y.min == -0.920; y.max == 0.996
This output tells us that many images are missing some of their feature points; the right mouth corner, for example, is labeled in only 2,267 samples. We drop every image that does not have the full set of 15 feature points, which is what this line does:
df = df.dropna()  # drop all rows that have missing values in them
We train our network on the remaining 2,140 images. With 9,216 features (pixels) per image and only 2,140 training samples, overfitting will be a problem we have to deal with.
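The cols argument mentioned in load()'s docstring lets you read in only a subset of the target columns, which comes in handy later when training the specialist networks. A small usage sketch, with column names taken from the df.count() output above:

# Hypothetical usage of load()'s cols argument: keep only the two
# eye-center keypoints (and the corresponding images).
X, y = load(cols=['left_eye_center_x', 'left_eye_center_y'])
print(X.shape)  # rows where both eye-center labels are present
print(y.shape)  # only two target columns instead of 30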
Another point worth noting is that inside the load function, the image pixel values are scaled from the range 0-255 down to [0, 1], and the target values (the feature point coordinates) are scaled from the range 0-95 down to [-1, 1].
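To sanity-check the data and the scaling, here is a minimal plotting sketch (plot_sample is a hypothetical helper, not part of the tutorial's code) that reshapes one flattened 9,216-pixel vector back into a 96x96 image and maps the keypoints back to pixel coordinates with y * 48 + 48:

# Minimal visualization sketch; assumes load() from kfkd.py above.
from matplotlib import pyplot

def plot_sample(x, y):
    img = x.reshape(96, 96)
    pyplot.imshow(img, cmap='gray')
    # Targets alternate x1, y1, x2, y2, ... and are scaled to [-1, 1];
    # undo the scaling to get pixel coordinates.
    pyplot.scatter(y[0::2] * 48 + 48, y[1::2] * 48 + 48, marker='x', c='r')
    pyplot.show()

X, y = load()
plot_sample(X[0], y[0])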
To be continued.
Using convolutional neural networks (CNNs) to detect facial keypoints: tutorial, part 1