Radial basis networks (RBF network)


Source: http://blog.csdn.net/zouxy09/article/details/13297881

1. Radial basis function

The radial basis function (Radial Basis Function, RBF) method was proposed by Powell in 1985. A radial basis function is a scalar function that is radially symmetric: it is usually defined as a monotone function of the Euclidean distance from any point x in the space to a centre c, and can be written K(‖x − c‖). Its effect is typically local, i.e. the function value is very small when x is far from c. For example, the Gaussian radial basis function:

K(‖x − c‖) = exp(−‖x − c‖² / (2δ²))
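As a small sketch (in Python, my own names; the article's own code is MATLAB), the Gaussian radial basis function and its locality:

```python
import numpy as np

def gaussian_rbf(x, c, delta):
    # K(||x - c||) = exp(-||x - c||^2 / (2 * delta^2))
    return np.exp(-np.sum((np.asarray(x) - np.asarray(c)) ** 2) / (2.0 * delta ** 2))

center = np.zeros(2)
at_center = gaussian_rbf(center, center, 1.0)       # maximal response at the centre
far_away = gaussian_rbf([5.0, 5.0], center, 1.0)    # response decays towards 0 far from the centre
print(at_center, far_away)
```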

Radial basis functions were originally introduced to solve multivariate interpolation problems. Consider the diagram below: a basis function is placed on every sample. Each blue point in the figure is a sample, and each green dashed curve in the middle of the graph is a Gaussian function centred on one training sample. Suppose the curve that actually fits these training data is the blue one (right-most figure). If we have a new input x1 and want to know the corresponding value F(x1), i.e. the ordinate of point A, the figure shows that the ordinate of A equals the ordinate of point B plus the ordinate of point C. The ordinate of B is the value of the first sample's Gaussian at x1 multiplied by a larger weight, and the ordinate of C is the value of the second sample's Gaussian at x1 multiplied by a smaller weight. The weights of all the other sample points are 0, because x1 lies between the first and second samples and far from the others: points near the interpolation point contribute the most, and distant points contribute nothing. So the value at x1 is determined by the nearby points B and C. Extending this to any new x, these Gaussian functions, each multiplied by a weight and summed at x, can fit the true function curve.
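The interpolation idea above can be sketched numerically (a Python illustration of my own, not the author's code): place one Gaussian on each sample, solve a linear system for the weights, and the weighted sum then passes through every training point.

```python
import numpy as np

# one Gaussian basis centred on each training sample
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.sin(xs)                      # values to interpolate
delta = 1.0

# interpolation matrix: G[i, j] = K(|x_i - x_j|)
G = np.exp(-(xs[:, None] - xs[None, :]) ** 2 / (2 * delta ** 2))
w = np.linalg.solve(G, ys)           # weights so the weighted sum passes through every sample

def f(x1):
    # weighted sum of the Gaussians at a new point: nearby bases dominate
    return np.sum(w * np.exp(-(x1 - xs) ** 2 / (2 * delta ** 2)))

print(f(1.0))                        # reproduces the training value sin(1.0)
```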

2. Radial basis network

In 1988, Moody and Darken proposed a new neural network structure, the RBF neural network. It is a feed-forward network that can approximate any continuous function with arbitrary precision, and it is especially suitable for solving classification problems.

The structure of an RBF network is similar to a multilayer feed-forward network: it is a three-layer feed-forward network. The input layer is composed of signal-source nodes; the second layer is the hidden layer, whose number of units is determined by the needs of the problem being described, and whose transfer function is the radial basis function, a non-negative nonlinear function that is radially symmetric about a centre point and decays away from it. The transformation from the input space to the hidden layer is nonlinear, while the transformation from the hidden layer to the output layer is linear.

The basic idea of an RBF network is to use radial basis functions as the "bases" of the hidden units to form the hidden-layer space, so that input vectors are mapped directly into that space (without weighted connections). According to Cover's theorem, data that are not linearly separable in a low-dimensional space are more likely to become separable in a high-dimensional space. In other words, the function of the hidden layer of an RBF network is to map the low-dimensional input into a high-dimensional space through nonlinear functions; the curve is then fitted in that high-dimensional space. It is equivalent to finding, in an implicit high-dimensional space, the surface that best fits the training data. This is different from an ordinary multilayer perceptron (MLP).
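Cover's theorem can be illustrated with the classic XOR problem (a hypothetical Python sketch of mine, not from the article): the four points are not linearly separable in the input plane, but after mapping them through two Gaussian bases they become separable.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])           # XOR labels: not linearly separable in the input plane

# map each point through two Gaussian bases centred at (0,0) and (1,1)
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
phi = np.exp(-np.array([[np.sum((p - c) ** 2) for c in centers] for p in X]))

# in the hidden space a single threshold on the summed responses separates the classes
score = phi.sum(axis=1)
print(score)                         # class-0 scores exceed class-1 scores
```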

Once the centre points of the RBFs are determined, this mapping is determined. The mapping from the hidden-layer space to the output space is linear, i.e. the network output is a linear weighting of the hidden-unit outputs, and the weights here are the network's tunable parameters. Thus, viewed as a whole, the input-to-output mapping of the network is nonlinear, while the network output is linear in the tunable parameters. The weights can therefore be solved directly from a system of linear equations, which greatly accelerates learning and avoids the local-minimum problem.

It can be understood from another angle: in a multilayer perceptron (including BP neural networks), the basis function of a hidden node is linear in the input, and the activation function is a sigmoid or hard-limit function. In an RBF network, the basis function of a hidden node is a distance function (such as the Euclidean distance), and a radial basis function (such as the Gaussian) is used as the activation function. A radial basis function is radially symmetric about a centre point in n-dimensional space, and the farther a neuron's input is from that centre, the less the neuron is activated. This characteristic of the hidden nodes is often called the "local characteristic".

3. Design and solution of the RBF network

The design of an RBF network involves two aspects. One is structure design, i.e. how many nodes the hidden layer should contain. The other is parameter design, i.e. solving for the network parameters. From the input-to-output mapping formula of the network, the parameters are of three kinds: the centres of the radial basis functions, their variances (spreads), and the weights from the hidden layer to the output layer. Many methods have been proposed to solve for these three kinds of parameters; they fall mainly into the following two categories:

1. Method One:

Obtain the centres and variances of the radial basis functions with an unsupervised method, and obtain the weights from the hidden layer to the output layer with a supervised method (least mean squares). Specifically:

(1) Randomly select h samples from the training set as the centres of the h radial basis functions. A better approach is to use clustering, such as k-means, to obtain h cluster centres and use them as the centres of the h radial basis functions.
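A minimal sketch of the clustering step (my own Python, plain k-means rather than any particular toolbox; samples are stored one per column, as in the article's MATLAB code):

```python
import numpy as np

def kmeans_centers(X, h, iters=50, seed=0):
    # pick h RBF centres by plain k-means; X holds one sample per column
    rng = np.random.default_rng(seed)
    centers = X[:, rng.choice(X.shape[1], h, replace=False)].copy()
    for _ in range(iters):
        # assign every sample to its nearest centre
        d = np.linalg.norm(X[:, None, :] - centers[:, :, None], axis=0)
        labels = np.argmin(d, axis=0)
        # move each centre to the mean of its assigned samples
        for k in range(h):
            if np.any(labels == k):
                centers[:, k] = X[:, labels == k].mean(axis=1)
    return centers

# two well-separated 1-D clusters: the two centres land near 0.1 and 5.1
X = np.array([[0.0, 0.1, 0.2, 5.0, 5.1, 5.2]])
print(np.sort(kmeans_centers(X, 2).ravel()))
```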

(2) When the basis function of the RBF neural network is a Gaussian, the variance (spread) can be solved by the following formula:

δ = c_max / √(2h)

where c_max is the maximum distance between the selected centres and h is the number of hidden-layer nodes. The spread is computed this way so that the radial basis functions are neither too sharp nor too flat.
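Under the usual reading of this heuristic, every spread is set to δ = c_max / √(2h); a quick Python sketch (my own function name):

```python
import numpy as np

def rbf_spread(centers):
    # delta = c_max / sqrt(2 * h): c_max is the largest distance between centres,
    # h the number of hidden nodes (one per centre, stored one per column)
    h = centers.shape[1]
    dists = np.linalg.norm(centers[:, :, None] - centers[:, None, :], axis=0)
    return dists.max() / np.sqrt(2.0 * h)

centers = np.array([[0.0, 1.0, 3.0]])   # three 1-D centres, c_max = 3
print(rbf_spread(centers))              # 3 / sqrt(6)
```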

(3) The connection weights between the hidden layer and the output layer can be computed directly by least mean squares (LMS), using the pseudo-inverse: with G the matrix of hidden-layer outputs and d the output values we expect, w = G⁺ d.
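The least-squares weight solve can be sketched as follows (a Python sketch with my own variable names): build the matrix G of hidden-layer responses and multiply the desired outputs by its pseudo-inverse.

```python
import numpy as np

# toy problem: one basis per sample, fit y = 2x on 1-D inputs
X = np.linspace(0.0, 1.0, 8).reshape(1, 8)   # 8 training samples (columns)
d = 2 * X                                    # desired outputs
centers = X                                  # centre one Gaussian on each sample
delta = 0.1

# G[i, j] = Gaussian response of hidden unit j to sample i
G = np.exp(-((X.T - centers) ** 2) / (2 * delta ** 2))
w = np.linalg.pinv(G) @ d.T                  # w = pinv(G) * d

residual = np.max(np.abs(G @ w - d.T))
print(residual)                              # with one centre per sample the fit is numerically exact
```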

2. Method Two:

Use a supervised learning algorithm to train all parameters of the network (the centres of the radial basis functions, the variances, and the hidden-to-output weights). The main idea is gradient descent on a cost function (the mean squared error), modifying each parameter in turn. Specifically:

(1) Randomly initialize the centres and variances of the radial basis functions and the weights from the hidden layer to the output layer. Alternatively, the centres can be initialized as in step (1) of method one.

(2) Optimize all three kinds of network parameters by supervised training with gradient descent. The cost function is the mean squared error between the network output and the desired output:

E = (1/2N) Σᵢ ‖dᵢ − yᵢ‖²

Then, in each iteration, adjust the parameters in the direction of the negative error gradient with a certain learning rate.
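One iteration of this scheme can be written as a small self-contained Python sketch (my own simplification of the gradient computation; spreads are per-unit, as in the article's later MATLAB code): compute the MSE cost, then move weights, centres and spreads down their gradients.

```python
import numpy as np

def train_step(X, Y, centers, delta, w, alpha=0.1):
    # one gradient-descent step on the MSE cost, updating weights, centres and spreads
    n = X.shape[1]
    dw = np.zeros_like(w); dc = np.zeros_like(centers); dd = np.zeros_like(delta)
    cost = 0.0
    for i in range(n):
        diff = X[:, i][:, None] - centers                 # input minus each centre
        g = np.exp(-np.sum(diff ** 2, axis=0) / (2 * delta ** 2))
        e = w @ g - Y[:, i]                               # output-layer residual
        cost += np.sum(e ** 2)
        dw += np.outer(e, g)
        d2 = (w.T @ e) * g                                # hidden-layer residual
        dc += d2 * diff / delta ** 2
        dd += d2 * np.sum(diff ** 2, axis=0) / delta ** 3
    return (0.5 * cost / n,
            w - alpha * dw / n,
            centers - alpha * dc / n,
            delta - alpha * dd / n)

# fit y = 2x: the cost falls over successive iterations
rng = np.random.default_rng(1)
X = rng.random((2, 6)); Y = 2 * X
centers, delta = X.copy(), np.full(6, 0.5)
w = rng.random((2, 6)) * 2 - 1
costs = []
for _ in range(20):
    c, w, centers, delta = train_step(X, Y, centers, delta, w)
    costs.append(c)
print(costs[0], costs[-1])
```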

4. Code implementation

1. The first method

Zhangchaoyang's blog has a C++ implementation of the first method, but it is for scalar data (both input and output are one-dimensional). MATLAB also provides an improved version of the first method (in my opinion; you can run `open newrb` in MATLAB to view the source code).

The function MATLAB provides is newrb(). It automatically increases the number of hidden neurons in the network until the mean squared error meets the required precision or the number of neurons reaches the maximum (i.e. the number of samples we provide; when the number of neurons equals the number of samples, the RBF network's mean squared error on the training set is 0). It can be used very simply:

rbf = newrb(train_x, train_y);

output = rbf(test_x);

Just give it the training samples to obtain an RBF network; then give it the input to obtain the network's output.

2. The second method

Zhangchaoyang's blog also has a C++ implementation of the second method, but again for scalar data (both input and output are one-dimensional). I work with images, so the network needs to accept high-dimensional input, and in MATLAB vector operations are much faster than loops for training. So I wrote a version that accepts vector input and vector output and is trained with supervision via the BP algorithm. For the BP algorithm see: Backpropagation algorithm; the main thing is to compute the residual of each node in each layer. My code passes gradient checking, but on some training sets the value of the cost function rises with the number of iterations, which is very strange, and lowering the learning rate does not help. On some simpler training sets it still works, although the training error is also quite large (it does not fully fit the training samples). So if you find anything wrong in the code, please tell me.

The main code is shown below:

learnrbf.m


% This is a RBF network trained by BP algorithm
% Author: zouxy
% date: 2013-10-28
% homepage: http://blog.csdn.net/zouxy09
% email: [Email protected]
close all; clear; clc;

%%% ************************************************
%%% ************ Step 0: load data ****************
display('step 0: load data...');
% train_x = [1 2 3 4 5 6 7 8]; % each sample arranged as a column of train_x
% train_y = 2 * train_x;
train_x = rand(5, 10);
train_y = 2 * train_x;
test_x = train_x;
test_y = train_y;

%% from matlab
% rbf = newrb(train_x, train_y);
% output = rbf(test_x);

%%% ************************************************
%%% ******** Step 1: initialize parameters ********
display('step 1: initialize parameters...');
numSamples = size(train_x, 2);
rbf.inputSize = size(train_x, 1);
rbf.hiddenSize = numSamples; % num of radial basis functions
rbf.outputSize = size(train_y, 1);
rbf.alpha = 0.1; % learning rate (should not be large!)

%% centres of RBF
for i = 1 : rbf.hiddenSize
    % randomly pick some samples to initialize the centres of the RBF
    index = randi([1, numSamples]);
    rbf.center(:, i) = train_x(:, index);
end

%% delta (spread) of RBF
rbf.delta = rand(1, rbf.hiddenSize);

%% weight of RBF
r = 1.0; % random numbers between [-r, r]
rbf.weight = rand(rbf.outputSize, rbf.hiddenSize) * 2 * r - r;

%%% ************************************************
%%% ************ Step 2: start training ************
display('step 2: start training...');
maxIter = 400;
preCost = 0;
for i = 1 : maxIter
    fprintf(1, 'Iteration %d, ', i);
    rbf = trainRBF(rbf, train_x, train_y);
    fprintf(1, 'the cost is %d\n', rbf.cost);
    curCost = rbf.cost;
    if abs(curCost - preCost) < 1e-8
        disp('Reached iteration termination condition, terminating now!');
        break;
    end
    preCost = curCost;
end

%%% ************************************************
%%% ************ Step 3: start testing ************
display('step 3: start testing...');
g = zeros(rbf.hiddenSize, 1); % g renamed from "green" to avoid shadowing the green() function
for i = 1 : size(test_x, 2)
    for j = 1 : rbf.hiddenSize
        g(j, 1) = green(test_x(:, i), rbf.center(:, j), rbf.delta(j));
    end
    output(:, i) = rbf.weight * g;
end
disp(test_y);
disp(output);

trainRBF.m


function [rbf] = trainRBF(rbf, train_x, train_y)
%%% Step 1: calculate gradient
numSamples = size(train_x, 2);
g = zeros(rbf.hiddenSize, 1); % g renamed from "green" to avoid shadowing the green() function
output = zeros(rbf.outputSize, 1);
delta_weight = zeros(rbf.outputSize, rbf.hiddenSize);
delta_center = zeros(rbf.inputSize, rbf.hiddenSize);
delta_delta = zeros(1, rbf.hiddenSize);
rbf.cost = 0;
for i = 1 : numSamples
    %% feed forward
    for j = 1 : rbf.hiddenSize
        g(j, 1) = green(train_x(:, i), rbf.center(:, j), rbf.delta(j));
    end
    output = rbf.weight * g;

    %% back propagation
    delta3 = -(train_y(:, i) - output);
    rbf.cost = rbf.cost + sum(delta3 .^ 2);
    delta_weight = delta_weight + delta3 * g';
    delta2 = rbf.weight' * delta3 .* g;
    for j = 1 : rbf.hiddenSize
        delta_center(:, j) = delta_center(:, j) + delta2(j) .* (train_x(:, i) - rbf.center(:, j)) ./ rbf.delta(j)^2;
        delta_delta(j) = delta_delta(j) + delta2(j) * sum((train_x(:, i) - rbf.center(:, j)) .^ 2) ./ rbf.delta(j)^3;
    end
end

%%% Step 2: update parameters
rbf.cost = 0.5 * rbf.cost ./ numSamples;
rbf.weight = rbf.weight - rbf.alpha .* delta_weight ./ numSamples;
rbf.center = rbf.center - rbf.alpha .* delta_center ./ numSamples;
rbf.delta = rbf.delta - rbf.alpha .* delta_delta ./ numSamples;
end

green.m


function greenValue = green(x, c, delta)
greenValue = exp(-1.0 * sum((x - c) .^ 2) / (2 * delta^2));
end

5. Code testing

First I tested one-dimensional input, with a simple function to fit: y = 2x.

train_x = [1 2 3 4 5 6 7 8];

train_y = 2 * train_x;

So the expected output is:

2 4 6 8 10 12 14 16

After 200 training iterations, my code's network output is:

2.0042 4.0239 5.9250 8.0214 10.0692 11.9351 14.0179 15.9958

The output of MATLAB's newrb is:

2.0000 4.0000 6.0000 8.0000 10.0000 12.0000 14.0000 16.0000

As you can see, MATLAB fits perfectly, while my version's mean squared error is still quite large.

Then I tested high-dimensional input. The training samples were generated with MATLAB's rand(5, 10), i.e. a 5×10 matrix of random numbers in [0, 1]: 10 samples, each 5-dimensional. Again I tested the very simple function y = 2x. The results are as follows:

I won't comment on the result. I hope someone can find where the code goes wrong and let me know; many thanks.

The above is copied over; the author's MATLAB code could be improved further, but it is a good article to learn from.
