1 Graph Neural Network (original version)
Graph neural networks are now widely used and powerful. I have been slowly working through the papers, starting from the original ones, and writing down my own views and insights as I go. My background is in mathematics, so I prefer to follow the mathematical derivations. This first article introduces the idea behind the Graph Neural Network; later graph neural network models were gradually improved on top of this one.
2 Problem domain
The model targets common non-structural (non-Euclidean) data, such as molecular structures and social networks. Depending on the task, such data can be handled differently through the choice of the output function g(x); the contribution of the graph neural model is showing how to learn a representation of this kind of non-structural data.
3 Model

3.1 Introduction
First of all, a graph carries two kinds of information: the labels attached to the nodes (and edges), and the "state" of each node. We use x(i) to denote the state of node i; it is the representation of the node that the model learns from the graph information. Intuitively, the state of a node should depend on the node's own label, the labels of the surrounding nodes, and the labels of the incident edges (which can be thought of as, for example, distances between nodes). We then use a function f to learn from these inputs, so we get the following.
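In the original paper's notation (my transcription of it), the state equation is:

```latex
x_n \;=\; f_w\bigl(l_n,\; l_{\mathrm{co}[n]},\; x_{\mathrm{ne}[n]},\; l_{\mathrm{ne}[n]}\bigr)
```

where l_n is the label of node n, l_co[n] the labels of its incident edges, x_ne[n] the states of its neighbors, and l_ne[n] the labels of those neighbors.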
The work we have to do is learn the state of every node in the graph, but we immediately run into a problem: the state x(i) of node i depends on the state x(j) of a neighboring node j, and x(j) in turn depends on x(i), so the two are mutually dependent and form a cycle. The hypothesis of the model is that we can solve for the states of the whole graph by repeated iteration, as a fixed-point computation.
3.2
We introduce our output function g(x, l): the output of a node is determined by the state of that node together with its label.
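Written out in the same notation as the state equation:

```latex
o_n \;=\; g_w(x_n,\; l_n)
```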
Thus we have the two functions of the whole model: one (f) to solve for the states of the graph, and one (g) to map the states to outputs (designed according to the actual task).
The key question is how iterating the function f can solve for the states of the whole graph. Mathematically, the Banach fixed-point theorem tells us that when the norm of the derivative of f with respect to x is less than 1 (i.e., f is a contraction map), convergence is guaranteed.
The iterative process simply updates the states of round t+1 from the states of round t, until the states of the whole graph converge.
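The update rule above can be illustrated with a toy example. This is not the paper's f_w; it is a hypothetical linear contraction map, chosen so that the derivative condition clearly holds, just to show the fixed-point iteration converging:

```python
import numpy as np

# Hypothetical contraction map: f(x) = A @ x + b, with norm(A) < 1.
# The paper's f_w plays the same role over all node states at once.
rng = np.random.default_rng(0)
A = 0.5 * rng.random((4, 4)) / 4      # scaled so the norm is well below 1
b = rng.random(4)

x = np.zeros(4)
for t in range(100):                  # x(t+1) = f(x(t))
    x_next = A @ x + b
    if np.linalg.norm(x_next - x) < 1e-10:   # states converged
        break
    x = x_next

# The fixed point satisfies x = A x + b, i.e. x = (I - A)^{-1} b,
# so we can check the iteration against the closed-form solution.
x_closed = np.linalg.solve(np.eye(4) - A, b)
```

The iterated state matches the closed-form fixed point, which is exactly the convergence the contraction condition buys us.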
3.3
So now we can turn the solution process into rounds of computation, where both g and f are neural network structures you can design yourself. We unroll the entire computation into the following form, in which the front part is the iterative process that solves for the states of the graph.
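A minimal sketch of that unrolling on a toy graph. All names here are hypothetical, and f is just a weight-shared linear layer followed by tanh; the smallish weights are meant to encourage (not guarantee) a contraction:

```python
import numpy as np

# Toy graph: 3 nodes on a path, stored as an adjacency list.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
labels = np.eye(3)                    # one-hot node labels l_n (hypothetical)

dim = 3
rng = np.random.default_rng(1)
W_state = 0.3 * rng.standard_normal((dim, dim))  # shared across all rounds
W_label = rng.standard_normal((dim, dim))

x = np.zeros((3, dim))                # states x_n, one row per node
for t in range(50):                   # each round = one unrolled "layer"
    x_new = np.zeros_like(x)
    for n, ne in neighbors.items():
        agg = sum(x[m] for m in ne)   # aggregate neighbor states x_ne[n]
        x_new[n] = np.tanh(agg @ W_state + labels[n] @ W_label)
    x = x_new
```

The same weights W_state, W_label are applied in every round, which is what makes the unrolled form a recurrent network rather than a deep feed-forward one.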
3.4
That completes the general idea of the graph neural model. Next I will introduce the gradient descent and the derivation process. Because the graph neural network needs to ensure the states converge while solving, the iterative process in the next step differs from the usual one.
First, let us introduce the existence of an implicit function.
This function reflects the gap between the exact state x we really want and the x(t) we obtain at round t. With it, one can theoretically prove that for a parameter w a solution x exists, and thereby associate the parameter w with the converged state x.
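My reading: the "implicit function" here is the one from the implicit function theorem. Writing the fixed-point condition as

```latex
\psi(x, w) \;=\; x - F_w(x, l) \;=\; 0
```

the theorem guarantees that, as long as I − ∂F_w/∂x is invertible (which the contraction condition ensures), a differentiable solution x(w) exists. This is what ties the parameters w to the converged state x.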
3.5
Then we introduce our loss function e. How this function defines the loss is closely tied to your output function g and needs to be designed by yourself, so I will not describe it further. Expanding the model structure, we obtain the following derivation formulas.
The rule for differentiating through time is very close to a traditional RNN (backpropagation through time), so I will not repeat it. By our convergence hypothesis, after a certain number of iterations z(t) equals z(t+1).
From (8) we get (9). Then, using the implicit function whose existence we proved above, and the rule for differentiating implicit functions, we obtain (10) and (11). The derivative in the other direction is not taken from the unrolled model; instead, taking the partial derivative directly from the definition gives the corresponding results.
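For reference, in the paper's notation the backward fixed point and the final gradient take roughly the following form (my transcription, not a verbatim copy of the numbered equations):

```latex
z(t) \;=\; z(t+1)\,\frac{\partial F_w}{\partial x}(x, l) \;+\; \frac{\partial e_w}{\partial o}\,\frac{\partial G_w}{\partial x}(x, l_N),
\qquad
\frac{\partial e_w}{\partial w} \;=\; z\,\frac{\partial F_w}{\partial w}(x, l) \;+\; \frac{\partial e_w}{\partial o}\,\frac{\partial G_w}{\partial w}(x, l_N)
```

where F_w and G_w are the stacked versions of f and g over all nodes, and z is the converged value of z(t).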
All of the derivative formulas are thereby reduced to derivatives with respect to the parameters w. The iterative process for z(t) is then analogous to summing a (geometric) series.
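Numerically, a backward iteration of the form z(t) = z(t+1)·A + b accumulates exactly that series. A quick check with a hypothetical A (norm below 1) and b:

```python
import numpy as np

rng = np.random.default_rng(2)
A = 0.4 * rng.random((3, 3)) / 3      # norm(A) < 1, so the series converges
b = rng.random(3)

# Backward fixed-point iteration: z <- z A + b
z = np.zeros(3)
for _ in range(200):
    z = z @ A + b

# Same thing as the geometric series b (I + A + A^2 + ...) = b (I - A)^{-1}
z_series = b @ np.linalg.inv(np.eye(3) - A)
```

The iterated z and the closed-form series sum agree, which is the "sum of a series" analogy made concrete.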
With that, we have all the derivation rules we need.
3.6 Model algorithm
In the derivation we assumed that the states have converged to a fixed value before applying the gradient formulas, so the algorithm needs two extra steps: first verify convergence of the forward states, then run the gradient computation. The overall algorithm block diagram is as follows.
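Putting it together, the training step alternates a forward pass iterated to convergence with a backward pass iterated to convergence. This is a schematic sketch, not the paper's code; the linear maps stand in for F_w and its Jacobian, and all thresholds are hypothetical:

```python
import numpy as np

def forward(A, b, eps=1e-10, max_iter=1000):
    """Iterate x <- A x + b (stand-in for F_w) until the state converges."""
    x = np.zeros(b.shape)
    for _ in range(max_iter):
        x_new = A @ x + b
        if np.linalg.norm(x_new - x) < eps:   # convergence check step
            return x_new
        x = x_new
    return x

def backward(A, grad_out, eps=1e-10, max_iter=1000):
    """Iterate z <- z A + grad_out until the gradient accumulator converges."""
    z = np.zeros(grad_out.shape)
    for _ in range(max_iter):
        z_new = z @ A + grad_out
        if np.linalg.norm(z_new - z) < eps:   # convergence check step
            return z_new
        z = z_new
    return z

rng = np.random.default_rng(3)
A = 0.5 * rng.random((4, 4)) / 4   # contraction, norm well below 1
b = rng.random(4)
x = forward(A, b)                  # converged states
z = backward(A, rng.random(4))     # converged gradient accumulator
```

In the real algorithm, x and z would then feed a gradient step on w before the next forward/backward round.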
4 Summary
The contribution of the overall model is solving how to learn features of non-structural data, using iteration to a converged value as the learning mechanism. We may also notice that, in how it treats the link between two nodes, the model is similar in spirit to current graph embedding ideas, but it additionally brings the edges into the learning.
5 Questions and my own thoughts
1) The model requires the derivative of f with respect to x to stay below 1 throughout the computation. This prevents the model from stacking deeper layers: with many layers, vanishing gradients and similar problems are bound to appear.
2) There is no effective way to learn the edge information.
Paper: "The Graph Neural Network Model"