Deep learning neural network: multi-layer network and C++ implementation
1. Introduction
In the previous article on neural networks, we described the construction and working principle of a single neuron, and derived the gradient-descent weight-update rule for a single sigmoid unit. At the end we gave an example: features that are 4-dimensional unit vectors, mapped into the one-dimensional space [0, 1] by a single perceptron unit. The experiments showed that after 15,000 training iterations (in practice, convergence takes about 5,000), the unit's output on any given feature vector comes very close to the expected result. In practice, however, a single neuron cannot fit a mapping that is too complicated; we need to build a more complex network to approximate a more complex target function. At the end of this article we will apply a multi-layer network to the example from the previous one, and it converges well after only 300-500 training iterations.
This is the second article on the topic of neural networks. It mainly introduces the construction of a multi-layer network and the updating of its weights by the backpropagation algorithm. Finally, we will build the whole structure in C++ step by step.
2. Multi-layer network construction
By definition, a multi-layer network is composed of multiple layers, each consisting of several neuron nodes. A node in any layer is connected to every node in the previous layer, which provides its inputs; the node computes its output, which in turn serves as input to the next layer.
It is worth noting that any multi-layer network must have an input layer and an output layer. The following figure gives a more intuitive picture of the structure:
[Figure: a three-layer network structure with an input layer, one hidden layer, and an output layer]
Let us walk through the construction of a multi-layer network using the figure. Shown above is a 3-layer network structure. It consists of an input layer, an output layer, and one hidden layer; of course there can be more hidden layers. Hidden-layer node $i$ in the figure is connected to every node of the input layer, which is to say it accepts an input vector $input = [x_1, x_2, x_3, \dots, x_n]$, and the $n$ lines connected to it represent the $n$ input weights. Note in particular the red connection drawn on the hidden-layer node and on the output node: it represents the bias, that is, $w_0$.
Recalling the contents of the previous article, we know what node $i$ in the figure does: it forms the linear weighted sum of its inputs and the corresponding weights, and then computes the node's output through the sigmoid function:
$net_i = w_0 + x_1 w_1 + x_2 w_2 + \cdots + x_n w_n = \sum_{j=0}^{n} x_j w_j \quad (\text{where } x_0 = 1)$

$o_i = \dfrac{1}{1 + e^{-net_i}}$
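As a minimal sketch of this forward computation in C++ (the names and layout here are our own illustration, not the article's final code):

```cpp
#include <cmath>
#include <vector>

// Sigmoid activation: o = 1 / (1 + e^(-net)).
double sigmoid(double net) {
    return 1.0 / (1.0 + std::exp(-net));
}

// Forward pass of a single node: bias plus the weighted sum of the
// inputs, pushed through the sigmoid. weights[0] is the bias w0,
// i.e. the weight of the implicit input x0 = 1.
double node_output(const std::vector<double>& weights,
                   const std::vector<double>& inputs) {
    double net = weights[0];                 // w0 * x0, with x0 = 1
    for (std::size_t j = 0; j < inputs.size(); ++j)
        net += weights[j + 1] * inputs[j];   // + x_j * w_j
    return sigmoid(net);
}
```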
The entire multi-layer network starts from a set of inputs and, using the weight on each connection, carries the forward computation through continuously. Let us further quantify the network structure above. First, it is a three-layer structure. The first layer is the input layer; it performs no computation and has no incoming connections. The second layer has 3 nodes and 3 × (4 + 1) connections; note the extra bias line on each node. The last layer is the output layer; it has 4 nodes and 4 × (3 + 1) connections. The whole network is therefore structured as [layer1, layer2, layer3], and each layer is structured as layer = [nodes, weights].
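One way to mirror this [nodes, weights] layout in C++ is sketched below. This is our own illustrative structure, under the assumption that each node stores its bias-first weight vector and that the input layer, having no weights, is not stored:

```cpp
#include <vector>

// A node holds its weight vector, bias first: weights[0] = w0.
struct Node {
    std::vector<double> weights;
};

// A layer is its set of nodes; the network is the list of layers,
// matching the [layer1, layer2, layer3] layout in the text.
struct Layer {
    std::vector<Node> nodes;
};

struct Network {
    std::vector<Layer> layers;
};

Layer make_layer(std::size_t num_nodes, std::size_t inputs_per_node) {
    // Each node gets (inputs + 1) weights: one bias plus one per input.
    // Real code would initialize these to small random values.
    Node node{std::vector<double>(inputs_per_node + 1, 0.1)};
    return Layer{std::vector<Node>(num_nodes, node)};
}

// The example from the text: 4 network inputs, a hidden layer of
// 3 nodes (3 * (4 + 1) weights), and an output layer of 4 nodes
// (4 * (3 + 1) weights). The input layer holds no weights.
Network make_example_network() {
    Network net;
    net.layers.push_back(make_layer(3, 4));
    net.layers.push_back(make_layer(4, 3));
    return net;
}
```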
3. Backpropagation algorithm
In the previous article we discussed gradient descent for updating the weight vector of a single node. For a multi-layer network we can use a similar method to derive an update rule for all of the network's weights. We call this method the backpropagation algorithm, because it updates the weights starting from the output layer.
So, should we consider the output layer first, or the total output error first? Unlike a single perceptron unit, which has one output, a multi-layer network has multiple outputs, and its error can be expressed by the least-mean-squares (LMS) rule as follows:
$E(W) = \dfrac{1}{2} \sum_{d \in D} \sum_{k \in outputs} (t_{kd} - o_{kd})^2$
Here $outputs$ is the set of the network's output units, $D$ is the space of all training samples, and $t_{kd}$ and $o_{kd}$ are the target value and the actual output value associated with training sample $d$ and output unit $k$.
Our goal is to search a huge hypothesis space, made up of all possible values of the weights in the network. Viewed geometrically, this search space forms an error surface, and we need to find a minimum point on that surface. Gradient descent is obviously one way to do this: at any point on the surface we compute the gradient direction and then change the weights in the opposite direction, which makes the error smaller.
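Written out, this step is the usual gradient-descent update from the previous article, where $\eta$ denotes the learning rate (a symbol not introduced above):

$\Delta w_{ji} = -\eta \dfrac{\partial E}{\partial w_{ji}}, \qquad w_{ji} \leftarrow w_{ji} + \Delta w_{ji}$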
Following the stochastic gradient descent rule from the earlier article, we apply it to the backpropagation algorithm for multi-layer networks: we handle only one sample at a time, update each weight, and gradually adjust the weights over a large number of samples. For each training sample $d$, the output error is:
$E_d(W) = \dfrac{1}{2} \sum_{k \in outputs} (t_{kd} - o_{kd})^2$
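A small C++ sketch of this per-sample error (a helper of our own, assuming targets and outputs are indexed the same way):

```cpp
#include <vector>

// E_d(W) = 1/2 * sum over output units k of (t_kd - o_kd)^2.
double sample_error(const std::vector<double>& targets,
                    const std::vector<double>& outputs) {
    double error = 0.0;
    for (std::size_t k = 0; k < outputs.size(); ++k) {
        double diff = targets[k] - outputs[k];
        error += diff * diff;
    }
    return 0.5 * error;
}
```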
The connection weights on the output-layer nodes obviously affect the final error directly, while the weights on the hidden-layer connections affect it only indirectly, so we derive the backpropagation update in these two separate cases.
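For reference, the standard textbook results of those two derivations for sigmoid units (stated here ahead of the derivation; $\delta_j$ denotes the error term of unit $j$, $x_{ji}$ the $i$-th input to unit $j$, and $w_{kh}$ the weight from hidden unit $h$ to output unit $k$) are:

$\delta_k = o_k (1 - o_k)(t_k - o_k) \quad \text{for output units}$

$\delta_h = o_h (1 - o_h) \sum_{k \in outputs} w_{kh} \, \delta_k \quad \text{for hidden units}$

with each weight then updated as $\Delta w_{ji} = \eta \, \delta_j \, x_{ji}$.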