Machine Learning Neural Network
Neuron model
The most fundamental component of a neural network is the neuron model. In a biological neural network, each neuron is connected to other neurons. When a neuron is "excited", it sends chemical substances to the neurons it is connected to, thereby changing the electrical potential inside those neurons; if a neuron's potential exceeds a certain "threshold", it becomes activated, i.e., "excited", and in turn sends chemical substances to other neurons.
[Figure 5.1: the M-P neuron model]
The simple model shown in Figure 5.1 is the "M-P neuron model", which is still in use today. In this model, a neuron receives input signals from $n$ other neurons; these signals are transmitted through weighted connections. The total input received by the neuron is compared with the neuron's threshold, and the result is then passed through an "activation function" to produce the neuron's output.
The step function is the ideal activation function: it maps the input to an output of "0" or "1", where "1" corresponds to neuron excitation and "0" to neuron inhibition. However, the step function is discontinuous and non-smooth, which makes it poorly behaved, so the Sigmoid function is commonly used as the activation function instead, as shown below:
[Figure 5.2: the step function (a) and the Sigmoid function (b)]
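As a concrete illustration (a minimal sketch of my own, not from the original text; the function names are mine), the two activation functions can be written in Python as follows:

```python
import numpy as np

def step(x):
    # Ideal activation: 1 if the input is non-negative, else 0.
    # Discontinuous at 0, so unusable with gradient-based learning.
    return np.where(x >= 0.0, 1.0, 0.0)

def sigmoid(x):
    # Smooth, differentiable surrogate for the step function;
    # squashes any real input into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

print(step(np.array([-2.0, 0.5])))     # [0. 1.]
print(sigmoid(np.array([-2.0, 0.5])))  # approx. [0.119 0.622]
```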
The neural network is obtained by linking many of these neurons in a hierarchical structure.
Perceptron and multi-layer networks
The perceptron consists of two layers of neurons, as shown in Figure 5.3. The input layer receives external input signals and passes them to the output layer; the output layer is an M-P neuron, also known as a "threshold logic unit".
[Figure 5.3: the two-layer structure of a perceptron]
Given a training data set, the weights $w_i\ (i = 1, 2, \dots, n)$ and the threshold $\theta$ can both be learned. The threshold $\theta$ can be regarded as the weight of a "dummy node" whose input is fixed at $-1.0$, with corresponding connection weight $w_{n+1}$. In this way, learning the weights and the threshold is unified as learning weights alone (since the threshold term is simply $-1.0 \times w_{n+1}$).
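Written out explicitly (my own restatement of the dummy-node trick above):

$$y = f\Big(\sum_{i=1}^{n} w_i x_i - \theta\Big) = f\Big(\sum_{i=1}^{n+1} w_i x_i\Big), \qquad x_{n+1} = -1.0,\quad w_{n+1} = \theta .$$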
Perceptron learning rule: for a training example $(x, y)$, if the current perceptron output is $\hat{y}$, the perceptron weights are adjusted as follows:
$$w_i \leftarrow w_i + \Delta w_i$$

$$\Delta w_i = \eta (y - \hat{y}) x_i$$
Here $\eta \in (0, 1)$ is called the learning rate and is usually set to a small positive number, e.g. $0.1$. When the prediction is correct, the perceptron's weights do not change; otherwise the weights are adjusted in proportion to the degree of error.
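A minimal sketch of this update rule in Python (my own illustration, not code from the original text; the dummy input $-1.0$ absorbs the threshold as described above):

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, epochs=100):
    """Train a single perceptron with the rule w_i <- w_i + eta*(y - y_hat)*x_i."""
    # Append the fixed dummy input -1.0 so the threshold is learned as w_{n+1}.
    X = np.hstack([X, -np.ones((X.shape[0], 1))])
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            y_hat = 1.0 if xi @ w >= 0.0 else 0.0  # step activation
            w += eta * (yi - y_hat) * xi           # no change when prediction is correct
    return w

# Logical AND is linearly separable, so training converges.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)
w = train_perceptron(X, y)
print([1.0 if np.append(xi, -1.0) @ w >= 0.0 else 0.0 for xi in X])  # [0.0, 0.0, 0.0, 1.0]
```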
If the problem at hand is linearly separable, there exists a linear hyperplane separating the classes. In that case, as shown in Figure 5.4(a)-(c), the perceptron learning process is guaranteed to converge and find a suitable weight vector $w = (w_1; w_2; \dots; w_{n+1})$; otherwise the learning process oscillates and never settles. Figure 5.4(d) shows a problem that is not linearly separable.
[Figure 5.4: linearly separable problems (a)-(c) and a non-linearly-separable problem (d)]
To handle non-linearly-separable problems, we can use multiple layers of functional neurons. As shown in Figure 5.5, a two-layer perceptron can solve the XOR problem.
[Figure 5.5: a two-layer perceptron that solves the XOR problem]
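To make this concrete, here is a hand-set two-layer network for XOR (a sketch with weights of my own choosing, not necessarily the values in Figure 5.5): each hidden unit fires for exactly one of the two "mixed" inputs, and the output unit ORs them together.

```python
import numpy as np

def step(x):
    return np.where(x >= 0.0, 1.0, 0.0)

def xor_net(x1, x2):
    # Hidden layer: two threshold units carve the plane into two half-spaces.
    h1 = step(x1 - x2 - 0.5)   # fires only for (1, 0)
    h2 = step(-x1 + x2 - 0.5)  # fires only for (0, 1)
    # Output layer: OR of the two hidden units.
    return step(h1 + h2 - 0.5)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), "->", xor_net(x1, x2))  # 0, 1, 1, 0
```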
The layer of neurons between the input layer and the output layer is called the hidden layer. Both hidden-layer and output-layer neurons are functional neurons, i.e., neurons with activation functions.
[Figure 5.6: typical multi-layer feedforward network structures]
Common neural networks have the hierarchical structure shown in Figure 5.6: neurons in each layer are fully connected to the neurons in the next layer, while there are no connections between neurons within the same layer and no cross-layer connections. Such a network is usually called a "multi-layer feedforward neural network". Since the input-layer neurons only receive inputs and perform no function processing, while the hidden and output layers contain functional neurons, such a network is often referred to as a "two-layer network" or, less ambiguously, a "single-hidden-layer network".
The learning process of a neural network consists of adjusting the "connection weights" between neurons and the threshold of each functional neuron according to the training data.
Error back propagation algorithm
To train a multi-layer network, the simple perceptron learning rule above is clearly insufficient; a more powerful learning algorithm is needed. The error back-propagation (BP) algorithm is an outstanding example of such a neural network learning algorithm.
Given a training set $D = \{(x_1, y_1), (x_2, y_2), \dots, (x_m, y_m)\}$, $x_i \in \mathbb{R}^d$, $y_i \in \mathbb{R}^l$, i.e., each input example is described by $d$ attributes and the output is an $l$-dimensional real-valued vector.
[Figure 5.7: notation for a multi-layer feedforward network with d input, q hidden, and l output neurons]
Figure 5.7 shows a multi-layer feedforward network with $d$ input neurons, $l$ output neurons, and $q$ hidden neurons. Let $\theta_j$ denote the threshold of the $j$-th output-layer neuron and $\gamma_h$ the threshold of the $h$-th hidden-layer neuron. Let $v_{ih}$ be the connection weight between the $i$-th input neuron and the $h$-th hidden neuron, and $w_{hj}$ the connection weight between the $h$-th hidden neuron and the $j$-th output neuron. The input received by the $h$-th hidden neuron is $\alpha_h = \sum_{i=1}^{d} v_{ih} x_i$, and the input received by the $j$-th output neuron is $\beta_j = \sum_{h=1}^{q} w_{hj} b_h$, where $b_h$ is the output of the $h$-th hidden neuron. Assume that both the hidden-layer and output-layer neurons use the Sigmoid function of Figure 5.2(b).
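Under these definitions, the forward pass can be sketched in a few lines of NumPy (my own illustration; the variable names mirror the symbols above, and the sizes are arbitrary):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, V, gamma, W, theta):
    """Forward pass of the Fig. 5.7 network.
    x: (d,) input; V: (d, q) input->hidden weights v_ih; gamma: (q,) hidden thresholds;
    W: (q, l) hidden->output weights w_hj; theta: (l,) output thresholds."""
    alpha = x @ V                  # alpha_h = sum_i v_ih * x_i
    b = sigmoid(alpha - gamma)     # hidden outputs b_h
    beta = b @ W                   # beta_j = sum_h w_hj * b_h
    y_hat = sigmoid(beta - theta)  # network outputs
    return b, y_hat

rng = np.random.default_rng(0)
d, q, l = 2, 3, 1  # tiny example sizes
x = np.array([0.5, -1.0])
b, y_hat = forward(x, rng.normal(size=(d, q)), rng.normal(size=q),
                   rng.normal(size=(q, l)), rng.normal(size=l))
print(y_hat)
```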
For a training example $(x_k, y_k)$, suppose the network's output is $\hat{y}_k = (\hat{y}_1^k, \hat{y}_2^k, \dots, \hat{y}_l^k)$, i.e.

$$\hat{y}_j^k = f(\beta_j - \theta_j) \tag{5.3}$$
The mean squared error of the network on $(x_k, y_k)$ is then

$$E_k = \frac{1}{2} \sum_{j=1}^{l} \left(\hat{y}_j^k - y_j^k\right)^2 \tag{5.4}$$
The network of Figure 5.7 has $d \times q + l \times q + q + l$ parameters that need to be determined: $d \times q$ input-to-hidden weights, $l \times q$ hidden-to-output weights, $q$ hidden thresholds, and $l$ output thresholds. We take the hidden-to-output connection weight $w_{hj}$ in Figure 5.7 as an example to carry out the derivation.
The BP algorithm is based on the gradient descent strategy: it adjusts the parameters in the direction of the negative gradient of the objective.
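Filling in the step the text points at (standard chain-rule algebra under the definitions above, written out by me): for the hidden-to-output weight $w_{hj}$, with learning rate $\eta$,

$$\Delta w_{hj} = -\eta \frac{\partial E_k}{\partial w_{hj}}, \qquad
\frac{\partial E_k}{\partial w_{hj}} = \frac{\partial E_k}{\partial \hat{y}_j^k} \cdot \frac{\partial \hat{y}_j^k}{\partial \beta_j} \cdot \frac{\partial \beta_j}{\partial w_{hj}} .$$

Since $\beta_j = \sum_h w_{hj} b_h$ gives $\partial \beta_j / \partial w_{hj} = b_h$, and the Sigmoid satisfies $f'(x) = f(x)\,(1 - f(x))$, defining $g_j = \hat{y}_j^k (1 - \hat{y}_j^k)(y_j^k - \hat{y}_j^k)$ yields

$$\Delta w_{hj} = \eta\, g_j\, b_h .$$

The remaining updates follow by the same pattern (stated here without derivation): $\Delta \theta_j = -\eta g_j$, $\Delta v_{ih} = \eta e_h x_i$, and $\Delta \gamma_h = -\eta e_h$, with $e_h = b_h (1 - b_h) \sum_j w_{hj} g_j$. A minimal sketch of one such gradient step in Python (my own code, reusing the `sigmoid` and shapes from the forward-pass sketch above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bp_step(x, y, V, gamma, W, theta, eta=0.1):
    """One gradient-descent step of standard BP on a single example (sketch)."""
    b = sigmoid(x @ V - gamma)      # hidden outputs b_h
    y_hat = sigmoid(b @ W - theta)  # network outputs, Eq. (5.3)
    g = y_hat * (1.0 - y_hat) * (y - y_hat)  # output-layer terms g_j
    e = b * (1.0 - b) * (W @ g)              # hidden-layer terms e_h
    W += eta * np.outer(b, g)       # Delta w_hj = eta * g_j * b_h
    theta += -eta * g               # Delta theta_j = -eta * g_j
    V += eta * np.outer(x, e)       # Delta v_ih = eta * e_h * x_i
    gamma += -eta * e               # Delta gamma_h = -eta * e_h
    return V, gamma, W, theta
```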