Neural Networks in Machine Learning
1. The neuron model
Research on neural networks has a long history. "Neural network" has always been a broad, interdisciplinary field with many definitions. The definition adopted in Zhou Zhihua's book *Machine Learning* is: "Neural networks are massively parallel interconnected networks of simple, adaptive units, whose organization can simulate the interactive responses of a biological nervous system to real-world objects."
The most fundamental component of a neural network is the neuron model, i.e. the "simple unit" in the definition above. In a biological network, each neuron is connected to other neurons; when it becomes "excited" (its electrical potential exceeds a certain "threshold"), it fires, releasing chemical substances to the connected neurons and changing their potentials. The figure above compares biological neurons with the neuron model; the resemblance is striking.
Source: Machine Learning - Zhou Zhihua
The figure above shows the "M-P neuron" model. The neuron receives input signals from n other neurons; these inputs are passed in through weighted connections, summed, compared against the neuron's threshold, and the result is fed through an activation function to produce the output.
Typical activation functions include the step function and the sigmoid function.
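To make this concrete, here is a minimal Python sketch of an M-P neuron with the two activation functions just mentioned (the names `step`, `sigmoid`, and `mp_neuron` are illustrative, not from the book):

```python
import math

def step(x):
    # Step (Heaviside) activation: output 1 if x >= 0, else 0
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    # Sigmoid activation: smooth, differentiable squashing into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def mp_neuron(inputs, weights, threshold, activation=step):
    # M-P neuron: weighted sum of inputs, minus the threshold,
    # passed through the activation function
    total = sum(w * x for w, x in zip(weights, inputs))
    return activation(total - threshold)
```

For example, `mp_neuron([1, 1], [1.0, 1.0], 2.0)` fires (outputs 1.0) because the weighted sum 2.0 reaches the threshold, while `mp_neuron([1, 0], [1.0, 1.0], 2.0)` does not.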
So the question is: why do we need an activation function at all?
Because without an activation function (or with the identity function as the activation), the function the network ultimately computes is linear, so it can only handle simple linearly separable problems; this limitation is severe. In reality, many problems are nonlinear: without an activation function the network cannot even implement simple XOR logic. Introducing the activation function injects a nonlinear factor into the model, giving it the ability to handle nonlinear problems. The detailed derivation is omitted here.
Then a further question arises: why, after adding hidden neurons with nonlinear activation functions, can a neural network approximate arbitrarily complex continuous functions? This is the content of the universal approximation theorem: a feedforward network with a single hidden layer containing enough neurons, using a suitable nonlinear activation, can approximate any continuous function on a compact set to arbitrary accuracy.
A neural network is obtained by connecting many such neurons together in a layered structure.
2. The perceptron and multi-layer networks
A perceptron (Perceptron) consists of two layers of neurons: the input layer receives external input signals and passes them to the output layer.
The output layer is an M-P neuron, also called a "threshold logic unit"; the input-layer neurons simply pass the data to the output layer through weighted connections. A perceptron can easily implement the logical AND, OR, and NOT operations. Assume the activation function used in the output layer is the step function. Then:
AND: the output is 1 only when both inputs are 1.
OR: the output is 1 when at least one input is 1.
NOT: when the input is 0 the output is 1; when the input is 1 the output is 0.
Having understood the perceptron, the next question is how its parameters are determined, i.e. how the algorithm "learns". Given a training set, the weights and the threshold can be learned; the threshold can be treated as the weight of a "dummy node" with a fixed input of -1.0 (it could equally be a dummy input fixed at +1 with a suitably signed weight), so that the threshold is learned in exactly the same way as the weights. The perceptron learning rule is simple: for a training example (x, y) with current output ŷ, update each weight by w_i ← w_i + η(y − ŷ)x_i, where η is the learning rate.
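The rule above can be sketched as follows (`train_perceptron` is an illustrative helper name; the threshold is folded in as the weight of a dummy input fixed at -1.0, as described):

```python
def train_perceptron(samples, eta=0.1, epochs=100):
    # samples: list of ((x1, ..., xn), target) pairs with 0/1 targets
    n = len(samples[0][0])
    w = [0.0] * (n + 1)  # last weight plays the role of the threshold
    for _ in range(epochs):
        for x, y in samples:
            xe = list(x) + [-1.0]  # extend input with the dummy node
            y_hat = 1 if sum(wi * xi for wi, xi in zip(w, xe)) >= 0 else 0
            # Perceptron rule: w_i <- w_i + eta * (y - y_hat) * x_i
            for i in range(n + 1):
                w[i] += eta * (y - y_hat) * xe[i]
    return w
```

Trained on the four examples of logical AND, the learned weights classify all of them correctly, as guaranteed by the perceptron convergence theorem for linearly separable data.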
However, because only the output-layer neuron applies an activation function, i.e. the perceptron has just one layer of functional neurons, it can only handle linearly separable problems (those for which a linear hyperplane separates the classes). It cannot handle nonlinearly separable problems, not even one as simple as XOR.
To handle nonlinearly separable problems, we need multiple layers of functional neurons (in the perceptron, only the output layer is functional). Each neuron is fully connected to the neurons of the next layer; there are no connections between neurons within the same layer and no cross-layer connections. Such a network structure is called a "multi-layer feedforward neural network".
The input-layer neurons only receive external input; the layers between the input layer and the output layer are called "hidden layers". Apart from the input neurons, every neuron is a functional neuron capable of processing signals. A network with a single hidden layer is called a "single-hidden-layer network"; any network with at least one hidden layer counts as a multi-layer network. The learning process of a neural network consists of repeatedly adjusting the "connection weights" between neurons and the threshold of each functional neuron according to the training data.
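To see that a single hidden layer already suffices for XOR, here is a hand-wired two-layer network with step activations (the weights are one illustrative choice, not the only one):

```python
def step(x):
    return 1 if x >= 0 else 0

def xor_net(x1, x2):
    # Hidden layer: h1 = (x1 AND NOT x2), h2 = (x2 AND NOT x1)
    h1 = step(1.0 * x1 - 1.0 * x2 - 0.5)
    h2 = step(-1.0 * x1 + 1.0 * x2 - 0.5)
    # Output layer: h1 OR h2
    return step(1.0 * h1 + 1.0 * h2 - 0.5)
```

Checking all four input combinations shows the network outputs 1 exactly when the inputs differ, which no single-layer perceptron can do.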
3. The error backpropagation algorithm
The learning ability of multi-layer networks is much stronger than that of single-layer perceptrons. To train multi-layer networks, the simple perceptron learning rule above is clearly insufficient, and a more powerful algorithm is needed. Error backpropagation (BP) is such an algorithm. BP can be applied not only to multi-layer feedforward networks but also to other types of neural networks, although "BP network" usually refers to a multi-layer feedforward network trained with the BP algorithm.
The setting for the BP algorithm: given a training set D = {(x1, y1), (x2, y2), ..., (xm, ym)}, each input x is described by d attributes, and each output y is an l-dimensional real vector.
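A minimal sketch of the BP updates for a single-hidden-layer network, using the book's symbols (v, γ for input-to-hidden weights and hidden thresholds; w, θ for hidden-to-output weights and output thresholds; g and e for the output- and hidden-layer gradient terms). The names `train_bp` and `predict` are illustrative, and sigmoid activations with a squared-error loss are assumed:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_bp(data, n_hidden=4, eta=0.5, epochs=5000, seed=0):
    # data: list of (x, y) pairs, where x and y are lists of floats
    rng = random.Random(seed)
    d, l = len(data[0][0]), len(data[0][1])
    v = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(d)]
    gamma = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w = [[rng.uniform(-0.5, 0.5) for _ in range(l)] for _ in range(n_hidden)]
    theta = [rng.uniform(-0.5, 0.5) for _ in range(l)]
    for _ in range(epochs):
        for x, y in data:
            # Forward pass: hidden outputs b, network outputs out
            b = [sigmoid(sum(v[i][h] * x[i] for i in range(d)) - gamma[h])
                 for h in range(n_hidden)]
            out = [sigmoid(sum(w[h][j] * b[h] for h in range(n_hidden)) - theta[j])
                   for j in range(l)]
            # Backward pass: g_j for output units, e_h for hidden units
            g = [out[j] * (1 - out[j]) * (y[j] - out[j]) for j in range(l)]
            e = [b[h] * (1 - b[h]) * sum(w[h][j] * g[j] for j in range(l))
                 for h in range(n_hidden)]
            # Gradient updates for weights and thresholds
            for h in range(n_hidden):
                for j in range(l):
                    w[h][j] += eta * g[j] * b[h]
            for j in range(l):
                theta[j] -= eta * g[j]
            for i in range(d):
                for h in range(n_hidden):
                    v[i][h] += eta * e[h] * x[i]
            for h in range(n_hidden):
                gamma[h] -= eta * e[h]
    return v, gamma, w, theta

def predict(params, x):
    v, gamma, w, theta = params
    d, n_hidden, l = len(v), len(gamma), len(theta)
    b = [sigmoid(sum(v[i][h] * x[i] for i in range(d)) - gamma[h])
         for h in range(n_hidden)]
    return [sigmoid(sum(w[h][j] * b[h] for h in range(n_hidden)) - theta[j])
            for j in range(l)]
```

Each pass propagates the error from the output layer back to the hidden layer: g depends on the output error directly, while e accumulates the output gradients through the hidden-to-output weights, which is exactly what "error backpropagation" refers to.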