The Principles of Convolutional Neural Networks (CNNs)
The convolutional neural network (CNN) is an important part of deep learning, owing to its excellent learning performance, especially on image recognition. In recent years CNNs have become extremely popular, and many models have emerged, such as LeNet, AlexNet, ZF Net, and so on. Most college students are more familiar with MATLAB than with other tools, while online tutorials are mostly based on the Caffe framework or Python, which makes getting started hard for beginners. This article therefore uses MATLAB with handwritten digits from the MNIST database to implement handwritten digit recognition. My own knowledge is limited; if there are mistakes, I hope experienced readers will point them out and help correct them.
The Principles of Convolutional Networks
1. Motivation
The convolutional neural network (CNN) is a variant of the multi-layer perceptron (MLP) whose design was inspired by biology. From Hubel and Wiesel's early research on the cat's visual cortex, we know that the cortex contains a complex arrangement of cells. Each cell is sensitive only to a small sub-region of the visual input, called its receptive field, and together these receptive fields tile the entire visual field. Such cells act as local filters over the input image, which makes them well suited to exploiting the strong spatially local correlations present in natural images.
In addition, two basic cell types have been identified in the visual cortex: simple cells (S cells) and complex cells (C cells). Simple cells respond maximally to specific edge-like stimulus patterns within their receptive fields, while complex cells have larger receptive fields and are locally invariant to the exact position of the stimulus.
The visual cortex, as the most powerful visual processing system known, has attracted wide attention, and academia has proposed a number of neurally inspired models. Examples include the NeoCognitron [Fukushima], HMAX [Serre07], and LeNet-5 [LeCun98], the key model discussed in this tutorial.
2. Sparse connectivity
CNNs exploit the spatially local correlation in natural images by enforcing a local connectivity pattern between neurons of adjacent layers: each hidden unit in layer m is connected only to a local subset of the units in layer m-1, namely those lying within its spatially contiguous receptive field (that is, a group of adjacent units in layer m-1). The following diagram illustrates this connectivity.
[Figure: sparse, local connectivity between layers m-1, m, and m+1]
Suppose layer m-1 is the retina, the input layer that receives the natural image. As depicted above, units in layer m have receptive fields of width 3 over the retina, so each unit in layer m connects to 3 adjacent units in the retinal layer below. Units in layer m+1 have the same connectivity with respect to the layer below them: each still connects to 3 adjacent units of layer m, but their receptive field with respect to the input (retinal) layer grows larger, becoming 5 pixels wide in this figure. Since each unit responds only to stimuli inside its receptive field, this architecture ensures that the learned filters produce the strongest response to spatially local input patterns. Stacking many such layers yields filters that become increasingly global (responsive to a larger region of pixel space) and, combined with nonlinearities, no longer simply linear; for example, a unit in layer m+1 in the figure can encode a nonlinear feature spanning a width of 5 in pixel space.
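The growth of the receptive field from 3 to 5 when two locally connected layers are stacked can be verified with a minimal 1D sketch. This is an illustrative Python/NumPy toy (not the article's MATLAB code); the weight vectors w1 and w2 are hypothetical, chosen so each output simply copies one input, making it easy to trace which retina pixels each layer-m+1 unit can see.

```python
import numpy as np

def local_layer(x, w):
    """A locally connected 1D layer: each output unit sees a window
    of len(w) adjacent inputs (valid mode, no padding)."""
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)])

# Layer m-1 (retina): 7 input pixels, labeled by their index.
x = np.arange(7, dtype=float)

# Hypothetical width-3 weights: layer m copies its leftmost input,
# layer m+1 copies its rightmost input.
w1 = np.array([1.0, 0.0, 0.0])
w2 = np.array([0.0, 0.0, 1.0])

h_m  = local_layer(x, w1)    # layer m:   5 units, each sees 3 retina pixels
h_m1 = local_layer(h_m, w2)  # layer m+1: 3 units, each sees 3 layer-m units

# Unit 0 of layer m+1 depends on h_m[0..2], which depend on x[0..4]:
# its receptive field over the retina is 5 pixels wide, as in the figure.
print(h_m1)  # → [2. 3. 4.]
```

Tracing any layer-m+1 unit back through the two width-3 windows confirms it covers 5 consecutive retina pixels, even though every individual connection remains local.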
3. Weight sharing
In CNNs, each filter h_i is additionally replicated across the entire visual field. The replicated units form a feature map, and they share the same parameters, i.e., the same weight matrix and the same bias vector.
[Figure: three hidden units of the same feature map, with shared weights shown in matching colors]
In the figure above, the three hidden units belonging to the same feature map are constrained to share weights: connections of the same color use the same weight. Gradient descent can still be used to learn these shared parameters, with only a small modification to the original algorithm: the gradient of a shared weight is simply the sum of the gradients of all the parameters being shared.
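The shared-gradient rule above can be checked numerically with a toy 1D convolution. This is an illustrative Python/NumPy sketch (not part of the article's MATLAB project); the input x, kernel w, and the all-ones upstream gradient are arbitrary example values. Because the same kernel weight w[k] multiplies a different input at every output position, its total gradient is the sum over all those positions.

```python
import numpy as np

def conv1d(x, w):
    """Valid-mode 1D convolution with a single shared kernel w:
    the same weights are reused at every output position."""
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # example input
w = np.array([0.5, -0.2, 0.1])             # shared kernel (one feature map)

y = conv1d(x, w)             # feature map: w applied at 3 positions
dL_dy = np.ones_like(y)      # pretend the upstream gradient is all ones

# Shared-weight gradient: dL/dw[k] = sum_i dL/dy[i] * x[i + k],
# i.e. the per-position gradients are summed over the whole map.
grad_w = np.array([np.dot(dL_dy, x[k:k + len(y)]) for k in range(len(w))])
print(grad_w)  # → [ 6.  9. 12.]
```

Note that grad_w[0] = x[0] + x[1] + x[2] = 6: each kernel weight accumulates one contribution per output position, which is exactly the "sum the gradients of the shared parameters" rule stated above.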