Problems often encountered in deep learning
1. How to reduce overfitting [1]
1. What is overfitting
Overfitting means that, in pursuit of good performance on the training set, the model becomes too complicated.
2. The harm of overfitting
The model performs well on the training set but poorly on the test set; that is, its generalization ability is weak.
3. How to deal with overfitting
(1) Adjust the existing data by adding noise to it, as sketched below.
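A minimal sketch of input-noise augmentation in PyTorch (the noise scale of 0.1 and the batch shape are illustrative assumptions, not from the original):

```python
import torch

def add_gaussian_noise(x, std=0.1):
    """Return a noisy copy of the input batch (illustrative augmentation)."""
    return x + torch.randn_like(x) * std

# Perturb each training batch before the forward pass.
batch = torch.rand(32, 1, 28, 28)        # e.g. a batch of 28x28 grayscale images
noisy_batch = add_gaussian_noise(batch)
```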
(2) Use dropout
Dropout closes or ignores a fraction of the nodes (for example, 50%) at each training step. Each step can therefore be viewed as training a different model, and different models produce different results. As training proceeds, the results fluctuate within a range, but the mean does not vary much. (Note: different nodes in the same layer can be viewed as independent and identically distributed.)
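As a sketch, the 50% dropout described above might look like this in PyTorch (the layer sizes are arbitrary assumptions):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # each training step randomly zeroes ~50% of activations
    nn.Linear(256, 10),
)

model.train()                       # dropout active: a different sub-model each step
out = model(torch.rand(32, 784))
model.eval()                        # dropout disabled at evaluation time
```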
(3) Regularization
Regularization is also called weight decay. The core of neural network training is the backpropagation of error, and the error affects the weight update of every neuron; the accumulated error persists throughout training. Regularization applies a certain amount of decay at each weight update, reducing the effect of the accumulated error on the weights and thus preventing overfitting.
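In PyTorch, weight decay is usually applied through the optimizer; a minimal sketch (the model, learning rate, and decay coefficient are illustrative assumptions):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(784, 10)   # any model works here

# weight_decay adds an L2 penalty: every update shrinks the weights slightly,
# which is the decay described above.
optimizer = optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```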
(4) Early stopping
At the end of each epoch, compute the model's accuracy; when the accuracy has merely fluctuated within a range for 10 consecutive epochs, stop training.
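A sketch of that patience-based loop (the placeholder functions, the epoch budget, and the improvement threshold are assumptions for illustration):

```python
def train_one_epoch():
    """Placeholder for one pass over the training set."""

def evaluate():
    """Placeholder for validation accuracy; replace with a real metric."""
    return 0.9

best_acc, patience, wait = 0.0, 10, 0
for epoch in range(100):
    train_one_epoch()
    acc = evaluate()
    if acc > best_acc + 1e-3:   # a meaningful improvement resets the counter
        best_acc, wait = acc, 0
    else:
        wait += 1               # accuracy is merely fluctuating within a range
    if wait >= patience:        # the plateau has lasted 10 epochs: stop training
        break
```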
(5) Use Batch Normalization
After the weighted sum is computed and before it enters the activation function, Batch Normalization is applied: within each batch, the data are normalized using the mean and variance of the inputs. The benefit is that both the training set and the test set then fluctuate within a fixed range, masking the error fluctuations contained in the data.
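The placement described above (after the weighted sum, before the activation) as a PyTorch sketch (the channel counts and input shape are illustrative):

```python
import torch
import torch.nn as nn

block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # the weighted sum
    nn.BatchNorm2d(16),   # normalize with the batch mean and variance
    nn.ReLU(),            # the activation comes after the normalization
)
out = block(torch.rand(8, 3, 32, 32))
```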
(6) Other methods
Cross-validation and PCA dimensionality reduction.
2. The role of the fully connected layer [1]
In a convolutional neural network, the convolution layers perform feature extraction, and the fully connected layer at the output end classifies the data. It can therefore be said that the fully connected layer acts as the "classifier" of the whole convolutional neural network.
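A minimal CNN sketch showing this division of labor (the layer sizes and the 28x28 input are illustrative assumptions):

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),   # convolution: feature extraction
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),                  # fully connected layer: the classifier
)
logits = net(torch.rand(4, 1, 28, 28))           # -> shape (4, 10), one score per class
```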
3. The role of the activation function [2]
The activation function introduces nonlinearity to handle problems that a linear model cannot. Without an activation function, the model is linear, or merely a composition of linear maps, which is still linear.
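A small check of that claim in PyTorch: two stacked linear layers with no activation collapse to a single linear map (the layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

x = torch.rand(5, 4)

# Two stacked linear layers with no activation in between...
f = nn.Sequential(nn.Linear(4, 8, bias=False), nn.Linear(8, 3, bias=False))

# ...are equivalent to one linear layer with the combined weight matrix.
W = f[1].weight @ f[0].weight
print(torch.allclose(f(x), x @ W.T, atol=1e-6))   # True
```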
4. The role of the pooling layer [1]
1. Further reduce the dimensionality of the features extracted by the convolution layer, thereby reducing computation.
2. Enhance the invariance of image features, increasing robustness to image translation and rotation (see the sketch below).
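A minimal pooling sketch (the feature-map shape is an illustrative assumption): max pooling keeps only the strongest response in each window, so the output both shrinks and changes little under small shifts of the input.

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)
fmap = torch.rand(1, 16, 28, 28)   # feature maps from a convolution layer
print(pool(fmap).shape)            # torch.Size([1, 16, 14, 14]): dimensions halved
```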