Handwritten digit recognition technology based on deep learning
Because Python's pickle module can serialize an object to a file, loading the data on the second and later runs is very fast.
If you are curious about Python's pickle module, ask your favorite search engine.
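As a minimal sketch of that caching idea (the object and the file name cache.pkl are made up here for illustration):

import pickle

data = {'W1': [1, 2, 3]}  # any picklable object

# First run: serialize the object to a file
with open('cache.pkl', 'wb') as f:
    pickle.dump(data, f)

# Second and later runs: deserialize straight from the file, which is fast
with open('cache.pkl', 'rb') as f:
    data = pickle.load(f)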
-------- Show an image to verify the data set --------
Now let's write a program that displays one image, to confirm that the data set downloaded successfully and is valid.
import sys, os
sys.path.append(os.pardir)  # make the parent directory importable
import numpy as np
from dataset.mnist import load_mnist
from PIL import Image

def img_show(img):
    # Convert the img array into an object that PIL's Image can display
    pil_img = Image.fromarray(np.uint8(img))
    # Display the image object
    pil_img.show()

# Get the data set
# flatten=True asks for the data as 1D arrays; normalize=False means the pixel
# values are not scaled into the 0-1 range
(x_train, t_train), (x_test, t_test) = load_mnist(flatten=True,
                                                  normalize=False)
img = x_train[0]
label = t_train[0]
print(label)      # 5
print(img.shape)  # (784,)

# Since the data was flattened to 1D, reshape it back to the 28*28 image size
img = img.reshape(28, 28)  # restore the original image shape
print(img.shape)  # (28, 28)
img_show(img)
After the program runs successfully, a window pops up showing the first training image: a handwritten 5.
That completes acquiring and verifying the data. Next, we perform inference with the neural network.
-------- Neural network inference --------
First, the structure of this neural network:
Input layer: 28*28 = 784 neurons
Output layer: the digits 0-9, 10 classes in total, so 10 neurons
In addition, the network has 2 hidden layers: the first hidden layer has 50 neurons and the second has 100. These values (50 and 100) can be set to anything.
The difference between this inference process and the one in "Deep Learning Theory Fundamentals 10 - Completing a 3-Layer Neural Network" is that the earlier weights were made up at random, while the weights in this section were trained (we do not train them ourselves; this section just uses them directly). Since these weights are valid, we can finally see a real result. Here, "seeing the result" means measuring the accuracy: writing a program that compares the network's predictions with the true labels is the task of this section.
Here is what the program needs to do:
1. Prepare a function to get the data set
2. Prepare a function to get the weight information
3. Prepare a neural network function that performs inference using the data and the weights
4. Prepare a function that counts how many predictions match the true labels
5. Run the main function to see the result
Before doing anything else, import the necessary modules.
from yuan.dataset.mnist import load_mnist
import pickle
import numpy as np
For convenience, the yuan folder has been turned into a package (add an empty __init__.py file to that folder).
In addition, inference needs the activation functions we completed earlier:
# hidden layer activation function
def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

# output layer activation function
def softmax(a):
    exp_a = np.exp(a)
    sum_exp_a = np.sum(exp_a)
    y = exp_a / sum_exp_a
    return y
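One caveat: np.exp(a) in the softmax above can overflow for large activations. A common remedy (an addition here, not part of the original code) is to subtract the maximum activation before exponentiating, which does not change the result:

# numerically stable softmax variant (not in the original code)
def softmax_stable(a):
    c = np.max(a)           # largest activation
    exp_a = np.exp(a - c)   # shifting avoids overflow; the ratios are unchanged
    return exp_a / np.sum(exp_a)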
With that in place, let's start on the program.
The first step is to prepare a function that gets the data set.
# 1. Prepare a function to get the data set
def get_data():
    # Load the training set and the test set
    (x_train, t_train), (x_test, t_test) = \
        load_mnist(normalize=True, flatten=True, one_hot_label=False)
    # Only the test set is used later, so return just these two
    # (the model is already trained, so evaluating on the training set would prove nothing)
    return x_test, t_test
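As a quick sanity check of what get_data() returns (the shapes below assume the standard MNIST split, whose test set has 10,000 images):

x, t = get_data()
print(x.shape)  # (10000, 784) -- flattened, normalized test images
print(t.shape)  # (10000,)     -- the matching labels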
The second step is to prepare a function to obtain weight information.
# 2. Prepare a function to get the weight information
def get_weight():
    # Make sure the path to the sample_weight.pkl file is correct.
    # This file was dumped with the pickle module (here we just use it directly)
    sample_weight_path = './yuan/ch03/sample_weight.pkl'
    with open(sample_weight_path, 'rb') as f:
        # Restore the file contents as a Python object (a dict of arrays)
        network = pickle.load(f)
    W1, W2, W3 = network['W1'], network['W2'], network['W3']
    b1, b2, b3 = network['b1'], network['b2'], network['b3']
    # Return the weights and biases of each layer
    return W1, W2, W3, b1, b2, b3
The third step is to prepare a neural network function that performs inference using the data and the weights.
# 3. Prepare a neural network function that performs inference
# This process is also called forward propagation
def forward_propagation(x, network):
    W1, W2, W3, b1, b2, b3 = network
    a1 = np.dot(x, W1) + b1
    z1 = sigmoid_function(a1)
    a2 = np.dot(z1, W2) + b2
    z2 = sigmoid_function(a2)
    a3 = np.dot(z2, W3) + b3
    y = softmax(a3)
    return y
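To confirm that the layer sizes chain together as described above (784 -> 50 -> 100 -> 10), here is a small sketch that pushes one random, made-up input through the network:

network = get_weight()
x_dummy = np.random.rand(784)  # one fake flattened image, just for the shape check
y = forward_propagation(x_dummy, network)
print(y.shape)    # (10,) -- one probability per digit class
print(np.sum(y))  # ~1.0, since softmax normalizes the outputs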
The fourth step is to prepare a function that counts how many predictions match the true labels.
# 4. Prepare a function that counts how many predictions match the true labels
def get_accuracy():
    network = get_weight()
    x, t = get_data()
    # This variable records the number of correct predictions
    accuracy_cnt = 0
    for i in range(len(x)):
        # predict the images one by one
        y = forward_propagation(x[i], network)
        p = np.argmax(y)  # index of the element with the highest probability
        # Compare the prediction with the true label
        if p == t[i]:
            accuracy_cnt += 1
    # Return the proportion of correct predictions
    return str(float(accuracy_cnt) / len(x))
The fifth step: run the main function and view the result.
# 5. Run the main function to see the result
def main():
    accuracy = get_accuracy()
    print(accuracy)  # outputs 0.9352

if __name__ == '__main__':
    main()
The final 0.9352 is the accuracy of this model: it correctly recognizes about 935 of every 1000 digit images.
Not bad; this result has even surpassed kindergarten children.
A few years ago, when neural networks were just emerging, the image recognition demos I saw almost scared me; they seemed that powerful.
I never expected that one day I would write such a program myself (in practice, by directly using other people's data and trained model; boasting has never made me blush).
To squeeze the last drop of performance out of numpy, let's process the images in batches.
(numpy's vectorized operations are very well suited to batch processing and are much faster than looping over images one by one.)
-------- Batch processing --------
A warning first: if your matrix multiplication is rusty, go back and review it, because batch processing is built on matrix multiplication.
For a single image, the input can be viewed as a matrix of 1 row and 784 columns, and the output as a matrix of 1 row and 10 columns. For batch processing, all we have to do is turn the input into many rows of 784 columns; the output then becomes the same number of rows of 10 columns.
So for the fourth step, here is an alternative version:
# 4_2. Batch version of the accuracy computation
def batch_get_accuracy():
    network = get_weight()
    x, t = get_data()
    batch_size = 100  # number of images per batch
    # This variable records the number of correct predictions
    accuracy_cnt = 0
    for i in range(0, len(x), batch_size):
        x_batch = x[i:i + batch_size]
        y_batch = forward_propagation(x_batch, network)
        p = np.argmax(y_batch, axis=1)
        accuracy_cnt += np.sum(p == t[i:i + batch_size])
    # Return the proportion of correct predictions
    return str(float(accuracy_cnt) / len(x))
The number of loop iterations drops by a factor of 100 (since the step size is now 100).
forward_propagation(x_batch, network), as described above, now returns an array of shape 100*10.
np.argmax(y_batch, axis=1) takes the index of the maximum value in each row (the position of the highest probability, i.e. the predicted digit).
t[i:i + batch_size] is the corresponding slice of correct answers.
So p == t[i:i + batch_size] compares a batch of predicted values against a batch of true values elementwise; summing the resulting bools gives the number of correct predictions.
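To make the batched argmax and the bool-sum trick concrete, here is a tiny made-up example with a batch of three "predictions":

y_batch = np.array([[0.1, 0.8, 0.1],     # predicts class 1
                    [0.3, 0.3, 0.4],     # predicts class 2
                    [0.9, 0.05, 0.05]])  # predicts class 0
t_batch = np.array([1, 0, 0])            # the true labels

p = np.argmax(y_batch, axis=1)  # -> array([1, 2, 0])
print(np.sum(p == t_batch))     # 2: each True counts as 1 in the sum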