MNIST Handwritten Number Recognition Using Keras - With Live Predictor - With Source Code - 2025

When starting with Machine Learning, MNIST Handwritten number recognition comes as the first project in everyone’s mind because of its simplicity, abundant data, and magical results. It can also be thought of as the ‘Hello World of ML world.

So, In today’s blog, we will see how to implement it. You might be thinking, everyone has made a tutorial on it, so what’s special in this one? The special thing in my project is that I have also made a live interactive predictor at the end, where you will draw the number and our trained model will predict it. So without any further due, Let’s do it…

Table of Contents

Step 1 – Import all the required libraries.

import cv2
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D,Flatten,Dropout,Dense,MaxPooling2D
from tensorflow.keras.optimizers import SGD
from keras.utils import np_utils 

%matplotlib inline

Step 2 – Import training and testing data.

(train_X,train_y),(test_X,test_y) = mnist.load_data()
train_X.shape

shape of train_X

Step 3 – Here we are just randomly visualizing our 28X28 images.

plt.imshow(train_X[1],cmap='gray')

train_y

Let’s see how train_y looks. train_y contains exact numbers written in the image. Like the image on index 1 is shown above and the number on the 1st index in train_y is 0. But we can.t give data in this format to our data, we need to one-hot encode it.

Step 4 – Preprocess our training and testing data.

train_X = train_X.reshape(-1,28,28,1)
test_X  = test_X.reshape(-1,28,28,1)

train_X = train_X.astype('float32')
test_X  = test_X.astype('float32')

train_X = train_X/255
test_X  = test_X/255

train_y = np_utils.to_categorical(train_y)
test_y  = np_utils.to_categorical(test_y)

Line 1-2 – Our input shape was (60000,28,28), but Keras wants the data in (60000,28,28,1) format, that’s why we are reshaping it.
Line 4-5 – We are converting the integer labels to floats, like the element at 1st index was 1 which is an int , now it will be 1.0 which is a float.
Line 7-8 – We are normalizing our image, as we know our grayscale image is just a 2D image with all values ranging from 0-255, that’s why we are diving this image/2D matrix by 255 to bring everything in range 0-1.
Line 10-11 – One Hot encode the labels. Previously at 1st index, we had 1 but now a 10-element array will be there which is [1,0,0,0,0,0,0,0,0,0] where 1 depicts the presence of that number. 1 is present at the 0 index means this array represents the number 0. This whole thing is shown below.

train_y[1]

Step 5 – Let’s create our CNN model for MNIST Handwritten number recognition.

model = Sequential()

model.add(Conv2D(32, kernel_size=(3,3), activation='relu',input_shape=(28,28,1), padding='SAME'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(64,(3,3),activation='relu',padding='SAME'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(10,activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer=SGD(0.01), metrics=['accuracy'])

print(model.summary())

Line 1 – Let’s create a new Sequential model and call it model.
Line 3 – Let’s add a 2D convolution layer with 32 filters, kernel size of 3X3, activation function set to ReLU, input shape of 28X28X1 (shape of our mnist images), and padding set to SAME.
Line 4 – Let’s add a MaxPooling layer with pool size of 2X2.
Line 6-7 – Let’s add one more set of Conv and Max pool.
Line 8 – We are adding a Dropout layer, which will help us in preventing Overfitting.
Line 10 – Let’s flatten these 2D results.
Line 11-12 – Add a Dense layer(a simple layer with n neurons in it), with 128 neurons in it and activation set to ReLU. Add a Dropout layer.
Line 14 – Last layer with 10 neurons in it, with softmax activation. Softmax works best with multilabel classification. It will give the highest probability to that neuron which will be at the index of the image provided. Like if the image provided is of 5, the output could be something like [0.001, 0.0003, 0.0, 0.0, 0.99, 0.008, 0.00065, 0.0, 0.0, 0.0]

Step 6 -Let’s train our MNIST Handwritten number recognition model.

batch_size=32
epochs=10

plotting_data = model.fit(train_X,
                          train_y,
                          batch_size=batch_size,
                          epochs=epochs,
                          verbose=1,
                          validation_data=(test_X,test_y))

loss,accuracy = model.evaluate(test_X,test_y,verbose=0)

print('Test loss ---> ',str(round(loss*100,2)) +str('%'))
print('Test accuracy ---> ',str(round(accuracy*100,2)) +str('%'))

Simply just train the model using model.fit().
Use model.evaluate() to evaluate loss and accuracy and finally print it.

Step 7 – Plot the results for the MNIST Handwritten number recognition model.

plotting_data_dict = plotting_data.history

test_loss = plotting_data_dict['val_loss']
training_loss = plotting_data_dict['loss']
test_accuracy = plotting_data_dict['val_accuracy']
training_accuracy = plotting_data_dict['accuracy']

epochs = range(1,len(test_loss)+1)

plt.plot(epochs,test_loss,marker='X',label='test_loss')
plt.plot(epochs,training_loss,marker='X',label='training_loss')
plt.legend()

In plotting_data we stored some data while using model.fit().
So we can use the history attribute of that plotting_data to get a dictionary object which had loss and accuracy values stored in it while training.

plt.plot(epochs,test_accuracy,marker='X',label='test_accuracy')
plt.plot(epochs,training_accuracy,marker='X',label='training_accuracy')
plt.legend()

Step 8 – Saving our model for future use.

model.save('MNIST_10_epochs')
print('Model Saved !!!')

NOTE – Use the statement below to load the saved model anywhere and use it.

classifier = load_model('MNIST_10_epochs.h5')

Step 9 – Live predictions for MNIST Handwritten number recognition.

drawing=False
cv2.namedWindow('win')
black_image = np.zeros((256,256,3),np.uint8)
ix,iy=-1,-1

def draw_circles(event,x,y,flags,param):
    global ix,iy,drawing
    if event== cv2.EVENT_LBUTTONDOWN:
        drawing=True
        ix,iy=x,y
        
    elif event==cv2.EVENT_MOUSEMOVE:
        if drawing==True:
            cv2.circle(black_image,(x,y),5,(255,255,255),-1)
            
    elif event==cv2.EVENT_LBUTTONUP:
        drawing = False
        
cv2.setMouseCallback('win',draw_circles)

while True:
    cv2.imshow('win',black_image)
    if cv2.waitKey(1)==27:
        break
    elif cv2.waitKey(1)==13:
        input_img = cv2.resize(black_image,(28,28))
        input_img = cv2.cvtColor(input_img,cv2.COLOR_BGR2GRAY)
        input_img = input_img.reshape(1,28,28,1)
        res = classifier.predict_classes(input_img,1,verbose=0)[0]
        cv2.putText(black_image,text=str(res),org=(205,30),fontFace=cv2.FONT_HERSHEY_SIMPLEX,fontScale=1,color=(255,255,255),thickness=2)
    elif cv2.waitKey(1)==ord('c'):
        black_image = np.zeros((256,256,3),np.uint8)
        ix,iy=-1,-1
cv2.destroyAllWindows()

Line 1-19 – Refer to my HOW TO DRAW BASIC SHAPES ON IMAGES IN PYTHON USING OPENCV blog to understand how we are drawing number.

Line 23-24 – If we hit the ESC key, break the code.
Line 25-30 – after drawing if we hit ENTER key, perform preprocessings and use our pre-trained model to predict the number.
Line 31-33 – If someone hits ‘C’ key, clear the screen.