Deploying Deep Learning Models Part 1: Preparing the Model

In this tutorial we'll see how you can take your work and give it an audience by deploying your projects on the web


By Vihar Kurama

Whether they work locally or in the cloud, many machine learning engineers don't have experience actually deploying their models so that they can be used on a global scale. In this tutorial we'll see how you can take your work and give it an audience by deploying your projects on the web. We'll start by creating a simple model which recognizes handwritten digits. Then we'll see step-by-step how to create an interface for deploying it on the web using Flask, a micro web framework written in Python.

Quickly Building a Model: CNN with MNIST

Before we dive into deploying models to production, let's begin by creating a simple model which we can save and deploy. If you've already built your own model, feel free to skip below to Saving Trained Models with h5py or Creating a Flask App for Serving the Model. For our purposes we'll start with a simple use case: creating a deep learning model on the MNIST dataset to recognize handwritten digits. This will give us a glimpse of how to define network architectures from scratch, then train, evaluate, and save them for deployment.

A convolutional neural network (CNN) is the standard choice for handwriting recognition, as it is for most image recognition tasks. The image is first passed through a stack of convolutional layers, where its features are extracted and identified. When the network later encounters an image whose features resemble ones it learned during training, it classifies that image with the corresponding output label.

Let's now implement the algorithm using the Keras deep learning framework in 8 simple steps.

Step 1: Importing Necessary Modules and Layers

We always begin by importing all the modules and functions we'll use. This neural network is implemented in Keras (this comes pre-installed on Paperspace, but if you're running this locally you can always install Keras from your command line with pip install keras). Next, we import the model and layers which we will use for building the neural network architecture, which in this case is a CNN.

# imports

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

Step 2: Defining Hyperparameters

Choosing the hyperparameters for your network can be a challenging task. Without going into too much theory or testing many different values, here we use standard values for the batch size (the number of training samples to work through before the model weights are updated) and the number of epochs (complete passes through the training data). There are 10 classes since we're considering the digits 0-9.

# Hyperparameters

num_classes = 10
batch_size = 128
epochs = 12

Step 3: Loading the Images

The next step is to load our dataset and set constant image sizes for the training process. The image sizes are fixed to (28 x 28), since the network's input dimensions must stay constant (you can't train a network on inputs of varying sizes). We simply load our MNIST dataset with the load_data method from the mnist module which was imported in Step 1.

# Image Resolution

img_rows, img_cols = 28, 28

# Loading the data.

(x_train, y_train), (x_test, y_test) = mnist.load_data()

Step 4: Data Pre-Processing

In this step we need to make sure the training data is preprocessed into a uniform shape and scale; if your inputs vary in size or range, the performance of your network will suffer. We reshape every image in the dataset to add an explicit channel dimension, then scale the pixel intensities to the range [0, 1]. Finally, we one-hot encode the labels with the to_categorical method so that each image's label matches the network's 10-way output.

# Preparing the data

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)


x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

Step 5: Defining the Architecture

With the Keras framework we can easily declare a model by sequentially adding the layers. We use the add() method for this.

# Creating the Model 

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
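
Optionally, you can call model.summary() to print the layer stack and parameter counts and verify the architecture before training:

# Optional: inspect the architecture and parameter counts
model.summary()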

Step 6: The Training Loop

Next we fit the model with the declared hyperparameters and initiate the training process. This is done by calling the model.fit() method and passing in the training data, along with the test set as validation data.

# Training the Model

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

Step 7: Evaluating the Model

Once training finishes, we evaluate the model on the held-out test set to see how well it generalizes.

# Evaluating the Predictions on the Model

score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Step 8: Saving the Model

Finally, we save the model architecture as JSON and the learned weights in HDF5 format, so we can reload both later for inference.

# Saving the model for Future Inferences

model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("model.h5")

Upon running this program and successful training, you will find two files in the same directory:

  1. model.json
  2. model.h5

The model.h5 file is a binary file which holds the weights. The file model.json is the architecture of the model that you just built.
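
If you're curious, you can peek inside model.json; it's plain JSON, although the exact keys depend on your Keras version:

# Optional sanity check: inspect the saved architecture
import json

with open('model.json') as f:
    arch = json.load(f)
print(arch.keys())  # typically class_name, config, keras_version, backend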

Saving Trained Models With h5py

The HDF5 library lets users store huge amounts of numerical data, and easily manipulate that data with NumPy. For example, you can slice into multi-terabyte data sets stored on disk as if they were real NumPy arrays. Thousands of data sets can be stored in a single file, categorized and tagged however you want.

The save_weights method is added above in order to save the weights learned by the network using h5py. The h5py package is a Pythonic interface to the HDF5 binary data format.
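
To make this concrete, here is a minimal h5py sketch (independent of our model) that stores an array on disk and reads back only a slice of it:

# A minimal h5py example: store an array, then read back just a slice
import h5py
import numpy as np

with h5py.File('example.h5', 'w') as f:
    f.create_dataset('samples', data=np.random.rand(1000, 28, 28))

with h5py.File('example.h5', 'r') as f:
    batch = f['samples'][:128]  # only this slice is read from disk
    print(batch.shape)          # (128, 28, 28)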

Now that we have saved our model in HDF5 format we can load the weights whenever we want and apply them to future tasks. To load the weights we'll also need the corresponding model architecture, which we'll read back from the model.json file we saved earlier. Once the model is prepared with the trained weights, we're ready to use it for inference.

# imports

from keras.models import model_from_json

# open the saved architecture file and read it into a variable

json_file = open('model.json','r')
loaded_model_json = json_file.read()
json_file.close()

# use Keras model_from_json to make a loaded model

loaded_model = model_from_json(loaded_model_json)

# load weights into new model

loaded_model.load_weights("model.h5")
print("Loaded Model from disk")

# compile and evaluate loaded model

loaded_model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])

Now that we have the model saved along with the weights learned from training, we can use them to do inference on new data. This is how we make our trained models reusable.
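
For example, assuming the preprocessed x_test array from Step 4 is still in scope, a quick sanity check looks like this:

# Quick sanity check with the reloaded model (assumes x_test from Step 4)
import numpy as np

pred = loaded_model.predict(x_test[:1])
print("Predicted digit:", np.argmax(pred, axis=1)[0])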

Creating a Flask App for Serving the Model

To serve the saved model we'll use Flask, a micro web framework written in Python (it's referred to as a "micro" framework because it doesn't require particular tools or libraries).

To create our web app that recognizes handwritten digits, we need two routes in our Flask app:

  1. An index page route, where users draw the digit
  2. A predict route, which runs inference with our saved model

These are defined below.

from flask import Flask, render_template, request

app = Flask(__name__)

@app.route('/')
def index_view():
    return render_template('index.html')


@app.route('/predict/', methods=['GET', 'POST'])
def predict():
    response = "For ML Prediction"
    return response

if __name__ == '__main__':
    app.run(debug=True, port=8000)

Now let's go ahead and implement our complete app.py. The predict function should take an image drawn by the user and send it to the model. In our case, the image is a NumPy array containing the pixel intensities.

from flask import Flask, render_template, request
# Note: scipy.misc.imread/imresize were removed in SciPy 1.2+; this code
# assumes an older SciPy (or swap in imageio/Pillow equivalents)
from scipy.misc import imread, imresize
import numpy as np
import keras.models
import re
import sys
import os
import base64
sys.path.append(os.path.abspath("./model"))
from load import *


# load the trained model and its TensorFlow graph once, at startup
model, graph = init()

app = Flask(__name__)


@app.route('/')
def index_view():
    return render_template('index.html')

def convertImage(imgData1):
    # strip the data-URL prefix and decode the base64-encoded PNG to disk
    imgstr = re.search(b'base64,(.*)', imgData1).group(1)
    with open('output.png', 'wb') as output:
        output.write(base64.b64decode(imgstr))

@app.route('/predict/', methods=['GET', 'POST'])
def predict():
    # the raw request body is the base64-encoded canvas image
    imgData = request.get_data()
    convertImage(imgData)
    # read the drawing as grayscale and invert it, since MNIST digits are
    # white strokes on a black background
    x = imread('output.png', mode='L')
    x = np.invert(x)
    # resize and reshape to the network's input shape
    x = imresize(x, (28, 28))
    x = x.reshape(1, 28, 28, 1)
    # scale to [0, 1] to match the training preprocessing in Step 4
    x = x.astype('float32') / 255

    with graph.as_default():
        out = model.predict(x)
        print(out)
        print(np.argmax(out, axis=1))
        response = np.array_str(np.argmax(out, axis=1))
        return response

if __name__ == '__main__':
    app.run(debug=True, port=8000)
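
With the server running (python app.py), you can smoke-test the predict route from Python. Here's a hypothetical example; digit.png is a placeholder for any saved drawing:

# Hypothetical smoke test for the /predict/ route
import base64
import requests

with open('digit.png', 'rb') as f:
    payload = b'data:image/png;base64,' + base64.b64encode(f.read())

response = requests.post('http://localhost:8000/predict/', data=payload)
print(response.text)  # e.g. "[7]"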

Here we have the loader function, load.py:

from keras.models import model_from_json
import tensorflow as tf


def init():
    # rebuild the architecture from the saved JSON...
    json_file = open('model.json', 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    loaded_model = model_from_json(loaded_model_json)
    # ...and load the trained weights into it
    loaded_model.load_weights("model.h5")
    print("Loaded Model from disk")

    # compile the loaded model so it is ready for inference
    loaded_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # grab the default TensorFlow graph (TF 1.x) so the Flask thread can
    # run predictions against it later
    graph = tf.get_default_graph()

    return loaded_model, graph

Before we dive into the last step of deploying to the cloud, let's create an interface that lets users draw digits in the browser. Below is the JavaScript snippet that renders a canvas on the HTML page for drawing.

(function()
{
	var canvas = document.querySelector( "#canvas" );
	var context = canvas.getContext( "2d" );
	canvas.width = 280;
	canvas.height = 280;
	var Mouse = { x: 0, y: 0 };
	var lastMouse = { x: 0, y: 0 };
	context.fillStyle="white";
	context.fillRect(0,0,canvas.width,canvas.height);
	context.color = "black";
	context.lineWidth = 6;
	context.lineJoin = context.lineCap = 'round';
	debug();
	canvas.addEventListener( "mousemove", function( e )
	{
		lastMouse.x = Mouse.x;
		lastMouse.y = Mouse.y;

		Mouse.x = e.pageX - this.offsetLeft;
		Mouse.y = e.pageY - this.offsetTop;

	}, false );

	canvas.addEventListener( "mousedown", function( e )
	{
		canvas.addEventListener( "mousemove", onPaint, false );

	}, false );

	canvas.addEventListener( "mouseup", function()
	{
		canvas.removeEventListener( "mousemove", onPaint, false );

	}, false );

	var onPaint = function()
	{	
		context.lineJoin = "round";
		context.lineCap = "round";
		context.strokeStyle = context.color;

		context.beginPath();
		context.moveTo( lastMouse.x, lastMouse.y );
		context.lineTo( Mouse.x, Mouse.y );
		context.closePath();
		context.stroke();
	};

	function debug()
	{
		/* CLEAR BUTTON */
		var clearButton = $( "#clearButton" );
		clearButton.on( "click", function()
		{
			context.clearRect( 0, 0, 280, 280 );
			context.fillStyle="white";
			context.fillRect(0,0,canvas.width,canvas.height);
			
		});
		$( "#colors" ).change(function()
		{
			var color = $( "#colors" ).val();
			context.color = color;
		});		
		$( "#lineWidth" ).change(function()
		{
			context.lineWidth = $( this ).val();
		});
	}
}());

Once you've wired this snippet into your HTML, your directory structure by the end of this tutorial should look like this:

ml-in-prod/
├── app.py
├── Procfile
├── requirements.txt
├── runtime.txt
├── model/
│   ├── model.json
│   ├── model.h5
│   └── load.py
├── templates/
│   ├── index.html
│   └── draw.html
└── static/
    ├── index.js
    └── style.css

There you go! Your application is up and running. In the next tutorial we'll see how to deploy it on Paperspace cloud GPUs to make the app more powerful, reliable, and accessible.
