Convolutional Neural Networks (CNNs) and Feature Extraction
Convolutional Neural Networks allow us to extract a wide range of features from images. It turns out we can use this idea of feature extraction for face recognition too! That's what we are going to explore in this tutorial, using deep conv nets for face recognition. Note: this is face recognition (i.e. actually telling whose face it is), not just face detection (i.e. locating faces in a picture).
If you don’t know what deep learning is (or what neural networks are) please read my post Deep Learning For Beginners. If you want to try out a basic tutorial on image classification using convolutional neural networks, you can try this tutorial. Please remember that this tutorial assumes that you have basic programming experience (preferably with Python) and that you understand the basic idea of deep learning and neural networks.
The approach we are going to use for face recognition is fairly straightforward. The key is to get a deep neural network to produce a bunch of numbers that describe a face (known as face encodings). When you pass in two different images of the same person, the network should return similar outputs (i.e. numbers that are close together) for both images, whereas when you pass in images of two different people, it should return very different outputs. In other words, the network needs to be trained to automatically pick out the distinguishing features of a face and turn them into numbers, so that its output acts as an identifier for a particular person's face.
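To make this concrete, here is a toy illustration with made-up 3-number encodings (the real network we will use produces 128 numbers per face). Encodings of the same person end up close together; encodings of different people end up far apart:
import numpy as np
# Hypothetical encodings - the real ones will come from the neural network later in this tutorial
taus_photo_1 = np.array([0.2, 0.5, 0.1])
taus_photo_2 = np.array([0.22, 0.48, 0.09])  # same person, different photo
john_photo = np.array([0.9, 0.1, 0.6])       # a different person
print(np.linalg.norm(taus_photo_1 - taus_photo_2))  # small distance (~0.03)
print(np.linalg.norm(taus_photo_1 - john_photo))    # large distance (~0.95)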
Thankfully, we don't have to go through the hassle of building or training our own neural network: dlib gives us access to a pre-trained model that does exactly what we need. It outputs a bunch of numbers (face encodings) when we pass in the image of someone's face, and comparing the encodings of faces from different images tells us whether someone's face matches anyone we have images of. Here are the steps we will be taking:
- Detect/identify faces in an image (using a face detection model); for simplicity, this tutorial will only use images with exactly one face/person in them
- Predict face poses/landmarks (for the faces identified in step 1)
- Using the data from step 2 and the actual image, calculate face encodings (numbers that describe the face)
- Compare the face encodings of known faces with those from test images to tell who is in the picture
Hopefully you get the basic idea of how this will work (of course the description above is a very simplified one). Now it’s time to start building!
Preparing Images
Firstly, create a project folder (just a folder in which we will keep our code and images). Mine is called face_recognition but you can call it whatever you like. Inside that folder, create another folder called images. This is the folder that will hold images of the different people you want to run face recognition on. Download some pictures of your friends (one picture per person), rename each picture to the friend's name (e.g. taus.jpg or john.jpg) and store all of them in the images folder you just created. One important thing to remember: make sure each of those images contains only ONE face (i.e. no group pictures) and that they are all in JPEG format with filenames ending in .jpg.
Next, create another folder inside your project folder (the face_recognition folder for me) and name it test. This folder will contain different images of the same people whose pictures you stored in the images folder. Again, make sure that each picture only has one person in it. In the test folder, you can name the image files whatever you like, and you can have multiple pictures of each person (because we will run face recognition on all pictures in the test folder).
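Once both folders are in place, your project should look something like this (the filenames here are just examples; we will create recognize.py in a later section):
face_recognition/
    images/
        taus.jpg
        john.jpg
    test/
        1.jpg
        2.jpg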
Installing Dependencies
The most important dependencies for this project are Python 2.7 and pip. You can install both (if you don't have them already) using Anaconda 2 (which is just a Python distribution that comes pre-packaged with pip) by following this link. Note: please make sure that Anaconda 2 is added to your PATH and that it's registered as your system Python 2.7 (there should be a prompt about this during the installation process; just press Yes or check the checkbox).
If you are done setting up Anaconda 2 or if you had Python 2.7 and pip installed on your machine beforehand, you can go ahead and install dlib (the machine learning library we will be using) and other dependencies. To do so, type in the following command in Terminal (Mac OS or Linux) or Command Prompt (Windows):
pip install --user numpy scipy dlib
If you are a Mac or Linux user and if you run into issues with the command stated above, please try this command instead:
sudo pip install numpy scipy dlib
If the process stated above does not work for you, you may have to manually download, compile and install dlib with its Python API. To do so, you have to do some reading on http://dlib.net/. Unfortunately, it is beyond the scope of this blog post and hence I won’t be covering that here.
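One alternative worth mentioning (I haven't covered it here, so treat it as a pointer rather than a tested recipe): dlib is also packaged on Anaconda's conda-forge channel, so conda install -c conda-forge dlib may save you a manual build if you are using Anaconda anyway.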
One last thing you need to do is download the pre-trained models for face recognition. There are two models you need. One model predicts the shape/pose of a face (basically it gives you the positions of 68 landmark points on the face in the image). The other model takes a face and gives you its face encodings (basically the numbers that describe the face of that particular person). Here are instructions on how to download, extract and prepare them for our purpose:
- Download dlib_face_recognition_resnet_model_v1.dat.bz2 from this link and shape_predictor_68_face_landmarks.dat.bz2 from this link
- Once you have both of those files downloaded, you need to extract them (they are compressed in bz2 format). On Windows, you can use Easy 7-Zip to do so. On Mac or Linux, you should be able to double-click on the files to extract them. If that doesn't work, just type this into your Terminal for each of the two files: bzip2 {PATH_TO_FILE} --decompress (replace {PATH_TO_FILE} with the actual path to the file you are trying to extract; for me the commands would be bzip2 ~/Downloads/dlib_face_recognition_resnet_model_v1.dat.bz2 --decompress and bzip2 ~/Downloads/shape_predictor_68_face_landmarks.dat.bz2 --decompress).
- Once you have extracted them, you should have two files named dlib_face_recognition_resnet_model_v1.dat and shape_predictor_68_face_landmarks.dat. Copy those two files into your project folder (for me, the face_recognition folder I created for this project).
Code!
Now that you have everything set up, open your project folder (called face_recognition for me) in a text editor (preferably Atom or Sublime Text). Create a new file in that folder called recognize.py. This is where we will add the code to match faces of your friends. Note that there are two main parts to this process: first, load the face encodings of the known faces in the images folder; once that's done, get face encodings from the faces/images stored in the test folder and match them with all of our known faces one by one. We will do this step by step. If you want to see it running, copy-paste all the code in this section into your file one after another (i.e. merge all the separate sections of code in the same order that they are listed below). Carefully read the comments in each code block to understand what it does.
Part 1: Initialize and Setup
Here we import the required libraries and set up the objects/parameters needed for our face recognition.
import dlib
import scipy.misc
import numpy as np
import os
# Get Face Detector from dlib
# This allows us to detect faces in images
face_detector = dlib.get_frontal_face_detector()
# Get Pose Predictor from dlib
# This allows us to detect landmark points in faces and understand the pose/angle of the face
shape_predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')
# Get the face recognition model
# This is what gives us the face encodings (numbers that identify the face of a particular person)
face_recognition_model = dlib.face_recognition_model_v1('dlib_face_recognition_resnet_model_v1.dat')
# This is the tolerance for face comparisons
# The lower the number - the stricter the comparison
# To avoid false matches, use a lower value
# To avoid false negatives (i.e. faces of the same person not matching), use a higher value
# 0.5-0.6 works well
TOLERANCE = 0.6
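(For reference, 0.6 is also the distance threshold dlib's own documentation uses when reporting this model's 99.38% accuracy on the standard LFW face recognition benchmark.)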
Part 2: Get face encodings from an image
Here we write the function that will take an image filename and give us the face encodings for that image.
# This function will take an image and return its face encodings using the neural network
def get_face_encodings(path_to_image):
    # Load image using scipy
    image = scipy.misc.imread(path_to_image)
    # Detect faces using the face detector
    detected_faces = face_detector(image, 1)
    # Get pose/landmarks of those faces
    # Will be used as an input to the function that computes face encodings
    # This allows the neural network to produce similar numbers for faces of the same person, regardless of camera angle and/or face positioning in the image
    shapes_faces = [shape_predictor(image, face) for face in detected_faces]
    # For every face detected, compute the face encodings
    return [np.array(face_recognition_model.compute_face_descriptor(image, face_pose, 1)) for face_pose in shapes_faces]
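As a quick sanity check once Part 2 is in place, you can call the function on one of your known images from a Python shell (taus.jpg is just my example filename; use one of yours). Each encoding is a numpy array of 128 numbers:
encodings = get_face_encodings('images/taus.jpg')
print(len(encodings))      # number of faces found in the image - should be 1
print(encodings[0].shape)  # (128,) - each face encoding is 128 numbers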
Part 3a: Compare faces
Here we write the function that will compare a given face encoding with a list of known face encodings. It will return an array of boolean (True/False) values that indicate whether or not there was a match.
# This function takes a list of known face encodings and the face encoding to compare against them
def compare_face_encodings(known_faces, face):
    # Find the difference between each known face and the given face (that we are comparing)
    # Calculate the norm of the difference for each known face
    # Return an array with True/False values based on whether or not a known face matched the given face
    # A match occurs when the (norm of the) difference between a known face and the given face is less than or equal to the TOLERANCE value
    return (np.linalg.norm(known_faces - face, axis=1) <= TOLERANCE)
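To see what this function returns, here is a toy run with fake 3-number encodings (real ones have 128 numbers), assuming the TOLERANCE of 0.6 from Part 1 is defined:
known_faces = np.array([[0.2, 0.5, 0.1],   # say, Taus
                        [0.9, 0.1, 0.6]])  # say, John
face = np.array([0.22, 0.48, 0.09])        # a new photo of Taus
print(compare_face_encodings(known_faces, face))  # [ True False]
The distances work out to roughly 0.03 and 0.93, so only the first known face is within the tolerance.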
Part 3b: Find match
Here we write the function that will take a list of known face encodings, a list of names of people (corresponding to the list of known face encodings) and a face to find a match for. It will call the function from 3a and return the name of the person the given face matches with.
# This function returns the name of the person whose image matches with the given face (or 'Not Found')
# known_faces is a list of face encodings
# names is a list of the names of people (in the same order as the face encodings - to match the name with an encoding)
# face is the face we are looking for
def find_match(known_faces, names, face):
    # Call compare_face_encodings to get a list of True/False values indicating whether or not there's a match
    matches = compare_face_encodings(known_faces, face)
    # Return the name of the first match
    count = 0
    for match in matches:
        if match:
            return names[count]
        count += 1
    # Return 'Not Found' if no match was found
    return 'Not Found'
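Continuing the toy data from the 3a example, find_match simply puts names on top of those True/False values:
names = ['Taus', 'John']
print(find_match(known_faces, names, face))  # prints 'Taus'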
At this point, we have all the functions we need to run our program. It's time to code the final part of our application (which I'll divide into two separate parts).
Part 4a: Getting face encodings for all faces in the images folder
# Get path to all the known images
# Filtering on .jpg extension - so this will only work with JPEG images ending with .jpg
image_filenames = filter(lambda x: x.endswith('.jpg'), os.listdir('images/'))
# Sort in alphabetical order
image_filenames = sorted(image_filenames)
# Get full paths to images
paths_to_images = ['images/' + x for x in image_filenames]
# List of face encodings we have
face_encodings = []
# Loop over images to get the encoding one by one
for path_to_image in paths_to_images:
    # Get face encodings from the image
    face_encodings_in_image = get_face_encodings(path_to_image)
    # Make sure there's exactly one face in the image
    if len(face_encodings_in_image) != 1:
        print("Please change image: " + path_to_image + " - it has " + str(len(face_encodings_in_image)) + " faces; it can only have one")
        exit()
    # Append the face encoding found in that image to the list of face encodings we have
    face_encodings.append(face_encodings_in_image[0])
Part 4b: Matching each image in test folder with the known faces (one by one)
# Get path to all the test images
# Filtering on .jpg extension - so this will only work with JPEG images ending with .jpg
test_filenames = filter(lambda x: x.endswith('.jpg'), os.listdir('test/'))
# Get full paths to test images
paths_to_test_images = ['test/' + x for x in test_filenames]
# Get list of names of people by eliminating the .jpg extension from image filenames
names = [x[:-4] for x in image_filenames]
# Iterate over test images to find match one by one
for path_to_image in paths_to_test_images:
    # Get face encodings from the test image
    face_encodings_in_image = get_face_encodings(path_to_image)
    # Make sure there's exactly one face in the image
    if len(face_encodings_in_image) != 1:
        print("Please change image: " + path_to_image + " - it has " + str(len(face_encodings_in_image)) + " faces; it can only have one")
        exit()
    # Find match for the face encoding found in this test image
    match = find_match(face_encodings, names, face_encodings_in_image[0])
    # Print the path of test image and the corresponding match
    print(path_to_image, match)
That's it! Once you copy-paste all the code from parts 1 to 4b (one after another, in the same order as I wrote them) into the recognize.py file, you should be able to run it using your Terminal (Mac OS or Linux) or Command Prompt (Windows) by typing in these commands (replace {PROJECT_FOLDER_PATH} with the full path to your project folder; for me it is /Users/taus/face_recognition):
cd {PROJECT_FOLDER_PATH}
python recognize.py
This should give you an output similar to this:
('test/1.jpg', 'Motasim')
('test/2.jpg', 'Not Found')
('test/3.jpg', 'Taus')
('test/4.jpg', 'Sania')
('test/5.jpg', 'Mubin')
The name beside the filename shows the name of the person with whom the given face has matched. Note that this might not work well on every image. For best results with this code, use images where the person's face is clearly visible. Of course there are ways of making it more accurate (like changing our code to check against multiple images per person, or using jitters, etc.) but the point of this is just to give you a basic idea of how face recognition works.
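For example, the third argument we pass to compute_face_descriptor in get_face_encodings is the number of jitters: how many randomly perturbed copies of the face dlib encodes and averages. Raising it from 1 trades speed for a little accuracy; one possible tweak to the return line in Part 2 would be:
# num_jitters=10: encode 10 jittered copies of each face and average them (slower, slightly more accurate)
return [np.array(face_recognition_model.compute_face_descriptor(image, face_pose, 10)) for face_pose in shapes_faces]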
This post was inspired by Adam Geitgey, so special thanks to him for his blog post and GitHub repo on face recognition. Also, we are using dlib and some pre-trained models available on dlib's website, so kudos to the dlib authors for making these publicly accessible. My main goal was to introduce and explain a basic deep learning solution for face recognition. Of course, there are easier ways to do the same thing, but I thought I should do this part by part (and in detail) using dlib so you actually understand the different moving parts. There are other (non-deep-learning) ways of running face recognition too; feel free to look into them. The cool thing about this approach is that you can run it with just one or two images per person/subject (given that the model does a pretty good job of telling two faces apart).
Regardless, I hope you liked this post. Feel free to reach out with comments.