
Face analysis based on deep learning

With the advancement of deep learning technology, the performance of face analysis has also improved.
I will introduce a high-performing open-source face analysis library and explain how to use it.

 

InsightFace: 2D and 3D Face Analysis Project

The project provides various models; here I would like to introduce the buffalo_l model pack.
buffalo_l provides the bounding box, key points, detection score, 2D/3D landmarks, gender, age, embedding, and pose for each detected face.
 
I will demonstrate the process of face analysis using Python.

Python package install

Please install the following packages.
(You do not need to install packages that are already installed.)

  • pip install numpy
  • pip install opencv-python
  • pip install pillow
  • pip install insightface
  • pip install onnxruntime
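
You can also install them all at once with a single command:

pip install numpy opencv-python pillow insightface onnxruntime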

Model download

Download the model through the following link.

Unzip the downloaded file in the directory where the Python source code is located.
The tree structure should be as follows.

├─ source_code.py
└─ checkpoints
      └─ models
            └─ buffalo_l
                  ├─ 1k3d68.onnx
                  ├─ 2d106det.onnx
                  ├─ det_10g.onnx
                  ├─ genderage.onnx
                  └─ w600k_r50.onnx

Python code

The following code identifies faces and draws rectangles around them.
The information about the detected faces is contained in faces.
It can detect multiple faces, and since faces is a list, you can access each face's information using an index.

import cv2
import insightface
import onnxruntime
import numpy as np
from PIL import Image

# Use whatever ONNX Runtime execution providers are available (CPU, CUDA, ...)
providers = onnxruntime.get_available_providers()
face_analyser = insightface.app.FaceAnalysis(name="buffalo_l", root="./checkpoints", providers=providers)

# Prepare the models with a 320x320 detection input size
det_size = (320, 320)
face_analyser.prepare(ctx_id=0, det_size=det_size)

# Load the image with Pillow (RGB) and convert it to the BGR format OpenCV uses
img = Image.open('face.jpg')
img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)

# Run detection and analysis; faces is a list with one entry per detected face
faces = face_analyser.get(img)
print(faces)

if len(faces) >= 1:
    for face in faces:
        # Bounding box coordinates: (x1, y1) top-left, (x2, y2) bottom-right
        x1 = round(face['bbox'][0])
        y1 = round(face['bbox'][1])
        x2 = round(face['bbox'][2])
        y2 = round(face['bbox'][3])

        # Draw a green rectangle around each detected face
        color_bgr = (0, 255, 0)
        box_thickness = 3
        cv2.rectangle(img, (x1, y1), (x2, y2), color_bgr, box_thickness)

    cv2.imwrite('output.jpg', img)

else:
    print('Face not found.')

I ran the code, and it drew a rectangle around the detected face like this.

In the code above, each face contains the following nine types of analysis information.

  • face['bbox'] : Bounding box (rectangular face position)
  • face['kps'] : Key points (locations of the eyes, nose, and mouth)
  • face['det_score'] : Detection score (score indicating the likelihood that the detected object is a face)
  • face['landmark_2d_106'] : 106 2D face landmarks (points representing the eyes, eyebrows, nose, mouth, and facial contour)
  • face['landmark_3d_68'] : 68 3D face landmarks (points representing the eyes, eyebrows, nose, mouth, and facial contour)
  • face['pose'] : Pitch, Yaw, Roll (tilting the face up/down, rotating left/right, tilting left/right)
  • face['gender'] : Gender
  • face['age'] : Age
  • face['embedding'] : 512-dimensional embedding vector

Since we have already checked the bbox, let's look at the remaining eight.

kps (Key points)

kps contains five key points: the positions of both eyes, the nose, and the two corners of the mouth.
When dots are placed at the coordinates of kps, they are represented as follows.
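
As a minimal sketch, continuing from the script above (reusing the img and faces variables), the five points can be drawn as small red circles; the output file name is just one I chose for illustration.

for face in faces:
    # kps is a (5, 2) array: both eyes, the nose tip, and both mouth corners
    for (x, y) in face['kps']:
        cv2.circle(img, (int(x), int(y)), 3, (0, 0, 255), -1)

cv2.imwrite('output_kps.jpg', img)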

det_score (Detection score)

The det_score represents the likelihood that the detected object is a face.
It has a real-number value between 0 and 1, with values closer to 1 indicating a higher probability that the object is a face.
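
For example, continuing from the script above, you can keep only confident detections (the 0.5 threshold here is just an illustrative value):

confident_faces = [face for face in faces if face['det_score'] >= 0.5]
print(f'{len(confident_faces)} of {len(faces)} detections passed the threshold')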

landmark_2d_106

These are points representing the eyes, eyebrows, nose, mouth, and facial outline in a 2D coordinate system.
Here is how they are represented.
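
A minimal sketch for visualizing them, again continuing from the script above (landmark_2d_106 is a (106, 2) array of x, y coordinates; the output file name is arbitrary):

for face in faces:
    for (x, y) in face['landmark_2d_106']:
        cv2.circle(img, (int(x), int(y)), 2, (255, 0, 0), -1)

cv2.imwrite('output_landmark_2d_106.jpg', img)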

landmark_3d_68

These points represent the eyes, eyebrows, nose, mouth, and facial contours in a 3D coordinate system.
They are expressed as follows.
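
The array has shape (68, 3); a simple sketch, continuing from the script above, draws only the x and y components on the image, since z is the depth component and cannot be drawn directly:

for face in faces:
    for (x, y, z) in face['landmark_3d_68']:
        cv2.circle(img, (int(x), int(y)), 2, (0, 255, 255), -1)

cv2.imwrite('output_landmark_3d_68.jpg', img)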

pose

This indicates the Pitch, Yaw, and Roll information of a face.

  • Pitch: Tilt of the face up or down (positive: up, negative: down)
  • Yaw: Rotation of the face left or right (positive: left, negative: right)
  • Roll: Tilt of the face left or right (positive: left, negative: right)
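
Continuing from the script above, the three angles can be printed like this (assuming the order Pitch, Yaw, Roll described above):

for face in faces:
    pitch, yaw, roll = face['pose']
    print(f'pitch: {pitch:.1f}, yaw: {yaw:.1f}, roll: {roll:.1f}')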

gender

Here is the predicted gender.
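
The raw value is an integer; to my understanding, in buffalo_l 1 corresponds to male and 0 to female, but it is worth verifying this on your own data. Continuing from the script above:

for face in faces:
    print(face['gender'])   # 0 or 1
    print(face.sex)         # 'M' or 'F' (convenience property, if your insightface version provides it)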

age

Here is the estimated age.
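
For example, continuing from the script above:

for face in faces:
    print(face['age'])   # estimated age in years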

embedding

This is a unique 512-dimensional vector for a face.
To determine if two faces belong to the same person, you can calculate the similarity between the two facial embedding vectors.
For example, by calculating the cosine similarity, the closer it is to 1, the more likely it is that the faces are of the same person.
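
Below is a minimal sketch that compares two images by cosine similarity, reusing face_analyser from the script above. The file names face1.jpg and face2.jpg and the 0.5 threshold are only placeholders for illustration; a suitable threshold depends on your data, and each image is assumed to contain a single face.

def get_embedding(path):
    # read the image in BGR (the format the analyser expects) and take the first detected face
    img = cv2.imread(path)
    faces = face_analyser.get(img)
    return faces[0]['embedding']

emb1 = get_embedding('face1.jpg')
emb2 = get_embedding('face2.jpg')

# cosine similarity between the two 512-dimensional embedding vectors
similarity = np.dot(emb1, emb2) / (np.linalg.norm(emb1) * np.linalg.norm(emb2))
print(similarity)

if similarity > 0.5:   # illustrative threshold only
    print('Probably the same person.')
else:
    print('Probably different people.')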