Facial Emotion Recognition for Mental Health Monitoring

Precious Orekha
6 min read · Oct 30, 2023


A personal project as part of my KaggleX BIPOC Mentorship Program journey

Pictorial representation of practical testing of the model

Data Source: https://www.kaggle.com/datasets/jonathanoheix/face-expression-recognition-dataset

You will agree with me that facial expressions serve as both a natural means of conveying human emotions and a vital non-verbal communication method. The way emotional facial expressions are interpreted can be influenced by personality traits, such as neuroticism, which is one of the dimensions in the five-factor model of personality.

Neuroticism is associated with traits like anxiety, nervousness, and hostility.

People dealing with mental health issues often exhibit non-verbal signs of distress, such as changes in facial expression, body language, and overall demeanor. Understanding these non-verbal cues can help individuals and support networks recognize when someone is struggling and may need assistance or support.

With this concise introduction, we now have a glimpse of the concept of Facial Emotion Recognition. For the sake of clarity, here is a definition that resonates with me.

Facial Emotion Recognition (FER) is the technology that analyses facial expressions from both static images and videos in order to reveal information on one’s emotional state.

With that said, let’s move on to the core of the project. This topic emerged from a personal quest to explore non-verbal indicators of anxiety, depression, mood fluctuations, loneliness, and more, which are steadily evolving into prominent silent threats in today’s society. This project also serves as an educational journey into the intricacies of image processing in conjunction with computer vision and deep learning applications.

On a slightly different note, this project presented unique challenges. While pursuing my master’s degree in Data Science at Drexel University in the United States, I was juggling adjusting to new weather, time zone differences, and cultural immersion. Meeting school deadlines added to the busyness. This experience reinforced the importance of understanding and recognizing facial emotions, as cultural nuances and emotional expressions can vary significantly across different regions and contexts. It highlighted the need for accurate and adaptable emotion recognition technology to bridge these cultural and geographical gaps.

PROJECT LIFE CYCLE — CNN Model

Convolutional Neural Networks (CNNs) are a fundamental component of modern facial recognition systems. Their ability to automatically learn and extract relevant features from facial images, their robustness to variations, and their effectiveness in handling the complexities of facial recognition tasks were the reasons for choosing a CNN over models such as K-Nearest Neighbors (KNN), Support Vector Machines (SVMs), Multi-Layer Perceptrons, and Recurrent Neural Networks (RNNs).

Image source: Semantic Scholar

A simplified explanation of each stage in the CNN facial recognition project life cycle (a code sketch of the full pipeline follows the list):

  • Input Images: Start with photos of faces as our input data.
  • Haar Cascade: Detect faces in the image using a Haar cascade, a classical object-detection method for face detection.
  • Crop Image: Isolate the detected face for further analysis.
  • Convert to Grayscale (Resize 64x64): Simplify the image by converting it to grayscale and resizing it to a standard size (e.g., 64x64 pixels). Removing the color information simplifies the data and reduces the computational load.
  • Convolutional Layers (2x2): Convolutional layers are where the neural network “looks” at the image in small pieces. A 2x2 convolution scans a 2x2 grid of pixels at a time. These layers apply filters to detect patterns, edges, textures, or more complex features in the grayscale face image.
  • Max Pooling Layers (2x2): Max pooling is used to reduce the size of the feature maps produced by convolutional layers. A 2x2 max pooling layer selects the maximum value from each 2x2 grid. This reduces computational complexity and focuses on the most relevant features while maintaining spatial relationships in the data.
  • Fully Connected Layers (300 Nodes): Fully connected layers connect every neuron (node) from the previous layer to each neuron in the current layer. In the context of facial recognition, these layers analyze the features extracted by the convolutional and pooling layers and perform more complex feature analysis.
  • Softmax Classifier: The softmax classifier categorizes the face into one of seven emotion classes: Happy, Sad, Neutral, Angry, Disgust, Surprised, or Afraid. Softmax assigns a probability to each class, and the class with the highest probability is the predicted emotion for the detected face.
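To make these stages concrete, here is a minimal sketch of the pipeline in Python, assuming OpenCV for the Haar cascade step and TensorFlow/Keras for the network. The 64x64 grayscale input, 2x2 convolutions and pooling, 300-node dense layer, and seven-class softmax follow the list above; the filter counts, activation functions, and number of conv/pool blocks are my assumptions, since the post does not publish the full architecture.

```python
import cv2
import numpy as np
from tensorflow.keras import layers, models

# Haar cascade face detector shipped with OpenCV
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def detect_and_preprocess(image_path):
    """Detect the first face, crop it, convert to grayscale, resize to 64x64."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]              # take the first detected face
    face = gray[y:y + h, x:x + w]      # crop the face region
    face = cv2.resize(face, (64, 64))  # standardize to 64x64
    # Add a channel dimension and scale pixel values to [0, 1]
    return face.astype("float32")[..., np.newaxis] / 255.0

def build_model(num_classes=7):
    """Small CNN: conv/pool blocks, a 300-node dense layer, softmax over 7 emotions."""
    return models.Sequential([
        layers.Input(shape=(64, 64, 1)),
        layers.Conv2D(32, (2, 2), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (2, 2), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(300, activation="relu"),             # fully connected, 300 nodes
        layers.Dense(num_classes, activation="softmax"),  # emotion probabilities
    ])
```

At inference time, the predicted emotion is simply the class with the highest softmax probability, e.g. np.argmax(model.predict(face[np.newaxis])).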

Model Training and Validation

Model results

Plot of training and validation loss/accuracy

Live presentation of the FER model’s performance
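For completeness, here is a sketch of how curves like those above could be produced, assuming build_model() from the earlier sketch and the Kaggle dataset’s images/train and images/validation folder layout; the batch size, optimizer, and epoch count are illustrative choices, not the project’s actual settings.

```python
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Directories mirroring the Kaggle dataset's train/validation layout
train_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "images/train", target_size=(64, 64), color_mode="grayscale",
    class_mode="categorical", batch_size=64)
val_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "images/validation", target_size=(64, 64), color_mode="grayscale",
    class_mode="categorical", batch_size=64)

model = build_model()  # build_model() as defined in the pipeline sketch above
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(train_gen, validation_data=val_gen, epochs=30)

# Plot training vs. validation curves, as in the figure above
for metric in ("loss", "accuracy"):
    plt.figure()
    plt.plot(history.history[metric], label=f"train {metric}")
    plt.plot(history.history[f"val_{metric}"], label=f"val {metric}")
    plt.xlabel("epoch")
    plt.legend()
plt.show()
```

The live demonstration applies the same detect_and_preprocess() step to webcam frames and reports the argmax of the softmax output for each detected face.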

Summary

Facial expression prediction is crucial for communication and rehabilitation. This project uses neural networks to classify facial emotions, aiding behavioral studies.

It is essential to recognize that emotion recognition plays a pivotal role in various domains, including:

  • Health Care — mental health monitoring
  • Automotive Industry — detecting driver drowsiness, stress, etc.
  • Autism Spectrum Disorder (ASD) Diagnosis
  • Education — adaptive learning systems that assess students’ engagement, frustration, or confusion and adjust the content or teaching approach accordingly
  • Market Research — gauging consumer reactions to products, advertisements, and services
  • Security — enhancing security systems by helping identify suspicious behavior or individuals who might pose a threat based on their emotional cues


Appreciation

Undertaking this project was both a challenge and a valuable learning experience. It required me to navigate personal commitments, adapt to time zone differences, and manage the demands of my academic pursuit in the United States, where I am currently pursuing a master’s degree.

During this journey, my mentor Ayogeboh Epizitone played a pivotal role in ensuring the success of the project. Despite the challenges I faced, she provided unwavering guidance and understanding. Her insights and mentorship were instrumental in overcoming obstacles, making critical project decisions, and achieving the desired outcomes.

I must acknowledge the significant role played by the Kaggle Team (KaggleX) Mentorship program in making this project a reality. Through a range of virtual learning sessions, access to valuable free resources, thoughtful gifts, and the unwavering support of their dedicated staff, I am genuinely appreciative. The program’s contributions have been instrumental in the success of this project, and I extend my heartfelt gratitude to everyone involved.

THANK YOU!!!
