Comparison and Analysis of CNN Models to Improve a Facial Emotion Classification Accuracy for Koreans and East Asians

Jun-Hyeong Lee; Ki-Sang Song

doi:10.18517/ijaseit.14.3.18078



Jun-Hyeong Lee ⁽¹⁾, Ki-Sang Song ⁽²⁾

(1) Computer Education, Korea National University of Education, 28173, Republic of Korea

(2) Computer Education, Korea National University of Education, 28173, Republic of Korea


How to cite (IJASEIT):

J.-H. Lee and K.-S. Song, “Comparison and Analysis of CNN Models to Improve a Facial Emotion Classification Accuracy for Koreans and East Asians”, Int. J. Adv. Sci. Eng. Inf. Technol., vol. 14, no. 3, pp. 811–817, Jun. 2024.


Facial emotion recognition is a popular task in computer vision. Deep-learning-based face recognition techniques offer the best performance, but they require large amounts of labeled face data. The large-scale facial datasets currently available are predominantly Western and contain very few Asian faces. We found that models trained on these datasets were less accurate at identifying Asians than Westerners. Therefore, to improve accuracy for Asian faces, we compared and analyzed several previously studied CNN models. We also added Asian faces and face data captured in realistic situations to the existing dataset and compared the results. In the model comparison, the VGG16 and Xception models showed high prediction rates for facial emotion recognition, and the more diverse the dataset, the higher the prediction rate. The model trained on FER2013 had a relatively low prediction rate on the East Asian dataset. In contrast, the model trained on KFE had a relatively high prediction rate on FER2013. However, because the East Asian data are limited in number, these results should be interpreted with caution. This study confirmed that large CNN models can be used for facial emotion analysis, but that selecting an appropriate model is essential. It also confirmed once again that the prediction rate increases with more diverse datasets and with larger amounts of training data.
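The cross-dataset comparison described in the abstract (train on one dataset, then measure the prediction rate on another) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the dataset keys `set_a` and `set_b` (stand-ins for datasets such as FER2013 and KFE), the threshold rules used in place of trained CNNs, and all sample values are hypothetical placeholders chosen only to make the example runnable.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]          # (features, emotion label)
Predictor = Callable[[Sequence[float]], str]  # stand-in for a trained CNN


def accuracy(predict: Predictor, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        raise ValueError("cannot score an empty test set")
    correct = sum(1 for features, label in samples if predict(features) == label)
    return correct / len(samples)


def cross_dataset_accuracy(
    predictors: Dict[str, Predictor],
    test_sets: Dict[str, List[Sample]],
) -> Dict[Tuple[str, str], float]:
    """Score every predictor (keyed by its training set) on every test set."""
    return {
        (trained_on, tested_on): accuracy(predict, samples)
        for trained_on, predict in predictors.items()
        for tested_on, samples in test_sets.items()
    }


# Toy placeholders only: these are NOT the paper's data, models, or results.
toy_test_sets = {
    "set_a": [([0.9], "happy"), ([0.1], "sad"), ([0.8], "happy"), ([0.2], "sad")],
    "set_b": [([0.6], "happy"), ([0.4], "sad"), ([0.45], "happy"), ([0.3], "sad")],
}
# Hypothetical threshold rules standing in for models trained on each set.
toy_predictors = {
    "set_a": lambda x: "happy" if x[0] > 0.5 else "sad",
    "set_b": lambda x: "happy" if x[0] > 0.42 else "sad",
}

scores = cross_dataset_accuracy(toy_predictors, toy_test_sets)
for (trained_on, tested_on), acc in sorted(scores.items()):
    print(f"trained on {trained_on}, tested on {tested_on}: {acc:.2f}")
```

The result is a small table with one entry per (training set, test set) pair. The off-diagonal entries are the ones that show how well a model transfers to faces it was not trained on.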


This work is licensed under a Creative Commons Attribution 4.0 International License.
