Dynamic Sign Language Recognition Using Mediapipe Library and Modified LSTM Method

Ridwang (1), Amil Ahmad Ilham (2), Ingrid Nurtanio (3), Syafaruddin (4)
(1) Department of Electrical Engineering, Universitas Muhammadiyah Makassar, Makassar 90221, Indonesia
(2) Department of Informatics, Universitas Hasanuddin, Gowa, 92171, South Sulawesi, Indonesia
(3) Department of Informatics, Universitas Hasanuddin, Gowa, 92171, South Sulawesi, Indonesia
(4) Department of Electrical Engineering, Universitas Hasanuddin, Gowa, 92171, South Sulawesi, Indonesia
How to cite (IJASEIT):
Ridwang, et al. “Dynamic Sign Language Recognition Using Mediapipe Library and Modified LSTM Method”. International Journal on Advanced Science, Engineering and Information Technology, vol. 13, no. 6, Dec. 2023, pp. 2171-80, doi:10.18517/ijaseit.13.6.19401.
Hand gestures are a primary mode of human communication and interaction. While hand gesture recognition (HGR) can enhance user interaction in human-computer interaction (HCI), it can also help overcome language barriers. For example, HGR can be used to recognize sign language, a visual language expressed through hand movements, poses, and facial expressions that serves as the basic mode of communication for deaf people around the world. This research aims to create a new method for detecting dynamic hand movements, poses, and facial expressions in a sign language translation system. A modified Long Short-Term Memory (LSTM) approach and the MediaPipe library are used to recognize dynamic hand movements. In this study, twenty dynamic movements matching a common usage context were designed to address the challenge of identifying dynamic sign movements. Landmark sequences and image data are collected with MediaPipe Holistic, preprocessed, and trained with the modified LSTM method. The model is fitted on training and validation data and evaluated on a separate test set. Evaluation with a confusion matrix yielded an average training accuracy of 99.4% over the twenty trained words at 150 epochs. Per-word experiments showed a detection accuracy of 85%, while sentence-level experiments reached 80%. This work is a significant step toward improving the accuracy and practicality of dynamic sign language recognition systems, promising better communication and accessibility for deaf people.
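As a rough illustration of the pipeline the abstract describes (not the authors' released code), the following Python sketch shows how per-frame landmarks are typically extracted with MediaPipe Holistic and fed to a Keras LSTM classifier. The 30-frame window, the layer sizes, and the use of Keras are illustrative assumptions; only the landmark counts come from MediaPipe Holistic itself, and the 20-class output matches the twenty trained words.

import numpy as np
import mediapipe as mp
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

mp_holistic = mp.solutions.holistic
# Per frame (assumed usage): results = holistic.process(rgb_frame)
# with holistic = mp_holistic.Holistic(min_detection_confidence=0.5)

def extract_keypoints(results):
    """Flatten pose, face, and both-hand landmarks into one feature vector,
    zero-filling any component MediaPipe did not detect in this frame."""
    pose = np.array([[p.x, p.y, p.z, p.visibility]
                     for p in results.pose_landmarks.landmark]).flatten() \
        if results.pose_landmarks else np.zeros(33 * 4)
    face = np.array([[p.x, p.y, p.z]
                     for p in results.face_landmarks.landmark]).flatten() \
        if results.face_landmarks else np.zeros(468 * 3)
    lh = np.array([[p.x, p.y, p.z]
                   for p in results.left_hand_landmarks.landmark]).flatten() \
        if results.left_hand_landmarks else np.zeros(21 * 3)
    rh = np.array([[p.x, p.y, p.z]
                   for p in results.right_hand_landmarks.landmark]).flatten() \
        if results.right_hand_landmarks else np.zeros(21 * 3)
    return np.concatenate([pose, face, lh, rh])  # 1662 features per frame

SEQ_LEN, N_FEATURES, N_CLASSES = 30, 1662, 20  # assumed window; 20 words

model = Sequential([
    LSTM(64, return_sequences=True, activation='relu',
         input_shape=(SEQ_LEN, N_FEATURES)),
    LSTM(128, return_sequences=False, activation='relu'),
    Dense(64, activation='relu'),
    Dense(N_CLASSES, activation='softmax'),  # one probability per trained word
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])

Because each sign is a short sequence of per-frame landmark vectors rather than a single image, a recurrent layer such as an LSTM can model the temporal dependencies that distinguish dynamic signs with similar hand shapes.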


This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors who publish with this journal agree to the following terms:

    1. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
    2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
    3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).