SADY: Student Activity Detection Using YOLO-based Deep Learning Approach

Anagha Deshpande (1), Krishna Warhade (2)
(1),(2) School of Electronics and Communication Engineering, Dr. Vishwanath Karad MIT World Peace University, Pune-411038, Maharashtra, India
How to cite (IJASEIT):
Deshpande, Anagha, and Krishna Warhade. “SADY: Student Activity Detection Using YOLO-Based Deep Learning Approach”. International Journal on Advanced Science, Engineering and Information Technology, vol. 13, no. 4, July 2023, pp. 1501-9, doi:10.18517/ijaseit.13.4.18393.
Automating human activity recognition is one of computer vision's most appealing and practical research areas. In this article, we address the problem of video-based student activity detection. Student Activity Detection using YOLO (SADY) aims to recognize normal and abnormal student activities so that immediate intervention is possible in case of risk or necessity. We created a classroom dataset of around 220 recordings depicting seven student classroom activities. The YOLOv4-tiny model was retrained using 5000 labeled keyframes extracted from the training videos and was then tested on single and multiple activity detections. We present evaluation results for various values of hyperparameters, such as the confidence threshold and the Intersection over Union (IoU) threshold, for the proposed model. For the test videos, the model assigns a unique confidence score and action label to each frame by positioning recurrent activity labels. The proposed approach achieved a mean average precision (mAP) of 95% and a frame rate of 45 frames per second (FPS) on the student activity Class Room (CR) dataset, and an mAP of 95.18% on the LIRIS dataset. Experimental findings on the recorded Class Room dataset and the publicly available LIRIS dataset show that our approach outperforms existing approaches in both recognition accuracy and speed. These results imply that the proposed framework could effectively monitor students' activities in schools, colleges, and universities.
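To illustrate the two hyperparameters evaluated above, the following sketch shows how an IoU threshold and a confidence threshold are typically applied when filtering per-frame detections. This is an illustrative example only, not the authors' implementation; the function names and the detection tuple layout are our own assumptions.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def filter_detections(detections, conf_threshold=0.5):
    """Keep only detections above the confidence threshold.

    Each detection is a (box, confidence, activity_label) tuple;
    this tuple layout is a hypothetical one chosen for illustration.
    """
    return [d for d in detections if d[1] >= conf_threshold]

# A predicted box counts as a true positive when its IoU with a
# ground-truth box meets the IoU threshold (e.g. 0.5); mAP is then
# averaged over the activity classes.
```

Raising the confidence threshold trades recall for precision, while the IoU threshold controls how tightly a predicted box must localize an activity before it counts as correct.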

