Lightweight Fuss-Free Network-Based Crowd Counting Model Using Knowledge Distillation

Chuho Yi (1), Jungwon Cho (2)
(1) Department of AI Convergence, Hanyang Women's University, Seoul, Republic of Korea
(2) Department of Computer Education, Jeju National University, Jeju, Republic of Korea
How to cite (IJASEIT):
[1] C. Yi and J. Cho, “Lightweight Fuss-Free Network-Based Crowd Counting Model Using Knowledge Distillation,” Int. J. Adv. Sci. Eng. Inf. Technol., vol. 15, no. 3, pp. 1007–1012, Jun. 2025.
This paper presents FFNet-S, a lightweight crowd counting model built on the simple and efficient architecture of FFNet and enhanced via knowledge distillation (KD). The student model uses MobileNetV3 as its backbone while preserving the multi-scale feature fusion structure of FFNet. To guide the student effectively, a composite distillation loss is introduced that combines soft-target regression, intermediate feature alignment, and attention transfer. A two-stage training strategy is adopted: initial training on the ground truth ensures stable convergence, after which the gradual incorporation of the distillation losses enhances performance. Experiments on benchmark datasets, including ShanghaiTech Part A (SHA) and Part B (SHB), show that FFNet-S is over 90% smaller than the teacher model while achieving comparable accuracy. Moreover, FFNet-S performs inference in real time, making it suitable for deployment on edge devices with limited computational resources. The proposed approach demonstrates that a carefully designed KD framework enables compact models to match the capacity of larger, more complex networks without a significant loss of accuracy. By balancing speed, accuracy, and efficiency, FFNet-S is well suited to real-world scenarios such as surveillance systems, drones, and Internet of Things platforms. We present a practical and scalable solution for efficient crowd counting that encourages further exploration of lightweight models for computer vision tasks under resource constraints.
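The composite distillation loss described above can be sketched as follows. This is a minimal NumPy illustration only: the loss weights (`alpha`, `beta`, `gamma`), the choice of mean squared error for each term, and the attention formulation (normalized spatial maps from channel-wise summed squared activations, in the spirit of attention-transfer KD) are assumptions for illustration, not the authors' exact recipe.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

def attention_map(feat):
    """Collapse a (C, H, W) feature tensor into a normalized spatial map."""
    amap = np.sum(feat ** 2, axis=0)          # sum of squares over channels
    norm = np.linalg.norm(amap)
    return amap / norm if norm > 0 else amap

def composite_kd_loss(student_density, teacher_density, gt_density,
                      student_feat, teacher_feat,
                      alpha=0.5, beta=0.1, gamma=0.1):
    """Weighted sum of four terms; the weights here are illustrative."""
    l_gt   = mse(student_density, gt_density)        # ground-truth regression
    l_soft = mse(student_density, teacher_density)   # soft-target regression
    l_feat = mse(student_feat, teacher_feat)         # feature alignment
    l_att  = mse(attention_map(student_feat),
                 attention_map(teacher_feat))        # attention transfer
    return l_gt + alpha * l_soft + beta * l_feat + gamma * l_att
```

In the two-stage strategy described in the abstract, one would train with `alpha = beta = gamma = 0` in the first stage (ground truth only) and then gradually ramp these weights up in the second stage.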



This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors who publish with this journal agree to the following terms:

    1. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
    2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
    3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).