Establishment of a Real-Time Risk Assessment and Preventive Safety Management System in Industrial Environments Utilizing Multimodal Data and Advanced Deep Reinforcement Learning Techniques

Hyun Sim (1), Hyunwook Kim (2)
(1) Department of Smart Agriculture, Sunchon National University, Republic of Korea
(2) Kornerstone Co., Ltd., Suncheon City, Republic of Korea
How to cite (IJASEIT):
H. Sim and H. Kim, “Establishment of a Real-Time Risk Assessment and Preventive Safety Management System in Industrial Environments Utilizing Multimodal Data and Advanced Deep Reinforcement Learning Techniques”, Int. J. Adv. Sci. Eng. Inf. Technol., vol. 15, no. 1, pp. 328–337, Feb. 2025.
This study proposes a new paradigm for real-time, predictive, and multidimensional risk assessment in industrial environments by leveraging multimodal data (video, audio, and environmental sensors). Existing risk assessment systems typically rely on a single data source or on subjective judgment, making it difficult to adapt swiftly to complex workplace changes or to respond in real time. To address these limitations, we constructed a large-scale multimodal dataset of approximately 10 TB, employing real-time streaming-based preprocessing and data synchronization. This approach integrates diverse data sources, including high-resolution cameras, high-sensitivity microphones, and environmental sensors (temperature, humidity, vibration), and applies data augmentation techniques such as AutoAugment and MixUp to build models robust to varying environmental conditions. We adopted a hybrid analytical algorithm combining Vision Transformer (ViT) and YOLOv8, achieving high accuracy (over 95%) with real-time processing (average response time under one second). Additionally, we applied machine learning algorithms such as SVM, Random Forest, and K-Means to detect anomalies in audio and environmental sensor data, thereby identifying latent risk factors. Experimental results demonstrate multifaceted performance improvements over conventional approaches, including a more than 15% increase in accuracy, an approximately 30% reduction in response time, an approximately 20% reduction in power consumption, and user satisfaction exceeding 90%. These results were verified across various industrial settings (chemical, manufacturing, and logistics), highlighting the system's capacity to detect complex risk factors and respond proactively.
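The MixUp augmentation cited above blends pairs of training examples and their one-hot labels with a Beta-distributed mixing coefficient. The following is a minimal, illustrative sketch in plain Python, not the authors' implementation; the feature dimensions, class count, and alpha value are hypothetical:

```python
# Minimal MixUp sketch: blend two samples and their one-hot labels with a
# coefficient lam drawn from Beta(alpha, alpha).
import random

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Return a convex combination of two feature vectors and their labels."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

# Example: mix two hypothetical 3-d frame features with 2-class one-hot labels.
xa, ya = [0.9, 0.1, 0.4], [1.0, 0.0]
xb, yb = [0.2, 0.8, 0.6], [0.0, 1.0]
x_mix, y_mix, lam = mixup(xa, ya, xb, yb)
assert abs(sum(y_mix) - 1.0) < 1e-9  # mixed one-hot labels still sum to 1
```

In practice this kind of blending is applied to image tensors or synchronized sensor feature vectors inside the training loop, typically with alpha in roughly the 0.1–0.4 range.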
By seamlessly integrating multimodal data analysis, state-of-the-art deep learning models (ViT, YOLOv8), and reinforcement learning-based response strategies, we have demonstrated a transition from traditional, static, and retrospective risk assessment to an intelligent, real-time, and predictive safety management framework.
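One way to realize the K-Means-based anomaly screening described in the abstract is to fit centroids on sensor readings from normal operation and flag new readings that lie far from every centroid. The sketch below is a stdlib-only illustration under that assumption; the sensor values, cluster count, and threshold are hypothetical, not the paper's:

```python
# Illustrative K-Means anomaly screening on (temperature, humidity) readings:
# readings far from every centroid learned on normal data are flagged.
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(cluster):
    dims = len(cluster[0])
    return [sum(p[d] for p in cluster) / len(cluster) for d in range(dims)]

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids

def anomaly_scores(points, centroids):
    """Distance to the nearest centroid: large values suggest unusual readings."""
    return [min(dist(p, c) for c in centroids) for p in points]

# Hypothetical (temperature in Celsius, humidity in %) readings from normal operation.
normal_history = [[25.0, 40.0], [25.5, 41.0], [24.8, 39.5],
                  [30.0, 60.0], [30.2, 59.5], [29.8, 60.4]]
centroids = kmeans(normal_history, k=2)

# Score incoming readings against the learned normal regimes.
new_readings = [[25.1, 40.2], [80.0, 5.0]]
flags = [s > 10.0 for s in anomaly_scores(new_readings, centroids)]
# flags -> [False, True]: the second reading is far from both normal regimes
```

Fitting centroids on held-out normal data (rather than on the stream being scored) keeps an outlier from claiming its own centroid; in a deployed system the threshold would be calibrated per sensor on normal-operation data.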


This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors who publish with this journal agree to the following terms:

    1. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
    2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
    3. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).