Phishing Domain Detection Using Machine Learning Algorithms

Dinny Komalasari; Tri Basuki Kurniawan; Deshinta Arrova Dewi; Mohd Zaki Zakaria; Zubaile Abdullah; Alde Alanda

doi:10.18517/ijaseit.15.1.12553

Dinny Komalasari ⁽¹⁾, Tri Basuki Kurniawan ⁽²⁾, Deshinta Arrova Dewi ⁽³⁾, Mohd Zaki Zakaria ⁽⁴⁾, Zubaile Abdullah ⁽⁵⁾, Alde Alanda ⁽⁶⁾

(1) Faculty of Vocational, Universitas Bina Darma, Palembang, Indonesia

(2) Postgraduate Program, Universitas Bina Darma, Palembang, Indonesia

(3) Faculty of Data Science and Information Technology, INTI International University, Nilai, Malaysia

(4) Faculty of Computer & Mathematical Sciences, Universiti Teknologi MARA, Malaysia

(5) Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Johor, Malaysia

(6) Department of Information Technology, Politeknik Negeri Padang, Padang, Indonesia

How to cite (IJASEIT):

D. Komalasari, T. B. Kurniawan, D. A. Dewi, M. Z. Zakaria, Z. Abdullah, and A. Alanda, “Phishing Domain Detection Using Machine Learning Algorithms,” Int. J. Adv. Sci. Eng. Inf. Technol., vol. 15, no. 1, pp. 318–327, Feb. 2025.

Phishing, a prevalent cyber threat, continues to jeopardize sensitive information by exploiting the vulnerabilities of digital platforms. This research investigates the escalating danger of phishing attacks, focusing on deceptive websites known as phishing domains. Using machine learning algorithms, particularly supervised and unsupervised learning techniques, the study aims to proactively identify and classify these malicious domains by analyzing factors such as domain names, online content, SSL certificates, and historical data. The proposed solution develops prediction models using decision trees, random forests, support vector machines, and gradient boosting; of these, gradient boosting achieved the highest accuracy, at 92%. The system assigns risk scores to domains based on properties such as registration details and SSL certificate validity, which supports real-time identification of potential phishing activity. The research addresses the need for data security in the face of phishing threats affecting individuals and businesses. To keep the system adaptable and resilient as phishing techniques evolve, the study recommends continuous model training, regular updates, diversified dataset sources, and integration with existing security infrastructure. Overall, the study contributes to ongoing cybersecurity efforts by offering a proactive defense against phishing attacks.
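As a rough illustration of the risk-scoring idea described in the abstract, the sketch below combines a few domain properties (registration age, SSL certificate validity, and simple name patterns) into a score between 0 and 1. It is not the authors' system: the feature names, thresholds, and weights are hypothetical and set by hand, whereas the study derives its predictions from trained models such as gradient boosting. A hand-weighted logistic function is swapped in here only to keep the example self-contained.

```python
import math
from dataclasses import dataclass


@dataclass
class DomainFeatures:
    """Hypothetical feature record; field names are illustrative only."""
    name: str            # fully qualified domain name
    age_days: int        # days since registration (e.g. from WHOIS-style data)
    ssl_valid: bool      # whether the certificate currently validates
    ssl_days_left: int   # days until the certificate expires


# Illustrative weights chosen by hand. They are NOT learned from data
# and are NOT taken from the paper.
WEIGHTS = {
    "young_domain": 1.5,
    "invalid_ssl": 2.0,
    "short_lived_cert": 0.5,
    "many_hyphens": 0.8,
    "long_name": 0.6,
}
BIAS = -2.0


def risk_score(d: DomainFeatures) -> float:
    """Return a score in (0, 1); higher means more phishing-like."""
    # Each signal is a boolean test with an assumed threshold.
    signals = {
        "young_domain": d.age_days < 30,
        "invalid_ssl": not d.ssl_valid,
        "short_lived_cert": d.ssl_valid and d.ssl_days_left < 14,
        "many_hyphens": d.name.count("-") >= 2,
        "long_name": len(d.name) > 30,
    }
    # Sum the weights of the signals that fired, then squash with a sigmoid.
    z = BIAS + sum(WEIGHTS[key] for key, fired in signals.items() if fired)
    return 1.0 / (1.0 + math.exp(-z))


established = DomainFeatures(
    "example.org", age_days=4000, ssl_valid=True, ssl_days_left=200
)
suspicious = DomainFeatures(
    "secure-login-account-verify.example.top",
    age_days=3, ssl_valid=False, ssl_days_left=0,
)

print(f"{established.name}: {risk_score(established):.2f}")
print(f"{suspicious.name}: {risk_score(suspicious):.2f}")
```

With these assumed weights, the long-registered domain with a valid certificate scores well below 0.5, while the newly registered, hyphen-heavy domain with an invalid certificate scores well above it. In a trained system the weights would come from labeled data rather than manual choice.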

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors who publish with this journal agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see The Effect of Open Access).