A Study on Browser Fingerprinting Uniqueness Using Clustering Methods and Entropy Validation

Vicki Wei Qi Lee (1), Shih Yin Ooi (2), Ying Han Pang (3), Kiu Nai Pau (4)
(1) Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka, Malaysia
(2) Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka, Malaysia
(3) Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka, Malaysia
(4) Faculty of Information Science and Technology, Multimedia University, Jalan Ayer Keroh Lama, Bukit Beruang, Melaka, Malaysia
How to cite (IJASEIT):
Lee, Vicki Wei Qi, et al. “A Study on Browser Fingerprinting Uniqueness Using Clustering Methods and Entropy Validation”. International Journal on Advanced Science, Engineering and Information Technology, vol. 14, no. 6, Dec. 2024, pp. 1991-00, doi:10.18517/ijaseit.14.6.16396.
Browser fingerprinting is closely tied to privacy because it identifies a user by gathering data about the browser's configuration. These configuration values, known as attributes, are what make a user identifiable. Web browsers explicitly disclose information about the host system to websites, including attributes such as screen resolution, local time, and operating system (OS) version. Because each browser exposes a different combination of attribute values, and that combination can make it unique, it is essential to understand the attributes well. This paper focuses on a method for collecting browser fingerprint data without compromising personal information. One motivation is to turn the collected data into an easily accessible raw dataset that industry can use in future research projects. The study also explores the use of Shannon entropy to reveal distinctive attributes in browser fingerprinting, finding that higher entropy values correlate with more distinct and recognizable fingerprints. A further aim is to discover, using a clustering algorithm, which attribute yields the highest uniqueness. Experimental results show that a highly unique attribute is hard to cluster into groups: under a clustering algorithm, such attributes produce a high number of incorrectly clustered instances.
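
To illustrate the entropy idea in the abstract, the following is a minimal sketch of how Shannon entropy could be computed per attribute. It is not the authors' implementation. The records, the attribute names (`screen_resolution`, `os`, `timezone`), and the helper `shannon_entropy` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy in bits: H = sum(p * log2(1 / p)) over observed values."""
    counts = Counter(values)
    total = len(values)
    # total / c equals 1 / p, so each term is p * log2(1 / p).
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical toy fingerprints (assumption: not data from the study).
fingerprints = [
    {"screen_resolution": "1920x1080", "os": "Windows", "timezone": "UTC+8"},
    {"screen_resolution": "1366x768", "os": "Windows", "timezone": "UTC+8"},
    {"screen_resolution": "2560x1440", "os": "macOS", "timezone": "UTC+8"},
    {"screen_resolution": "1440x900", "os": "macOS", "timezone": "UTC+8"},
]

# Entropy of each attribute across all collected records.
entropy_by_attribute = {
    name: shannon_entropy([record[name] for record in fingerprints])
    for name in fingerprints[0]
}

for name, bits in sorted(entropy_by_attribute.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {bits:.2f} bits")
# screen_resolution: 2.00 bits  (every value differs, most distinguishing)
# os: 1.00 bits                 (two equally common values)
# timezone: 0.00 bits           (one shared value, no distinguishing power)
```

In this toy data, the attribute whose values never repeat reaches the maximum entropy for four records (log2 of 4, i.e. 2 bits), while an attribute shared by every record contributes nothing to telling browsers apart.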

This work is licensed under a Creative Commons Attribution 4.0 International License.