Synthetic Browsing Histories for 50 Countries Worldwide: Datasets for Research, Development, and Education

Bibliographic Details
Main Authors: Dan Komosny, Saeed Ur Rehman, Muhammad Sohaib Ayub
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-04407-z
collection DOAJ
description Abstract Browsing histories can be a valuable resource for cybersecurity, research, and testing. Individuals are often reluctant to share their browsing histories online, and the use of personal data requires obtaining signed informed consent. Research shows that anonymized histories can lead to re-identification, nullifying the anonymity promised by informed consent. In this work, we present 500 synthetic browsing histories valid for 50 countries worldwide. The synthetic histories are compiled based on real browsing data using a series of transformation criteria, including website content, popularity, locality, and language, ensuring their validity for the respective countries. Each history maintains the order of webpage accesses and covers a one-month period. The motivation for publishing this dataset arises from the community’s call for browsing histories from different countries for research, development, and education. The published synthetic browsing histories can be used for any purpose without legal restrictions.
id doaj-art-f4c24fd09a974f97831f854779798b08
institution Kabale University
issn 2052-4463
spelling doaj-art-f4c24fd09a974f97831f854779798b08
2025-01-26T12:14:33Z
eng
Nature Portfolio
Scientific Data
2052-4463
2025-01-01
121111
10.1038/s41597-025-04407-z
Synthetic Browsing Histories for 50 Countries Worldwide: Datasets for Research, Development, and Education
Dan Komosny (Department of Telecommunications, Brno University of Technology)
Saeed Ur Rehman (College of Science and Engineering, Flinders University)
Muhammad Sohaib Ayub (School of Science and Engineering, Lahore University of Management Sciences)
https://doi.org/10.1038/s41597-025-04407-z