Synthetic Browsing Histories for 50 Countries Worldwide: Datasets for Research, Development, and Education
Abstract: Browsing histories can be a valuable resource for cybersecurity, research, and testing. Individuals are often reluctant to share their browsing histories online, and the use of personal data requires obtaining signed informed consent. Research shows that anonymized histories can lead to re-...
Main Authors: | Dan Komosny, Saeed Ur Rehman, Muhammad Sohaib Ayub |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Scientific Data |
Online Access: | https://doi.org/10.1038/s41597-025-04407-z |
_version_ | 1832586098766774272 |
---|---|
author | Dan Komosny Saeed Ur Rehman Muhammad Sohaib Ayub |
collection | DOAJ |
description | Abstract: Browsing histories can be a valuable resource for cybersecurity, research, and testing. Individuals are often reluctant to share their browsing histories online, and the use of personal data requires obtaining signed informed consent. Research shows that anonymized histories can lead to re-identification, nullifying the anonymity promised by informed consent. In this work, we present 500 synthetic browsing histories valid for 50 countries worldwide. The synthetic histories are compiled based on real browsing data using a series of transformation criteria, including website content, popularity, locality, and language, ensuring their validity for the respective countries. Each history maintains the order of webpage accesses and covers a one-month period. The motivation for publishing this dataset arises from the community’s call for browsing histories from different countries for research, development, and education. The published synthetic browsing histories can be used for any purpose without legal restrictions. |
format | Article |
id | doaj-art-f4c24fd09a974f97831f854779798b08 |
institution | Kabale University |
issn | 2052-4463 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Data |
spelling | doaj-art-f4c24fd09a974f97831f854779798b08; 2025-01-26T12:14:33Z; eng; Nature Portfolio; Scientific Data; 2052-4463; 2025-01-01; 121111; 10.1038/s41597-025-04407-z; Synthetic Browsing Histories for 50 Countries Worldwide: Datasets for Research, Development, and Education; Dan Komosny (0), Saeed Ur Rehman (1), Muhammad Sohaib Ayub (2); 0: Department of Telecommunications, Brno University of Technology; 1: College of Science and Engineering, Flinders University; 2: School of Science and Engineering, Lahore University of Management Sciences; https://doi.org/10.1038/s41597-025-04407-z |
title | Synthetic Browsing Histories for 50 Countries Worldwide: Datasets for Research, Development, and Education |
url | https://doi.org/10.1038/s41597-025-04407-z |