An Optimized Sanitization Approach for Minable Data Publication
Minable data publication is ubiquitous because it lets commercial companies share and trade data and thereby facilitates data-driven tasks. Unfortunately, minable data publication is often implemented by publishers with limited privacy concerns such that the pu...
Main Authors: | Fan Yang, Xiaofeng Liao |
---|---|
Format: | Article |
Language: | English |
Published: | Tsinghua University Press, 2022-09-01 |
Series: | Big Data Mining and Analytics |
Subjects: | data publication; data sanitization; association rules hiding; evolutionary algorithm |
Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2022.9020007 |
_version_ | 1832572940335448064 |
---|---|
author | Fan Yang; Xiaofeng Liao |
author_facet | Fan Yang; Xiaofeng Liao |
author_sort | Fan Yang |
collection | DOAJ |
description | Minable data publication is ubiquitous because it lets commercial companies share and trade data and thereby facilitates data-driven tasks. Unfortunately, publishers often give limited attention to privacy, so the published dataset can also be mined by malicious entities. Because the published data may contain sensitive information, this risk discourages minable data publication, and approaches for reducing privacy leakage are urgently needed. To this end, we propose an optimized sanitization approach for minable data publication, named SA-MDP. SA-MDP supports association rule mining while protecting specific rules from disclosure. It treats minable data publication as a trade-off between data utility and data privacy and addresses it with a customized particle swarm optimization (PSO) algorithm whose objective is determined by both. Specifically, new particles are produced either by random mutation or by learning from the best particle, so that SA-MDP can avoid solutions becoming trapped in local optima. In addition, a fitness function guides the particles toward the optimal solution, and a preprocessing step applied before the evolution process of the customized PSO algorithm improves the convergence rate. Finally, SA-MDP is evaluated on several datasets, and the experimental results demonstrate its effectiveness and efficiency. |
format | Article |
id | doaj-art-43c9f6e945a44f36aaa33dc96a44aa6d |
institution | Kabale University |
issn | 2096-0654 |
language | English |
publishDate | 2022-09-01 |
publisher | Tsinghua University Press |
record_format | Article |
series | Big Data Mining and Analytics |
spelling | doaj-art-43c9f6e945a44f36aaa33dc96a44aa6d; 2025-02-02T06:14:03Z; eng; Tsinghua University Press; Big Data Mining and Analytics; 2096-0654; 2022-09-01; vol. 5, no. 3, pp. 257-269; 10.26599/BDMA.2022.9020007; An Optimized Sanitization Approach for Minable Data Publication; Fan Yang (College of Computer Science, Chongqing University, Chongqing 400044, China); Xiaofeng Liao (College of Computer Science, Chongqing University, Chongqing 400044, China); https://www.sciopen.com/article/10.26599/BDMA.2022.9020007; data publication; data sanitization; association rules hiding; evolutionary algorithm |
spellingShingle | Fan Yang; Xiaofeng Liao; An Optimized Sanitization Approach for Minable Data Publication; Big Data Mining and Analytics; data publication; data sanitization; association rules hiding; evolutionary algorithm |
title | An Optimized Sanitization Approach for Minable Data Publication |
title_full | An Optimized Sanitization Approach for Minable Data Publication |
title_fullStr | An Optimized Sanitization Approach for Minable Data Publication |
title_full_unstemmed | An Optimized Sanitization Approach for Minable Data Publication |
title_short | An Optimized Sanitization Approach for Minable Data Publication |
title_sort | optimized sanitization approach for minable data publication |
topic | data publication; data sanitization; association rules hiding; evolutionary algorithm |
url | https://www.sciopen.com/article/10.26599/BDMA.2022.9020007 |
work_keys_str_mv | AT fanyang anoptimizedsanitizationapproachforminabledatapublication AT xiaofengliao anoptimizedsanitizationapproachforminabledatapublication AT fanyang optimizedsanitizationapproachforminabledatapublication AT xiaofengliao optimizedsanitizationapproachforminabledatapublication |
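
The description field in this record outlines the approach only at a high level: particles are produced by random mutation or by learning from the best particle, and a fitness function balances data utility against data privacy. The Python sketch below illustrates that general idea and is not the authors' SA-MDP implementation. Everything concrete in it is an assumption made for illustration: a particle is a set of item deletions, a rule counts as hidden once its support or confidence drops below a threshold, utility loss is approximated by the number of deleted items, and the names `candidates`, `fitness`, `sanitize`, and `penalty` are hypothetical.

```python
import random
from typing import FrozenSet, List, Tuple

Transaction = FrozenSet[str]
Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (antecedent, consequent)
Deletion = Tuple[int, str]                    # (transaction index, item to remove)
Particle = FrozenSet[Deletion]


def support(db: List[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in db) / len(db)


def confidence(db: List[Transaction], rule: Rule) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    antecedent, consequent = rule
    base = support(db, antecedent)
    return support(db, antecedent | consequent) / base if base else 0.0


def apply_deletions(db: List[Transaction], particle: Particle) -> List[Transaction]:
    """Return a copy of the database with the particle's deletions applied."""
    out = [set(t) for t in db]
    for idx, item in particle:
        out[idx].discard(item)
    return [frozenset(t) for t in out]


def candidates(db: List[Transaction], sensitive: List[Rule]) -> List[Deletion]:
    """Assumed pruning step: only consequent items in supporting transactions."""
    return sorted({(i, item)
                   for i, t in enumerate(db)
                   for a, c in sensitive if (a | c) <= t
                   for item in c})


def fitness(db, particle, sensitive, min_sup, min_conf, penalty=100.0) -> float:
    """Lower is better: exposed sensitive rules (privacy) plus deletions (utility)."""
    sanitized = apply_deletions(db, particle)
    exposed = sum(support(sanitized, a | c) >= min_sup
                  and confidence(sanitized, (a, c)) >= min_conf
                  for a, c in sensitive)
    return penalty * exposed + len(particle)


def sanitize(db, sensitive, min_sup, min_conf, swarm=20, steps=200, seed=0):
    """Search for a small deletion set that hides every sensitive rule."""
    rng = random.Random(seed)
    pool = candidates(db, sensitive)
    if not pool:  # no transaction supports a sensitive rule
        return list(db), frozenset()

    def score(p: Particle) -> float:
        return fitness(db, p, sensitive, min_sup, min_conf)

    particles = [frozenset(d for d in pool if rng.random() < 0.5)
                 for _ in range(swarm)]
    best = min(particles, key=score)
    for _ in range(steps):
        updated = []
        for p in particles:
            d = rng.choice(pool)
            if rng.random() < 0.5:
                q = p ^ {d}  # random mutation: flip one deletion decision
            else:
                # learn from the best particle: copy its decision for d
                q = (p | {d}) if d in best else (p - {d})
            updated.append(q if score(q) <= score(p) else p)
        particles = updated
        best = min(particles + [best], key=score)
    return apply_deletions(db, best), best


db = [frozenset("abc"), frozenset("ab"), frozenset("ac"),
      frozenset("bc"), frozenset("ab")]
sensitive = [(frozenset("a"), frozenset("b"))]
sanitized, deletions = sanitize(db, sensitive, min_sup=0.5, min_conf=0.7)
print(sorted(deletions), support(sanitized, frozenset("ab")))
```

In this toy database the rule a → b starts with support 0.6, so removing `b` from a single supporting transaction is enough to push it below the assumed 0.5 threshold. The `penalty` weight is the knob that trades privacy against utility in this sketch: a large value forces every sensitive rule to be hidden before the number of deletions is minimized.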