A CAE model-based secure deduplication method

Bibliographic Details
Main Authors: Chunbo Wang, Guoying Zhang, Hui Qi, Bin Chen
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-09788-0
Description
Summary: Cloud storage services are widely used because they are convenient and flexible. However, the large amount of duplicate data in the cloud imposes a significant storage burden and increases the risk of privacy breaches. Random Message Locked Encryption (R-MLE) is an effective tool for secure deduplication of cloud data, but because it is based on bilinear mapping, comparing fingerprint tags during deduplication incurs substantial computational overhead. To address this issue, we propose a secure deduplication method based on an Autoencoder model: the summary tags generated by the model reduce the number of fingerprint tag comparisons, thereby improving deduplication efficiency. Building on this, we further introduce a secure deduplication method based on a Convolutional Autoencoder (CAE) model, which uses convolution and pooling operations to reduce the number of model parameters, thereby decreasing computational and storage overhead and mitigating overfitting. Experiments on a source code dataset indicate that the proposed approach yields higher deduplication efficiency, lower model storage requirements, and a more uniform distribution.
ISSN: 2045-2322
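
The summary describes using a short summary tag to cut down the number of expensive fingerprint tag comparisons. The Python sketch below illustrates only that filtering idea and is not the authors' method. Everything in it is an assumption made for illustration: `summary_tag` derives a bucket number from a SHA-256 digest instead of a trained autoencoder, and `_expensive_equal` compares plain SHA-256 fingerprints instead of performing the bilinear-pairing check that R-MLE would require. The class name `DedupIndex` and its fields are hypothetical.

```python
import hashlib
from collections import defaultdict


def summary_tag(data: bytes, bits: int = 8) -> int:
    """Hypothetical stand-in for a model-generated summary tag.

    Assumption: a hash prefix replaces the (C)AE encoder output.
    `bits` must be between 1 and 8.
    """
    digest = hashlib.sha256(b"summary:" + data).digest()
    return digest[0] >> (8 - bits)


class DedupIndex:
    """Stores fingerprints grouped by summary tag (illustrative only)."""

    def __init__(self, bits: int = 8):
        self.bits = bits
        self.buckets = defaultdict(list)  # summary tag -> fingerprints
        self.comparisons = 0              # count of "expensive" checks

    def _expensive_equal(self, fp_a: bytes, fp_b: bytes) -> bool:
        # Assumption: stands in for the costly fingerprint tag comparison.
        self.comparisons += 1
        return fp_a == fp_b

    def add(self, data: bytes) -> bool:
        """Return True if data is new, False if it is a duplicate."""
        fingerprint = hashlib.sha256(data).digest()
        bucket = self.buckets[summary_tag(data, self.bits)]
        # Only fingerprints sharing the summary tag are compared.
        for stored in bucket:
            if self._expensive_equal(stored, fingerprint):
                return False
        bucket.append(fingerprint)
        return True


index = DedupIndex()
for i in range(200):
    index.add(f"file-{i}".encode())
is_new = index.add(b"file-7")  # re-upload of an existing item
print(is_new, index.comparisons)
```

Without the summary tag, each upload would be checked against every stored fingerprint; with it, the check is confined to one bucket, so the comparison count depends on how evenly the tags spread items across buckets.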