StegGPT: A Novel Foundation-Model-Based Character-Level Linguistic Steganography Method Utilizing Large Language Models

Bibliographic Details
Main Authors: Omer Farooq Ahmed Adeeb, Seyed Jahanshah Kabudian
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11008605/
Description
Summary: This study addresses the critical need for robust safeguarding of sensitive data stored on personal computing devices and during data transmissions, alongside the increasing need for secure digital interactions. Conventional methodologies for obfuscating data within textual covers exhibit inherent limitations and susceptibility to detection. The primary objective of this investigation is to devise an algorithm that not only ensures secure transmission of information, but also proficiently conceals it from unauthorized access and detection. Using advanced techniques in Natural Language Processing (NLP), Artificial Intelligence (AI), and deep learning within the domain of information security, this study delves into the realm of steganography, revealing the restricted embedding capabilities of conventional language-centric approaches. A comparative analysis pits the newly minted algorithm against contemporaneous approaches, notably cutting-edge neural linguistic steganography (NLS), evaluating their algorithmic capacities in terms of Bits per Word (BpW) and Bits per Character (BpC), along with gauging their security and imperceptibility through metrics like Area Under the Curve (AUC), Equal Error Rate (EER) and Difference of Mean Perplexity (ΔMP). Findings underscore the marked superiority of the proposed steganography algorithm in embedding capacity metrics, while upholding comparable standards of security and imperceptibility compared to other AI-driven statistical (Markov chain-based) and neural (deep learning-based) techniques. Specifically, the StegGPT algorithm showcases a remarkable 44% increase in Word-level capacity criterion (from 2.97 to 4.27) and 53% increase in Character-level capacity criterion (from 0.51 to 0.78) in comparison to its closest competitor, all while maintaining consistent levels of security and imperceptibility.
ISSN: 2169-3536
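
The capacity figures quoted in the summary can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the article. It assumes that BpW is the number of embedded bits divided by the number of whitespace-separated words in the stego text, and that BpC divides by the total character count including spaces; the article may define these differently. The function names are hypothetical.

```python
def bits_per_word(num_bits: int, stego_text: str) -> float:
    """Embedded bits divided by the word count (assumption: whitespace tokenization)."""
    words = stego_text.split()
    if not words:
        raise ValueError("stego_text contains no words")
    return num_bits / len(words)


def bits_per_character(num_bits: int, stego_text: str) -> float:
    """Embedded bits divided by the character count (assumption: spaces are counted)."""
    if not stego_text:
        raise ValueError("stego_text is empty")
    return num_bits / len(stego_text)


def relative_increase_percent(baseline: float, proposed: float) -> float:
    """Percentage increase of `proposed` over `baseline`."""
    return (proposed - baseline) / baseline * 100.0


# Capacity values quoted in the summary: closest competitor vs. StegGPT.
bpw_gain = relative_increase_percent(2.97, 4.27)
bpc_gain = relative_increase_percent(0.51, 0.78)

print(f"BpW gain: {bpw_gain:.1f}%")
print(f"BpC gain: {bpc_gain:.1f}%")
```

Rounded to whole percentages, the two gains agree with the 44% and 53% stated in the summary.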