Practical Implementation of Federated Learning for Detecting Backdoor Attacks in a Next-word Prediction Model

Bibliographic Details
Main Authors:	Jimmy K. W. Wong, Ki Ki Chung, Yuen Wing Lo, Chun Yin Lai, Steve W. Y. Mung
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-01-01
Series:	Scientific Reports
Online Access:	https://doi.org/10.1038/s41598-024-82079-2

Description
Summary:	This article details the development of a next-word prediction model utilizing federated learning and introduces a mechanism for detecting backdoor attacks. Federated learning enables multiple devices to collaboratively train a shared model while retaining data locally. However, this decentralized approach is susceptible to manipulation by malicious actors who control a subset of participating devices, thereby biasing the model’s outputs on specific topics, such as a presidential election. The proposed detection mechanism aims to identify and exclude devices with anomalous datasets from the training process, thereby mitigating the influence of such attacks. By using the example of a presidential election, the study demonstrates a positive correlation between the proportion of compromised devices and the degree of bias in the model’s outputs. The findings indicate that the detection mechanism effectively reduces the impact of backdoor attacks, particularly when the number of compromised devices is relatively low. This research contributes to enhancing the robustness of federated learning systems against malicious manipulation, ensuring more reliable and unbiased model performance.
ISSN:	2045-2322
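
Illustrative sketch (not taken from the article): the summary describes excluding devices with anomalous datasets before their contributions reach the shared model. The record does not say how the authors score or exclude devices, so everything below is an assumption made for illustration: each device is assumed to report a single hypothetical anomaly score for its local data, outliers are flagged with a median-based modified z-score, and the remaining updates are combined by a sample-weighted average. The names modified_z_filter, federated_average, and robust_round are hypothetical.

```python
from statistics import median


def modified_z_filter(scores, threshold=3.5):
    """Return indices of devices whose score is not an outlier.

    Assumption: one scalar anomaly score per device. The threshold of 3.5
    is a common rule of thumb, not a value from the article.
    """
    med = median(scores)
    mad = median(abs(s - med) for s in scores)
    if mad == 0:
        # Spread cannot be estimated; this sketch keeps every device.
        return list(range(len(scores)))
    return [
        i for i, s in enumerate(scores)
        if 0.6745 * abs(s - med) / mad <= threshold
    ]


def federated_average(updates, sizes):
    """Average model updates, weighting each by its local sample count."""
    total = sum(sizes)
    dim = len(updates[0])
    return [
        sum(u[j] * n for u, n in zip(updates, sizes)) / total
        for j in range(dim)
    ]


def robust_round(updates, sizes, scores, threshold=3.5):
    """One aggregation round that skips devices flagged as anomalous."""
    kept = modified_z_filter(scores, threshold)
    averaged = federated_average(
        [updates[i] for i in kept],
        [sizes[i] for i in kept],
    )
    return averaged, kept


if __name__ == "__main__":
    # Four ordinary devices and one whose data looks very different.
    updates = [[1.0, 0.0]] * 4 + [[-10.0, 5.0]]
    sizes = [10, 10, 10, 10, 10]
    scores = [0.01, 0.02, 0.015, 0.012, 0.40]
    averaged, kept = robust_round(updates, sizes, scores)
    print("kept devices:", kept)
    print("aggregated update:", averaged)
```

A median-based filter of this kind only works while anomalous devices are a minority, which fits the summary's remark that detection is most effective when relatively few devices are compromised.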
