ALBERTA: ALgorithm-Based Error Resilience in Transformer Architectures

Vision Transformers are increasingly deployed in safety-critical applications that demand high reliability. These models must execute correctly on GPUs despite the potential for transient hardware errors. We propose a novel algorithm-based resilience framework called AL...

Bibliographic Details
Main Authors: Haoxuan Liu, Vasu Singh, Michal Filipiuk, Siva Kumar Sastry Hari
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Open Journal of the Computer Society
Online Access: https://ieeexplore.ieee.org/document/10530530/