MDAPT: Multi-Modal Depth Adversarial Prompt Tuning to Enhance the Adversarial Robustness of Visual Language Models

Large visual language models like Contrastive Language-Image Pre-training (CLIP), despite their excellent performance, are highly vulnerable to the influence of adversarial examples. This work investigates the accuracy and robustness of visual language models (VLMs) from a novel multi-modal perspect...

Full description

Saved in:
Bibliographic Details
Main Authors: Chao Li, Yonghao Liao, Caichang Ding, Zhiwei Ye
Format: Article
Language:English
Published: MDPI AG 2025-01-01
Series:Sensors
Subjects:
Online Access:https://www.mdpi.com/1424-8220/25/1/258
Tags: Add Tag
No Tags, Be the first to tag this record!