Progressive Self-Prompting Segment Anything Model for Salient Object Detection in Optical Remote Sensing Images

With the continuous advancement of deep neural networks, salient object detection (SOD) in natural images has made significant progress. However, SOD in optical remote sensing images (ORSI-SOD) remains a challenging task due to the diversity of objects and the complexity of backgrounds. The primary...

Bibliographic Details
Main Authors:	Xiaoning Zhang, Yi Yu, Daqun Li, Yuqing Wang
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	Remote Sensing
Subjects:	salient object detection; optical remote sensing images; Segment Anything Model; domain-specific prompting module; progressive self-prompting decoder module; parameter-efficient fine-tuning
Online Access:	https://www.mdpi.com/2072-4292/17/2/342

author	Xiaoning Zhang, Yi Yu, Daqun Li, Yuqing Wang
collection	DOAJ
description	With the continuous advancement of deep neural networks, salient object detection (SOD) in natural images has made significant progress. However, SOD in optical remote sensing images (ORSI-SOD) remains a challenging task due to the diversity of objects and the complexity of backgrounds. The primary challenge lies in generating robust features that can effectively integrate both global semantic information for salient object localization and local spatial details for boundary reconstruction. Most existing ORSI-SOD methods rely on pre-trained CNN- or Transformer-based backbones to extract features from ORSIs, followed by multi-level feature aggregation. Given the significant differences between ORSIs and the natural images used in pre-training, the generalization capability of these backbone networks is often limited, resulting in suboptimal performance. Recently, prompt engineering has been employed to enhance the generalization ability of networks in the Segment Anything Model (SAM), an emerging vision foundation model that has achieved remarkable success across various tasks. Despite its success, directly applying the SAM to ORSI-SOD without prompts from manual interaction remains unsatisfactory. In this paper, we propose a novel progressive self-prompting model based on the SAM, termed PSP-SAM, which generates both internal and external prompts to enhance the network and overcome the limitations of SAM in ORSI-SOD. Specifically, domain-specific prompting modules, consisting of both block-shared and block-specific adapters, are integrated into the network to learn domain-specific visual prompts within the backbone, facilitating its adaptation to ORSI-SOD. Furthermore, we introduce a progressive self-prompting decoder module that performs prompt-guided multi-level feature integration and generates stage-wise mask prompts progressively, enabling the prompt-based mask decoders outside the backbone to predict saliency maps in a coarse-to-fine manner. The entire network is trained end-to-end with parameter-efficient fine-tuning. Extensive experiments on three benchmark ORSI-SOD datasets demonstrate that our proposed network achieves state-of-the-art performance.
format	Article
id	doaj-art-3ef7cd6f267e4602879f351105e01166
institution	Kabale University
issn	2072-4292
language	English
publishDate	2025-01-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj-art-3ef7cd6f267e4602879f351105e01166; 2025-01-24T13:48:11Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2025-01-01; vol. 17, no. 2, art. 342; DOI 10.3390/rs17020342; Progressive Self-Prompting Segment Anything Model for Salient Object Detection in Optical Remote Sensing Images; Xiaoning Zhang, Yi Yu, Daqun Li, Yuqing Wang (all authors: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China); https://www.mdpi.com/2072-4292/17/2/342
title	Progressive Self-Prompting Segment Anything Model for Salient Object Detection in Optical Remote Sensing Images
topic	salient object detection; optical remote sensing images; Segment Anything Model; domain-specific prompting module; progressive self-prompting decoder module; parameter-efficient fine-tuning
url	https://www.mdpi.com/2072-4292/17/2/342

Similar Items