TextSCD: Leveraging Text-based Semantic Guidance for Remote Sensing Image Semantic Change Detection
Semantic change detection (SCD) in remote sensing images aims to identify semantic alterations between bi-temporal images captured at the same geographic location. SCD is extensively applied in fields such as environmental monitoring and disaster assessment. Despite significant advancements in deep l...
| Main Authors: | H. Huang, Q. Cheng, Q. Cheng, D. Zhu, X. Huang, Q. Zhao |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Copernicus Publications, 2025-07-01 |
| Series: | ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences |
| Online Access: | https://isprs-annals.copernicus.org/articles/X-G-2025/383/2025/isprs-annals-X-G-2025-383-2025.pdf |
| _version_ | 1849321094728646656 |
|---|---|
| author | H. Huang, Q. Cheng, Q. Cheng, D. Zhu, X. Huang, Q. Zhao |
| collection | DOAJ |
| description | Semantic change detection (SCD) in remote sensing images aims to identify semantic alterations between bi-temporal images captured at the same geographic location. SCD is extensively applied in fields such as environmental monitoring and disaster assessment. Despite significant advancements in deep learning leading to numerous successful approaches, most existing methods primarily rely on visual representation learning, thereby overlooking the potential benefits of multimodal data. Recently, vision-language models have demonstrated outstanding performance across various downstream tasks. In this paper, we propose a novel framework named TextSCD that leverages text-based semantic information to guide the generation of semantic change maps. Our approach integrates Gemini to generate change descriptions between bi-temporal images and employs a multi-level semantic extraction method to capture features from both images and their corresponding captions. Furthermore, we introduce a semantic text-guided interaction module that facilitates the effective integration of visual and textual features, enhancing multimodal knowledge transfer and the extraction of discriminative features. This design effectively reduces false detections and omissions. We validate the effectiveness of our model on the SECOND dataset, achieving notable improvements in overall accuracy for semantic change detection. |
| id | doaj-art-adce27f05a5845f687086ddbae10e930 |
| institution | Kabale University |
| issn | 2194-9042, 2194-9050 |
| spelling | Record: doaj-art-adce27f05a5845f687086ddbae10e930; indexed 2025-08-20T03:49:50Z; language: eng; publisher: Copernicus Publications; series: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences; ISSN: 2194-9042, 2194-9050; published 2025-07-01; volume X-G-2025, pages 383-389; DOI: 10.5194/isprs-annals-X-G-2025-383-2025; title: TextSCD: Leveraging Text-based Semantic Guidance for Remote Sensing Image Semantic Change Detection; authors: H. Huang (0), Q. Cheng (1), Q. Cheng (2), D. Zhu (3), X. Huang (4), Q. Zhao (5); affiliations: (0) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China; (1) School of Electronics Information and Communications, Huazhong University of Science and Technology, Wuhan, China; (2) Urban Big Data Centre, School of Social and Political Sciences, University of Glasgow, Glasgow, UK; (3) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China; (4) Department of Geosciences, University of Arkansas, Fayetteville, USA; (5) Urban Big Data Centre, School of Social and Political Sciences, University of Glasgow, Glasgow, UK; full text: https://isprs-annals.copernicus.org/articles/X-G-2025/383/2025/isprs-annals-X-G-2025-383-2025.pdf |
| title | TextSCD: Leveraging Text-based Semantic Guidance for Remote Sensing Image Semantic Change Detection |