Multi-level representation learning via ConvNeXt-based network for unaligned cross-view matching

Cross-view matching refers to the use of images from different platforms (e.g. drone and satellite views) to retrieve the most relevant images, where the key difficulty is that viewpoints and spatial resolution differ between platforms. However, most existing methods focus on extracting fine-grained features and ignore the...


Bibliographic Details
Main Authors:	Fangli Guan, Nan Zhao, Zhixiang Fang, Ling Jiang, Jianhui Zhang, Yue Yu, Haosheng Huang
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2025-01-01
Series:	Geo-spatial Information Science
Subjects:	Cross-view matching ConvNeXt satellite view drone view multilevel feature
Online Access:	https://www.tandfonline.com/doi/10.1080/10095020.2024.2439385

_version_	1832583382517678080
author	Fangli Guan Nan Zhao Zhixiang Fang Ling Jiang Jianhui Zhang Yue Yu Haosheng Huang
collection	DOAJ
description	Cross-view matching refers to the use of images from different platforms (e.g. drone and satellite views) to retrieve the most relevant images, where the key difficulty is that viewpoints and spatial resolution differ between platforms. However, most existing methods focus on extracting fine-grained features and ignore the connection of contextual information in the image. Therefore, we propose a novel ConvNeXt-based multi-level representation learning model for this task. First, we extract global features with the ConvNeXt model. To obtain a joint part-based representation from the global features, we then replicate them, processing one copy with spatial attention and the other with a standard convolutional operation. In addition, the features of the different branches are aggregated by a multilevel feature fusion module to prepare for cross-view matching. Finally, we design a new hybrid loss function to better constrain these features and to help mine crucial information from the global features. The experimental results indicate that the model achieves advanced performance on two common datasets, University-1652 and SUES-200, with 89.79% and 95.75% in drone target matching and 94.87% and 98.80% in drone navigation.
format	Article
id	doaj-art-e443d2f4417e498199513885deb5be12
institution	Kabale University
issn	1009-5020 1993-5153
language	English
publishDate	2025-01-01
publisher	Taylor & Francis Group
record_format	Article
series	Geo-spatial Information Science
spelling	doaj-art-e443d2f4417e498199513885deb5be12; 2025-01-28T16:12:47Z; eng; Taylor & Francis Group; Geo-spatial Information Science; ISSN 1009-5020, 1993-5153; 2025-01-01; DOI 10.1080/10095020.2024.2439385; Fangli Guan (School of Computer Science, Hangzhou Dianzi University, Hangzhou, China); Nan Zhao (School of Computer Science, Hangzhou Dianzi University, Hangzhou, China); Zhixiang Fang (State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China); Ling Jiang (Anhui Province Key Laboratory of Physical Geographic Environment, Chuzhou University, Chuzhou, China); Jianhui Zhang (School of Computer Science, Hangzhou Dianzi University, Hangzhou, China); Yue Yu (Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong, China); Haosheng Huang (Department of Geography, Ghent University, Ghent, Belgium)
title	Multi-level representation learning via ConvNeXt-based network for unaligned cross-view matching
topic	Cross-view matching ConvNeXt satellite view drone view multilevel feature
url	https://www.tandfonline.com/doi/10.1080/10095020.2024.2439385

Similar Items