Depth-Induced Intra-to-Inter Transformer network for stereoscopic image retargeting
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-04-01 |
| Series: | Engineering Science and Technology, an International Journal |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2215098625000849 |
| Summary: | With the advancement of three-dimensional visual applications, stereoscopic image editing technologies have gained widespread popularity in both industry and entertainment. In this paper, we focus on a fundamental stereoscopic image content editing problem, i.e., stereoscopic image retargeting, which aims to adaptively transform stereoscopic images to a specific resolution with a prescribed aspect ratio. Because of the additional binocular information between the left and right views of a stereoscopic image, CNN-based stereoscopic image retargeting methods have obvious limitations in capturing long-range dependencies. To address these issues, we present a depth-induced intra-to-inter Transformer network (DITrans-Net) for stereoscopic image retargeting, which learns long-range dependencies within and between views through an intra-to-inter feature extraction module and aggregates the depth information of the left and right views through a depth-induced feature integration module. Specifically, the intra-to-inter feature extraction module first exploits intra-to-inter Transformer blocks to extract long-range dependency information. The depth-induced feature integration module then employs a disparity attention learning mechanism to learn stereo correspondence and enhance consistency under varying disparity. Finally, a hybrid loss function is applied to improve the retargeting quality. Extensive experiments demonstrate that the proposed DITrans-Net achieves significant improvements and outperforms state-of-the-art methods both quantitatively and qualitatively on various benchmark datasets. |
|---|---|
| ISSN: | 2215-0986 |
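The article itself is behind the link above, so no implementation details are available in this record. As a rough illustration only, the "disparity attention" the summary mentions can be understood as cross-view attention restricted to horizontal scanlines: in a rectified stereo pair, a pixel's correspondence in the other view lies on the same row (the epipolar constraint). The NumPy sketch below shows that idea under stated assumptions; the function name, shapes, and residual fusion are illustrative and are not taken from the paper.

```python
import numpy as np

def disparity_attention(left, right):
    """Illustrative sketch of per-scanline cross-view attention.

    Each left-view pixel attends to all right-view pixels on the same
    row, so the attention weights implicitly encode disparity. Inputs
    are feature maps of shape (height, width, channels); shapes and the
    residual fusion are assumptions, not the paper's actual design.
    """
    scale = left.shape[-1] ** -0.5
    # Per-row similarity between every left column and every right column.
    scores = np.einsum('hwc,hvc->hwv', left, right) * scale
    # Softmax over right-view columns (numerically stabilized).
    scores -= scores.max(axis=-1, keepdims=True)
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    # Disparity-weighted aggregation of right-view features.
    matched = np.einsum('hwv,hvc->hwc', attn, right)
    # Residual fusion back into the left-view features.
    return left + matched

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.standard_normal((8, 32, 16))
    right = rng.standard_normal((8, 32, 16))
    out = disparity_attention(left, right)
    print(out.shape)  # (8, 32, 16)
```

Restricting attention to scanlines keeps the weight matrix at width × width per row rather than (height × width)², which is why epipolar-constrained attention is a common choice in stereo matching and stereo feature-fusion networks.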