-
301
Dual-branch attention network-based stereoscopic video compression
Published 2025-01-01“…First, a Local and Global Encoder-decoder Block (LGEDB) based on Transformer and channel attention was proposed, which accurately captured non-repetitive texture details in local regions and global structural information by integrating pixel-level self-attention within each local area and global attention across channels. …”
Article -
302
A Dual-Stream Dental Panoramic X-Ray Image Segmentation Method Based on Transformer Heterogeneous Feature Complementation
Published 2025-07-01“…Furthermore, a Pooling-Cooperative Convolutional Module was designed, which enhances the model’s capability in detail extraction and boundary localization through weighted centroid features of dental structures and a latent edge extraction module. …”
Article -
303
Fusion of Recurrence Plots and Gramian Angular Fields with Bayesian Optimization for Enhanced Time-Series Classification
Published 2025-07-01“…Time-series classification remains a critical task across various domains, demanding models that effectively capture both local recurrence structures and global temporal dependencies. We introduce a novel framework that transforms time series into image representations by fusing recurrence plots (RPs) with both Gramian Angular Summation Fields (GASFs) and Gramian Angular Difference Fields (GADFs). …”
Article -
304
Research on Sea Treasure Target Detection Technology Based on Improved YOLOv7-Tiny
Published 2025-01-01“…First, based on the YOLOv7-Tiny network, the MAFPN neck structure is used to replace the ELAN structure to achieve multi-scale capture of semantic information about underwater sea treasures and to help the UPA-YOLO model locate these targets accurately; second, the P2ELAN module is constructed and added to the backbone network, using the redundant information in the feature map and dynamically adjusting the convolution kernel to adapt to the lack of data, which reduces the number of model parameters; the MSCA attention mechanism is introduced to suppress the complex and changeable underwater background features and improve the UPA-YOLO model’s semantic feature extraction for underwater targets, and the MPDIoU loss function is added to the improved model before data validation of the detection model is completed; finally, based on the TensorRT acceleration framework, the target detection model is optimised and deployed locally on a Jetson Nano edge device to realise real-time detection of underwater sea treasures. …”
Article -
305
Identification of diabetic retinopathy lesions in fundus images by integrating CNN and vision mamba models.
Published 2025-01-01“…The majority of deep learning techniques developed for medical image analysis rely on convolutional modules to extract the inherent structure of images within a certain local receptive field. …”
Article -
306
An Mcformer encoder integrating Mamba and Cgmlp for improved acoustic feature extraction
Published 2025-07-01“…To address this limitation, the Mcformer encoder is introduced, which incorporates the Mamba module in parallel with multi-head attention blocks to enhance the model’s global context processing capabilities. Additionally, a Convolutional Gated Multilayer Perceptron (Cgmlp) structure is employed to improve the extraction of local features through deep convolutional layers. …”
Article -
307
Efficient Image Super-Resolution With Multi-Branch Mixer Transformer
Published 2025-03-01“…To address these problems, we propose a Multi-Branch Token Mixer (MBTM) to extract richer global and local information. Compared to other Transformer-based SR networks, MBTM achieves a balance between capturing global information and reducing the computational complexity of self-attention through its compact multi-branch structure. …”
Article -
308
Distributed Photovoltaic Short-Term Power Prediction Based on Personalized Federated Multi-Task Learning
Published 2025-04-01“…By improving the parallel pooling structure of a time series convolution network (TCN), an improved time series convolution network (iTCN) prediction model was established, and the channel attention mechanism CBAMANet was added to highlight the key meteorological characteristics’ information and improve the feature extraction ability of time series data in photovoltaic power prediction. …”
Article -
309
Improved Asynchronous Federated Learning for Data Injection Pollution
Published 2025-05-01“…In our approach, the residual network is used to extract the static information of the image, the capsule network is used to extract the spatial dependence among the internal structures of the image, several layers of convolution are used to reduce the dimensions of both features, and the two extracted features are fused. …”
Article -
310
Bearing fault diagnosis based on efficient cross space multiscale CNN transformer parallelism
Published 2025-04-01“…Subsequently, parallel branches are employed to extract spatio-temporal features: the Convolutional Neural Network (CNN) branch integrates a multiscale feature extraction module, a Reversed Residual Structure (RRS), and an Efficient Multiscale Attention (EMA) mechanism to enhance local and global feature extraction capabilities; the Transformer branch combines Bidirectional Gated Recurrent Units (BiGRU) and Transformer to capture both local temporal dynamics and long-term dependencies. …”
Article -
311
Infrared object detection for robot vision based on multiple focus diffusion and task interaction alignment
Published 2025-07-01“…The feature extraction module adopts a dual-stream fusion structure in the backbone network, which combines the local feature extraction of CNN with the global feature modeling of transformer. …”
Article -
312
AfaMamba: Adaptive Feature Aggregation With Visual State Space Model for Remote Sensing Images Semantic Segmentation
Published 2025-01-01“…It employs a lightweight ResNet18 as the encoder, and during the decoding phase, it first utilizes a multiscale feature adaptive aggregation module to ensure that the output features from each stage of the encoder contain rich multiscale semantic information. Subsequently, the global-local Mamba structure combines the attention-optimized multiscale convolutional branches with the global branch of Mamba to facilitate effective interaction between global and local features. …”
Article -
313
Vision Mamba and xLSTM-UNet for medical image segmentation
Published 2025-03-01“…Deep learning-based medical image segmentation methods are generally divided into convolutional neural networks (CNNs) and Transformer-based models. …”
Article -
314
YOLO-HVS: Infrared Small Target Detection Inspired by the Human Visual System
Published 2025-07-01“…Meanwhile, the C2f_DWR (dilation-wise residual) module with a regional-semantic dual residual structure is designed to significantly improve the efficiency of capturing multi-scale contextual information through dilated convolution and a two-step feature extraction mechanism. …”
Article -
315
Fine-Grained Extraction of Coastal Aquaculture Ponds From Remote Sensing Images Using an Edge-Supervised Multi-task Neural Network
Published 2025-01-01“…It notably enhances performance in complex environments and significantly boosts generalization capabilities by learning global structural features. First, a shared encoder–decoder architecture was constructed, leveraging large kernel depthwise separable convolution and residual optimization, thereby enhancing both local and global feature representations. …”
Article -
316
A small object detection model in aerial images based on CPDD-YOLOv8
Published 2025-01-01“…Thirdly, a new DSC2f structure is proposed, which uses Dynamic Snake Convolution (DSConv) to take the place of the first standard Conv of Bottleneck in the C2f structure, so that the model can adapt to different inputs more effectively. …”
Article -
317
A lightweight high-frequency mamba network for image super-resolution
Published 2025-07-01“…Various methods based on convolutional neural network (CNN) and Transformer structures have emerged, but few studies have mentioned how to combine these two parts of information. …”
Article -
318
StomaYOLO: A Lightweight Maize Phenotypic Stomatal Cell Detector Based on Multi-Task Training
Published 2025-07-01“…Maize (Zea mays L.), a vital global food crop, relies on its stomatal structure for regulating photosynthesis and responding to drought. …”
Article -
319
ST-AGRNN: A Spatio-Temporal Attention-Gated Recurrent Neural Network for Traffic State Forecasting
Published 2022-01-01“…In the proposed model, structure-based and location-based localized spatial features are obtained simultaneously by Graph Convolutional Networks (GCNs) and DeepWalk. …”
Article -
320
Bitemporal Remote Sensing Change Detection With State-Space Models
Published 2025-01-01“…Change detection in very-high-resolution remote sensing images has gained significant attention, particularly with the rise of deep learning techniques such as convolutional neural networks and Transformers. The Mamba structure, successful in computer vision, has been applied to this domain, enhancing computational efficiency. …”
Article