Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion
Along with the reconstructed frame, the quantization parameter (QP) map, the predicted frame, and the partition frame are widely used as auxiliary inputs for the neural network-based in-loop filter (NNLF). The QP map provides quantization information for the reconstructed frame, while the predicted frame and partition frame characterize the compression artifacts introduced by VTM.
Saved in:
| Main Authors: | Zhen Feng, Xu Liu, Cheolkon Jung |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Versatile video coding; compression artifact; convolutional neural network; dual-branch; fast Fourier transform; in-loop filter |
| Online Access: | https://ieeexplore.ieee.org/document/11016954/ |
| Tags: | |
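The abstract describes grouping the auxiliary inputs by type before feeding the two branches. As a minimal illustrative sketch (not the authors' implementation; the `qp_max` normalization constant and the plain-list frame representation are assumptions made here for self-containment), the grouping can be expressed as:

```python
def make_qp_map(height, width, qp, qp_max=63.0):
    # Constant plane holding the normalized QP value. Normalizing by 63
    # (the maximum VVC QP) is an assumed convention for illustration.
    return [[qp / qp_max] * width for _ in range(height)]

def branch_inputs(recon, qp_map, pred, partition):
    # Group auxiliary inputs by type, as the dual-branch design does:
    # one branch sees quantization information (reconstruction + QP map),
    # the other sees the inputs that describe compression artifacts
    # (reconstruction + predicted frame + partition frame).
    quant_branch = [recon, qp_map]
    artifact_branch = [recon, pred, partition]
    return quant_branch, artifact_branch
```

Keeping the two groups separate, rather than concatenating all four inputs into one tensor, is what the abstract credits with reducing mutual interference between input types.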
| _version_ | 1850218717339713536 |
|---|---|
| author | Zhen Feng, Xu Liu, Cheolkon Jung |
| author_facet | Zhen Feng, Xu Liu, Cheolkon Jung |
| author_sort | Zhen Feng |
| collection | DOAJ |
| description | Along with the reconstructed frame, the quantization parameter (QP) map, the predicted frame, and the partition frame are widely used as auxiliary inputs for the neural network-based in-loop filter (NNLF). The QP map provides quantization information for the reconstructed frame, while the predicted frame and partition frame characterize the compression artifacts introduced by VTM. Since each input carries a different type of information, directly concatenating all of them for the network may cause mutual interference between the input data and degrade model performance. In this paper, we propose a dual-branch NNLF for VVC intra coding using spatial-frequency feature fusion. We design a dual-branch network for NNLF according to input type: one branch takes the reconstructed frame and the QP map, while the other takes the reconstructed frame, the predicted frame, and the partition frame. The dual-branch architecture processes quantization information from the QP map and compression artifacts from the predicted frame and partition frame separately; thus, it handles each input efficiently according to its unique characteristics while reducing mutual interference. Moreover, we adopt the fast Fourier transform (FFT) to capture global context in a frame, instead of a Transformer, which has relatively high complexity. The spatial-frequency feature fusion combines features in the spatial and frequency domains, which enhances feature representation capability and learns both local and long-range feature correlations. Furthermore, we present patch size-considered incremental learning based on QP distance, which combines patch size and QP distance for network training. This training strategy encourages the proposed network to extend its receptive field and learn both local features and global structure in a frame, thus enhancing its adaptability. Experimental results show that the proposed NNLF achieves average BD-rate savings of {8.55% (Y), 20.48% (U), 21.44% (V)} over VTM-11.0_NNVC-3.0 under the All Intra (AI) configuration. |
| format | Article |
| id | doaj-art-48c9d2f3d16f4cbb9bc2e06cb24dedb6 |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-48c9d2f3d16f4cbb9bc2e06cb24dedb62025-08-20T02:07:38ZengIEEEIEEE Access2169-35362025-01-0113989189893010.1109/ACCESS.2025.357383111016954Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature FusionZhen Feng0Xu Liu1https://orcid.org/0009-0006-2774-6082Cheolkon Jung2https://orcid.org/0000-0003-0299-7206School of Electronic Engineering, Xidian University, Xi’an, ChinaSchool of Electronic Engineering, Xidian University, Xi’an, ChinaSchool of Electronic Engineering, Xidian University, Xi’an, ChinaIn company with the reconstructed frame, the quantization parameter (QP) map, the predicted frame, and the partition frame are popularly used as auxiliary input for the neural network-based in-loop filter (NNLF). The QP map provides quantization information for the reconstructed frame, while the predicted frame and partition frame represent compression artifacts by VTM. Since each has different types of information, directly concatenating all these inputs for the network may cause mutual interference between the input data and decrease the model performance. In this paper, we propose a dual-branch NNLF for VVC intra coding using spatial-frequency feature fusion. We design a dual-branch network for NNLF according to the input type: One branch includes reconstructed frames and QP maps, while the other branch includes reconstructed frame, predicted frame, and partition frame. The dual-branch architecture processes quantization information from the QP map and compression artifacts from the predicted frame and partition frame separately. Thus, the dual-branch architecture processes the input data efficiently according to its unique characteristics, while reducing mutual interference. Moreover, we adopt fast Fourier transform (FFT) to capture global context in a frame, instead of Transformer that has relatively high complexity. 
The spatial-frequency feature fusion combines features in spatial and frequency domains, which enhances feature representation capability and learns local and long-range feature correlations. Furthermore, we provide patch size-considered incremental learning based on QP distance that combines patch size and QP distance for network training. The training strategy enforces the proposed network to extend receptive field and learn both local features and global structure in a frame, thus enhancing its model adaptability. Experimental results show that the proposed NNLF achieves average BD-rate savings of {8.55% (Y), 20.48% (U), 21.44% (V)} over VTM-11.0_NNVC-3.0 in All Intra (AI) configuration. https://ieeexplore.ieee.org/document/11016954/ Versatile video coding; compression artifact; convolutional neural network; dual-branch; fast Fourier transform; in-loop filter |
| spellingShingle | Zhen Feng Xu Liu Cheolkon Jung Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion IEEE Access Versatile video coding compression artifact convolutional neural network dual-branch fast Fourier transform in-loop filter |
| title | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_full | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_fullStr | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_full_unstemmed | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_short | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_sort | dual branch neural network based in loop filter for vvc intra coding using spatial frequency feature fusion |
| topic | Versatile video coding, compression artifact, convolutional neural network, dual-branch, fast Fourier transform, in-loop filter |
| url | https://ieeexplore.ieee.org/document/11016954/ |
| work_keys_str_mv | AT zhenfeng dualbranchneuralnetworkbasedinloopfilterforvvcintracodingusingspatialfrequencyfeaturefusion AT xuliu dualbranchneuralnetworkbasedinloopfilterforvvcintracodingusingspatialfrequencyfeaturefusion AT cheolkonjung dualbranchneuralnetworkbasedinloopfilterforvvcintracodingusingspatialfrequencyfeaturefusion |
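The spatial-frequency feature fusion described in the abstract combines a local (spatial) path with a global (frequency-domain) path. The sketch below is a hypothetical 1-D toy version, not the authors' network: it uses a naive DFT in place of an FFT so it stays dependency-free, a fixed 3-tap smoothing kernel standing in for a learned convolution, and hand-supplied element-wise frequency weights standing in for learned parameters; the actual filter operates on 2-D feature maps.

```python
import cmath

def dft(x):
    # Naive discrete Fourier transform (an FFT would be used in practice).
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts)) for k in range(n_pts)]

def idft(spec):
    # Inverse DFT, returning the real part (the toy signals here are real).
    n_pts = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * n / n_pts)
                for k in range(n_pts)).real / n_pts for n in range(n_pts)]

def spatial_branch(x, kernel=(0.25, 0.5, 0.25)):
    # Local path: circular 3-tap convolution models short-range correlations.
    n_pts = len(x)
    return [sum(kernel[j] * x[(n + j - 1) % n_pts] for j in range(3))
            for n in range(n_pts)]

def frequency_branch(x, weights):
    # Global path: element-wise weighting in the frequency domain mixes
    # every sample with every other, giving a global receptive field
    # without the quadratic cost of self-attention.
    return idft([w * v for w, v in zip(weights, dft(x))])

def fuse(x, weights):
    # Additive fusion of local (spatial) and global (frequency) features.
    return [s + f for s, f in
            zip(spatial_branch(x), frequency_branch(x, weights))]
```

With all frequency weights set to 1.0 the global path reduces to the identity, which makes the fusion easy to sanity-check; learned weights would instead reshape the spectrum to suppress compression artifacts.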