Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion


Bibliographic Details
Main Authors: Zhen Feng, Xu Liu, Cheolkon Jung
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects: Versatile video coding; compression artifact; convolutional neural network; dual-branch; fast Fourier transform; in-loop filter
Online Access:https://ieeexplore.ieee.org/document/11016954/
_version_ 1850218717339713536
author Zhen Feng
Xu Liu
Cheolkon Jung
author_facet Zhen Feng
Xu Liu
Cheolkon Jung
author_sort Zhen Feng
collection DOAJ
description Along with the reconstructed frame, the quantization parameter (QP) map, the predicted frame, and the partition frame are widely used as auxiliary inputs to neural network-based in-loop filters (NNLF). The QP map provides quantization information for the reconstructed frame, while the predicted frame and partition frame represent compression artifacts produced by VTM. Since each input carries a different type of information, directly concatenating all of them may cause mutual interference between the input data and degrade model performance. In this paper, we propose a dual-branch NNLF for VVC intra coding using spatial-frequency feature fusion. We design a dual-branch network for NNLF according to input type: one branch takes the reconstructed frame and the QP map, while the other takes the reconstructed frame, the predicted frame, and the partition frame. The dual-branch architecture processes quantization information from the QP map and compression artifacts from the predicted frame and partition frame separately; thus, it handles each input efficiently according to its unique characteristics while reducing mutual interference. Moreover, we adopt the fast Fourier transform (FFT) to capture global context in a frame, instead of a Transformer, which has relatively high complexity. The spatial-frequency feature fusion combines features in the spatial and frequency domains, which enhances feature representation capability and learns both local and long-range feature correlations. Furthermore, we provide patch size-considered incremental learning based on QP distance, which combines patch size and QP distance for network training. This training strategy encourages the proposed network to extend its receptive field and learn both local features and global structure in a frame, thus enhancing its adaptability.
Experimental results show that the proposed NNLF achieves average BD-rate savings of {8.55% (Y), 20.48% (U), 21.44% (V)} over VTM-11.0_NNVC-3.0 in the All Intra (AI) configuration.
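The spatial-frequency fusion idea in the abstract can be illustrated with a minimal NumPy sketch. This is a hypothetical toy version, not the authors' implementation: a spatial branch applies a local 3x3 filter (short-range correlations), a frequency branch applies a pointwise gain on the full-frame FFT spectrum (global context), and the two outputs are fused by addition. The function names, the mean-filter kernel, and the identity spectral gain are illustrative assumptions; in the paper's network, both would be learned.

```python
import numpy as np

def spatial_branch(x, kernel):
    """Local 3x3 filtering: captures short-range spatial correlations."""
    h, w = x.shape
    pad = np.pad(x, 1, mode="edge")
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(pad[i:i + 3, j:j + 3] * kernel)
    return out

def frequency_branch(x, gain):
    """Global context via FFT: a pointwise gain on the full-frame spectrum
    mixes information across the entire frame in a single step."""
    spec = np.fft.fft2(x)
    return np.real(np.fft.ifft2(spec * gain))

def fuse(x, kernel, gain):
    """Spatial-frequency feature fusion by simple elementwise addition."""
    return spatial_branch(x, kernel) + frequency_branch(x, gain)

rng = np.random.default_rng(0)
frame = rng.standard_normal((16, 16))      # stand-in for a reconstructed patch
kernel = np.full((3, 3), 1.0 / 9.0)        # mean filter as a stand-in for a learned conv
gain = np.ones((16, 16))                   # identity spectral gain as a stand-in
fused = fuse(frame, kernel, gain)
```

With the identity gain, the frequency branch returns the input unchanged, so the fusion reduces to input plus its locally filtered version; a learned, non-uniform gain would instead reweight individual frequencies across the whole frame, which is what gives the frequency branch its long-range reach without Transformer-level complexity.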
format Article
id doaj-art-48c9d2f3d16f4cbb9bc2e06cb24dedb6
institution OA Journals
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-48c9d2f3d16f4cbb9bc2e06cb24dedb6 | 2025-08-20T02:07:38Z | eng | IEEE | IEEE Access | ISSN 2169-3536 | 2025-01-01 | vol. 13, pp. 98918-98930 | doi:10.1109/ACCESS.2025.3573831 | article 11016954 | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion | Zhen Feng; Xu Liu (https://orcid.org/0009-0006-2774-6082); Cheolkon Jung (https://orcid.org/0000-0003-0299-7206) | School of Electronic Engineering, Xidian University, Xi’an, China | https://ieeexplore.ieee.org/document/11016954/ | Versatile video coding; compression artifact; convolutional neural network; dual-branch; fast Fourier transform; in-loop filter
spellingShingle Zhen Feng
Xu Liu
Cheolkon Jung
Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion
IEEE Access
Versatile video coding
compression artifact
convolutional neural network
dual-branch
fast Fourier transform
in-loop filter
title Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion
title_full Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion
title_fullStr Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion
title_full_unstemmed Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion
title_short Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion
title_sort dual branch neural network based in loop filter for vvc intra coding using spatial frequency feature fusion
topic Versatile video coding
compression artifact
convolutional neural network
dual-branch
fast Fourier transform
in-loop filter
url https://ieeexplore.ieee.org/document/11016954/
work_keys_str_mv AT zhenfeng dualbranchneuralnetworkbasedinloopfilterforvvcintracodingusingspatialfrequencyfeaturefusion
AT xuliu dualbranchneuralnetworkbasedinloopfilterforvvcintracodingusingspatialfrequencyfeaturefusion
AT cheolkonjung dualbranchneuralnetworkbasedinloopfilterforvvcintracodingusingspatialfrequencyfeaturefusion