Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion
Along with the reconstructed frame, the quantization parameter (QP) map, the predicted frame, and the partition frame are widely used as auxiliary inputs for the neural network-based in-loop filter (NNLF). The QP map provides quantization information for the reconstructed frame, while the predicted frame and partition frame characterize the compression artifacts introduced by VTM.
Saved in:
| Main Authors: | Zhen Feng, Xu Liu, Cheolkon Jung |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Versatile video coding; compression artifact; convolutional neural network; dual-branch; fast Fourier transform; in-loop filter |
| Online Access: | https://ieeexplore.ieee.org/document/11016954/ |
| Tags: | |
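The abstract describes grouping the auxiliary inputs by type before feeding the two branches. As a minimal illustrative sketch (not the authors' implementation; the `qp_max` normalization constant and the plain-list frame representation are assumptions made here for self-containment), the grouping can be expressed as:

```python
def make_qp_map(height, width, qp, qp_max=63.0):
    # Constant plane holding the normalized QP value. Normalizing by 63
    # (the maximum VVC QP) is an assumed convention for illustration.
    return [[qp / qp_max] * width for _ in range(height)]

def branch_inputs(recon, qp_map, pred, partition):
    # Group auxiliary inputs by type, as the dual-branch design does:
    # one branch sees quantization information (reconstruction + QP map),
    # the other sees the inputs that describe compression artifacts
    # (reconstruction + predicted frame + partition frame).
    quant_branch = [recon, qp_map]
    artifact_branch = [recon, pred, partition]
    return quant_branch, artifact_branch
```

Keeping the two groups separate, rather than concatenating all four inputs into one tensor, is what the abstract credits with reducing mutual interference between input types.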
| _version_ | 1850218717339713536 |
|---|---|
| author | Zhen Feng, Xu Liu, Cheolkon Jung |
| author_facet | Zhen Feng, Xu Liu, Cheolkon Jung |
| author_sort | Zhen Feng |
| collection | DOAJ |
| description | Along with the reconstructed frame, the quantization parameter (QP) map, the predicted frame, and the partition frame are widely used as auxiliary inputs for the neural network-based in-loop filter (NNLF). The QP map provides quantization information for the reconstructed frame, while the predicted frame and partition frame characterize the compression artifacts introduced by VTM. Since each input carries a different type of information, directly concatenating all of them for the network may cause mutual interference between the input data and degrade model performance. In this paper, we propose a dual-branch NNLF for VVC intra coding using spatial-frequency feature fusion. We design a dual-branch network for NNLF according to input type: one branch takes the reconstructed frame and the QP map, while the other takes the reconstructed frame, the predicted frame, and the partition frame. The dual-branch architecture processes quantization information from the QP map and compression artifacts from the predicted frame and partition frame separately; thus, it handles each input efficiently according to its unique characteristics while reducing mutual interference. Moreover, we adopt the fast Fourier transform (FFT) to capture global context in a frame, instead of a Transformer, which has relatively high complexity. The spatial-frequency feature fusion combines features in the spatial and frequency domains, which enhances feature representation capability and learns both local and long-range feature correlations. Furthermore, we present patch size-considered incremental learning based on QP distance, which combines patch size and QP distance for network training. This training strategy encourages the proposed network to extend its receptive field and learn both local features and global structure in a frame, thus enhancing its adaptability. Experimental results show that the proposed NNLF achieves average BD-rate savings of {8.55% (Y), 20.48% (U), 21.44% (V)} over VTM-11.0_NNVC-3.0 under the All Intra (AI) configuration. |
| format | Article |
| id | doaj-art-48c9d2f3d16f4cbb9bc2e06cb24dedb6 |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-48c9d2f3d16f4cbb9bc2e06cb24dedb62025-08-20T02:07:38ZengIEEEIEEE Access2169-35362025-01-0113989189893010.1109/ACCESS.2025.357383111016954Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature FusionZhen Feng0Xu Liu1https://orcid.org/0009-0006-2774-6082Cheolkon Jung2https://orcid.org/0000-0003-0299-7206School of Electronic Engineering, Xidian University, Xi’an, ChinaSchool of Electronic Engineering, Xidian University, Xi’an, ChinaSchool of Electronic Engineering, Xidian University, Xi’an, ChinaIn company with the reconstructed frame, the quantization parameter (QP) map, the predicted frame, and the partition frame are popularly used as auxiliary input for the neural network-based in-loop filter (NNLF). The QP map provides quantization information for the reconstructed frame, while the predicted frame and partition frame represent compression artifacts by VTM. Since each has different types of information, directly concatenating all these inputs for the network may cause mutual interference between the input data and decrease the model performance. In this paper, we propose a dual-branch NNLF for VVC intra coding using spatial-frequency feature fusion. We design a dual-branch network for NNLF according to the input type: One branch includes reconstructed frames and QP maps, while the other branch includes reconstructed frame, predicted frame, and partition frame. The dual-branch architecture processes quantization information from the QP map and compression artifacts from the predicted frame and partition frame separately. Thus, the dual-branch architecture processes the input data efficiently according to its unique characteristics, while reducing mutual interference. Moreover, we adopt fast Fourier transform (FFT) to capture global context in a frame, instead of Transformer that has relatively high complexity. 
The spatial-frequency feature fusion combines features in spatial and frequency domains, which enhances feature representation capability and learns local and long-range feature correlations. Furthermore, we provide patch size-considered incremental learning based on QP distance that combines patch size and QP distance for network training. The training strategy enforces the proposed network to extend receptive field and learn both local features and global structure in a frame, thus enhancing its model adaptability. Experimental results show that the proposed NNLF achieves average BD-rate savings of {8.55% (Y), 20.48% (U), 21.44% (V)} over VTM-11.0_NNVC-3.0 in All Intra (AI) configuration. https://ieeexplore.ieee.org/document/11016954/ Versatile video coding; compression artifact; convolutional neural network; dual-branch; fast Fourier transform; in-loop filter |
| spellingShingle | Zhen Feng Xu Liu Cheolkon Jung Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion IEEE Access Versatile video coding compression artifact convolutional neural network dual-branch fast Fourier transform in-loop filter |
| title | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_full | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_fullStr | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_full_unstemmed | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_short | Dual-Branch Neural Network-Based In-Loop Filter for VVC Intra Coding Using Spatial-Frequency Feature Fusion |
| title_sort | dual branch neural network based in loop filter for vvc intra coding using spatial frequency feature fusion |
| topic | Versatile video coding, compression artifact, convolutional neural network, dual-branch, fast Fourier transform, in-loop filter |
| url | https://ieeexplore.ieee.org/document/11016954/ |
| work_keys_str_mv | AT zhenfeng dualbranchneuralnetworkbasedinloopfilterforvvcintracodingusingspatialfrequencyfeaturefusion AT xuliu dualbranchneuralnetworkbasedinloopfilterforvvcintracodingusingspatialfrequencyfeaturefusion AT cheolkonjung dualbranchneuralnetworkbasedinloopfilterforvvcintracodingusingspatialfrequencyfeaturefusion |
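The spatial-frequency feature fusion described in the abstract combines a local (spatial) path with a global (frequency-domain) path. The sketch below is a hypothetical 1-D toy version, not the authors' network: it uses a naive DFT in place of an FFT so it stays dependency-free, a fixed 3-tap smoothing kernel standing in for a learned convolution, and hand-supplied element-wise frequency weights standing in for learned parameters; the actual filter operates on 2-D feature maps.

```python
import cmath

def dft(x):
    # Naive discrete Fourier transform (an FFT would be used in practice).
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts)) for k in range(n_pts)]

def idft(spec):
    # Inverse DFT, returning the real part (the toy signals here are real).
    n_pts = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * n / n_pts)
                for k in range(n_pts)).real / n_pts for n in range(n_pts)]

def spatial_branch(x, kernel=(0.25, 0.5, 0.25)):
    # Local path: circular 3-tap convolution models short-range correlations.
    n_pts = len(x)
    return [sum(kernel[j] * x[(n + j - 1) % n_pts] for j in range(3))
            for n in range(n_pts)]

def frequency_branch(x, weights):
    # Global path: element-wise weighting in the frequency domain mixes
    # every sample with every other, giving a global receptive field
    # without the quadratic cost of self-attention.
    return idft([w * v for w, v in zip(weights, dft(x))])

def fuse(x, weights):
    # Additive fusion of local (spatial) and global (frequency) features.
    return [s + f for s, f in
            zip(spatial_branch(x), frequency_branch(x, weights))]
```

With all frequency weights set to 1.0 the global path reduces to the identity, which makes the fusion easy to sanity-check; learned weights would instead reshape the spectrum to suppress compression artifacts.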