A Multi-Agent Approach to Modeling Task-Oriented Dialog Policy Learning

Dialogue policy is a critical research area in human-computer interaction, vital for guiding dialogue generation and improving controllability and interpretability. Multi-agent dialogue policy learning offers faster learning and stronger exploration, making it a promising approach for developing more effective and adaptive dialogue agents. However, many studies neglect to holistically model collaboration between agents, which limits the effectiveness of policy learning. This paper therefore proposes a new multi-agent group collaboration mechanism for dialogue policy learning, named GMPL. Concretely, we employ an Actor-Critic network to implement the proposed model, alternately updating individual dialogue agents to optimize policy selection. In each update, the maximum action-value function determines the dialogue action, while the maximum state-value function guides the policy learning process; this keeps decision-making and learning aligned and improves the overall performance of the dialogue agents. We also provide a theoretical analysis of the model's convergence properties. Experiments on two task-oriented dialogue datasets show that the proposed multi-agent model learns significantly faster and achieves a higher dialogue success rate than baseline approaches.
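The mechanism the abstract describes (alternating Actor-Critic updates across a group of agents, greedy selection via the maximum action value, and a learning target built from the maximum state value over the group) can be illustrated in code. The sketch below is a minimal, hypothetical PyTorch rendering under those assumptions only: the network sizes, loss form, and update schedule are illustrative stand-ins, not the published GMPL implementation.

# Minimal sketch of an alternating multi-agent Actor-Critic update.
# Everything below is an illustrative assumption; GMPL's actual networks,
# losses, and schedule are described in the paper, not reproduced here.
import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    """One dialogue agent: an actor head (action logits) and a critic head V(s)."""
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.actor = nn.Linear(hidden, action_dim)  # scores over dialogue actions
        self.critic = nn.Linear(hidden, 1)          # state value V(s)

    def forward(self, state):
        h = self.body(state)
        return self.actor(h), self.critic(h)

def alternating_update(agents, optims, state, next_state, reward, gamma=0.99):
    # Group signal: the maximum next-state value over all agents guides learning.
    with torch.no_grad():
        v_next = torch.stack([agent(next_state)[1] for agent in agents]).max()
    # Agents are updated one at a time ("alternately"), not jointly.
    for agent, optim in zip(agents, optims):
        logits, v = agent(state)
        action = logits.argmax(dim=-1)              # greedy: maximum-scoring action
        log_prob = torch.distributions.Categorical(logits=logits).log_prob(action)
        td_target = reward + gamma * v_next         # group-guided TD target
        advantage = (td_target - v).detach()
        loss = -(log_prob * advantage) + (td_target - v).pow(2)  # actor + critic terms
        optim.zero_grad()
        loss.mean().backward()
        optim.step()

# Hypothetical usage: two agents, with random tensors standing in for
# dialogue-state features and a turn-level reward from a user simulator.
state_dim, action_dim = 16, 8
agents = [ActorCritic(state_dim, action_dim) for _ in range(2)]
optims = [torch.optim.Adam(a.parameters(), lr=1e-3) for a in agents]
s, s_next = torch.randn(state_dim), torch.randn(state_dim)
alternating_update(agents, optims, s, s_next, reward=torch.tensor(1.0))

Computing the target from the group's best next-state value, rather than each agent's own critic, is one plausible reading of "the maximum state value function serves to guide the policy learning process"; the paper itself should be consulted for the exact formulation.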

Bibliographic Details
Main Authors: Songfeng Liang (School of Automotive and Transportation Engineering, Shenzhen Polytechnic University, Shenzhen, China); Kai Xu (School of Software Engineering, South China University of Technology, Guangzhou, China; ORCID: 0000-0001-8933-6704); Zhurong Dong (School of Automotive and Transportation Engineering, Shenzhen Polytechnic University, Shenzhen, China)
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access, Vol. 13, pp. 11754-11764
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2025.3529469
Collection: DOAJ
Subjects: Human-computer interaction; dialogue policy learning; deep reinforcement learning; multi-agent learning
Online Access: https://ieeexplore.ieee.org/document/10840219/