SGD-TripleQNet: An Integrated Deep Reinforcement Learning Model for Vehicle Lane-Change Decision


Bibliographic Details
Main Authors: Yang Liu, Tianxing Yang, Liwei Tian, Jianbiao Pei
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Mathematics
Online Access: https://www.mdpi.com/2227-7390/13/2/235
Description
Summary: With the advancement of autonomous driving technology, vehicle lane-change decision (LCD) has become a critical issue for improving driving safety and efficiency. Traditional deep reinforcement learning (DRL) methods face challenges such as slow convergence, unstable decisions, and low accuracy in complex traffic environments. To address these issues, this paper proposes a novel integrated deep reinforcement learning model, "SGD-TripleQNet", for autonomous vehicle lane-change decision-making. The method integrates three deep Q-learning networks (DQN, DDQN, and Dueling DDQN) and uses the Stochastic Gradient Descent (SGD) optimization algorithm to dynamically adjust the network weights, fine-tuning them based on gradient information to minimize the target loss function. The approach effectively addresses key challenges in autonomous driving lane-change decisions: slow convergence, low accuracy, and unstable decision-making. Experiments show that SGD-TripleQNet has significant advantages over the single models: in convergence speed, it is approximately 25% faster than DQN, DDQN, and Dueling DDQN, reaching stability within 150 epochs; in decision stability, Q-value fluctuations are reduced by about 40% in the later stages of training; and in final performance, its average reward exceeds that of DQN (by 6.85%), DDQN (by 6.86%), and Dueling DDQN (by 6.57%), confirming the effectiveness of the proposed method. It also provides a theoretical foundation and practical guidance for the design and optimization of future autonomous driving systems.
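The core idea described in the abstract, combining the Q-value outputs of three sub-networks through weights that SGD tunes to minimize the target (TD) loss, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the sub-network Q-values, TD target, learning rate, and iteration count below are assumed stand-in numbers, and softmax-normalized weights are one plausible way to keep the ensemble a convex combination.

```python
import numpy as np

# Hypothetical Q-value outputs of the three sub-networks (DQN, DDQN,
# Dueling DDQN) for one state-action pair; in the paper these come from
# trained networks, here they are fixed numbers for illustration.
q_values = np.array([1.8, 2.4, 2.1])

target = 2.3   # assumed TD target y = r + gamma * max_a' Q(s', a')
lr = 1.0       # illustrative SGD learning rate

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Parameterize the ensemble weights by logits and normalize with softmax,
# so the weights always stay positive and sum to one.
logits = np.zeros(3)
for _ in range(500):
    w = softmax(logits)
    q_ens = w @ q_values                  # weighted ensemble Q-value
    dq = q_ens - target                   # dL/dq_ens for L = 0.5*(q_ens - y)^2
    # Chain rule through the softmax: dL/dlogits_i = dq * w_i * (q_i - q_ens)
    dlogits = dq * w * (q_values - q_ens)
    logits -= lr * dlogits                # SGD step on the mixing weights

w = softmax(logits)
print(np.round(w, 3), round(float(w @ q_values), 3))
```

After training, the weighted ensemble Q-value sits close to the TD target, with the weights shifted toward the sub-network whose estimate is nearest to it; the same gradient flows through the network parameters themselves in the full method.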
ISSN: 2227-7390