Thompson Sampling for Non-Stationary Bandit Problems

Non-stationary multi-armed bandit (MAB) problems have recently attracted extensive attention. We focus on the abruptly changing scenario where reward distributions remain constant for a certain period and change at unknown time steps. Although Thompson sampling (TS) has shown success in non-stationa...

Bibliographic Details
Main Authors:	Han Qi, Fei Guo, Li Zhu
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	Entropy
Subjects:	multi-armed bandits; Thompson sampling; non-stationary
Online Access:	https://www.mdpi.com/1099-4300/27/1/51

Similar Items