Selective Reviews of Bandit Problems in AI via a Statistical View

Reinforcement Learning (RL) is a widely researched area in artificial intelligence that focuses on teaching agents decision-making through interactions with their environment. A key subset includes multi-armed bandit (MAB) and stochastic continuum-armed bandit (SCAB) problems, which model sequential...

Bibliographic Details
Main Authors:	Pengjie Zhou, Haoyu Wei, Huiming Zhang
Format:	Article
Language:	English
Published:	MDPI AG 2025-02-01
Series:	Mathematics
Subjects:	bandit problems; exploration–exploitation; concentration inequalities; sub-Gaussian; parameter estimation; minimax rate; functional data analysis
Online Access:	https://www.mdpi.com/2227-7390/13/4/665

Similar Items