Selective Reviews of Bandit Problems in AI via a Statistical View

Reinforcement Learning (RL) is a widely researched area in artificial intelligence that focuses on teaching agents decision-making through interactions with their environment. A key subset includes multi-armed bandit (MAB) and stochastic continuum-armed bandit (SCAB) problems, which model sequential...

Bibliographic Details
Main Authors: Pengjie Zhou, Haoyu Wei, Huiming Zhang
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Mathematics
Online Access: https://www.mdpi.com/2227-7390/13/4/665