COM-MABs: From Users' Feedback to Recommendation

Bibliographic Details
Main Authors:	Alexandre Letard, Tassadit Amghar, Olivier Camp, Nicolas Gutowski
Format:	Article
Language:	English
Published:	LibraryPress@UF 2022-05-01
Series:	Proceedings of the International Florida Artificial Intelligence Research Society Conference
Online Access:	https://journals.flvc.org/FLAIRS/article/view/130560

Description
Summary:	Recently, the COMbinatorial Multi-Armed Bandits (COM-MAB) problem has emerged as an active research field. In systems that interact with humans, these reinforcement learning approaches use a feedback strategy as their reward function. To study these strategies, this paper presents three contributions: 1) we model a feedback strategy as a three-step process, where each step influences the performance of an agent; 2) based on this model, we propose a novel Reward Computing process, BUSBC, which significantly increases the global accuracy reached by optimistic COM-MAB algorithms (up to 16.2%); 3) we conduct an empirical analysis of our approach and several feedback strategies from the literature on three real-world application datasets, confirming our propositions.
ISSN:	2334-0754 2334-0762

Similar Items