Science Cast

Symphony of experts: orchestration with adversarial insights in reinforcement learning

Matthieu Jonckheere, October 26, 2023 7:11am

Voice is AI-generated

Connected to paper

This paper is a preprint and has not been certified by peer review.

Symphony of experts: orchestration with adversarial insights in reinforcement learning

arXiv | PDF | October 25, 2023 12:00am

Authors

Matthieu Jonckheere (LAAS), Chiara Mignacco (LMO, CELESTE), Gilles Stoltz (LMO, CELESTE)

Abstract

Structured reinforcement learning leverages policies with advantageous properties to reach better performance, particularly in scenarios where exploration poses challenges. We explore this field through the concept of orchestration, where a (small) set of expert policies guides decision-making; the modeling thereof constitutes our first contribution. We then establish value-function regret bounds for orchestration in the tabular setting by transferring regret-bound results from adversarial settings. We generalize and extend the analysis of natural policy gradient in Agarwal et al. [2021, Section 5.3] to arbitrary adversarial aggregation strategies. We also extend it to the case of estimated advantage functions, providing insights into sample complexity both in expectation and with high probability. A key point of our approach lies in its arguably more transparent proofs compared to existing methods. Finally, we present simulations for a stochastic matching toy model.
