Optimistic policy optimization with bandit feedback

Yonathan Efroni*, Lior Shani*, Aviv Rosenberg, Shie Mannor

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
