Active Offline Policy Selection

A-OPS

A sequential decision-making approach that combines logged data with online interaction to identify the best policy among a set of candidates.
PaperReview
ReinforcementLearning
Author

Chanseok Kang

Published

October 19, 2023