Effective Off-Policy Evaluation and Learning in Contextual Combinatorial Bandits
Published in Proceedings of the 18th ACM Conference on Recommender Systems, 2024
Recommended citation: Shimizu, Tatsuhiro, Koichi Tanaka, Ren Kishimoto, Haruka Kiyohara, Masahiro Nomura, and Yuta Saito. "Effective Off-Policy Evaluation and Learning in Contextual Combinatorial Bandits." In Proceedings of the 18th ACM Conference on Recommender Systems, pp. 733-741. 2024. http://tatsu432.github.io/files/OPCB.pdf