Zhang Zihan
Title
Cited by
Year
Almost optimal model-free reinforcement learning via reference-advantage decomposition
Z Zhang, Y Zhou, X Ji
Advances in Neural Information Processing Systems 33, 15198-15207, 2020
87
2020
Is reinforcement learning more difficult than bandits? A near-optimal algorithm escaping the curse of horizon
Z Zhang, X Ji, S Du
Conference on Learning Theory, 4528-4531, 2021
56
2021
Regret minimization for reinforcement learning by evaluating the optimal bias function
Z Zhang, X Ji
Advances in Neural Information Processing Systems 32, 2019
47
2019
Near optimal reward-free reinforcement learning
Z Zhang, S Du, X Ji
International Conference on Machine Learning, 12402-12412, 2021
25*
2021
Improved variance-aware confidence sets for linear bandits and linear mixture MDP
Z Zhang, J Yang, X Ji, SS Du
Advances in Neural Information Processing Systems 34, 4342-4355, 2021
22*
2021
Model-free reinforcement learning: from clipped pseudo-regret to sample complexity
Z Zhang, Y Zhou, X Ji
International Conference on Machine Learning, 12653-12662, 2021
20
2021
Horizon-free reinforcement learning in polynomial time: the power of stationary policies
Z Zhang, X Ji, S Du
Conference on Learning Theory, 3858-3904, 2022
1
2022
Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits
Z Zhang, X Ji, Y Zhou
arXiv preprint arXiv:2110.08057, 2021
2021