Chenlu Ye
Title
Cited by
Year
Corruption-robust algorithms with uncertainty weighting for nonlinear contextual bandits and Markov decision processes
C Ye, W Xiong, Q Gu, T Zhang
International Conference on Machine Learning, 39834-39863, 2023
Cited by 19, 2023
Gibbs sampling from human feedback: A provable KL-constrained framework for RLHF
W Xiong, H Dong, C Ye, H Zhong, N Jiang, T Zhang
arXiv preprint arXiv:2312.11456, 2023
Cited by 13, 2023
Corruption-Robust Offline Reinforcement Learning with General Function Approximation
C Ye, R Yang, Q Gu, T Zhang
Neural Information Processing Systems, 2023
Cited by 5, 2023
Iterative preference learning from human feedback: Bridging theory and practice for RLHF under KL-constraint
W Xiong, H Dong, C Ye, Z Wang, H Zhong, H Ji, N Jiang, T Zhang
ICLR 2024 Workshop on Mathematical and Empirical Understanding of Foundation …, 2023
Cited by 4, 2023
A theoretical analysis of Nash learning from human feedback under general KL-regularized preference
C Ye, W Xiong, Y Zhang, N Jiang, T Zhang
arXiv preprint arXiv:2402.07314, 2024
Cited by 3, 2024
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
C Ye, J He, Q Gu, T Zhang
arXiv preprint arXiv:2402.08991, 2024
Cited by 1, 2024
Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning
Y Lin, C Liu, C Ye, Q Lian, Y Yao, T Zhang
arXiv preprint arXiv:2309.02476, 2023
Cited by 1, 2023
Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks
J Fan, Z Wang, Z Yang, C Ye
arXiv preprint arXiv:2311.13180, 2023
2023