Research

Papers and Preprints

A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference
Chenlu Ye*, Wei Xiong*, Yuheng Zhang*, Hanze Dong*, Nan Jiang, Tong Zhang, NeurIPS 2024.
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
Chenlu Ye*, Jiafan He*, Quanquan Gu, Tong Zhang, ICML 2024. An analysis of uncertainty-aware algorithms in the model-based framework under adversarial corruption and general function approximation.
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint
Wei Xiong*, Hanze Dong*, Chenlu Ye*, Han Zhong, Nan Jiang, Tong Zhang, ICML 2024.
Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks
Jianqing Fan*, Zhaoran Wang*, Zhuoran Yang*, Chenlu Ye*, Preprint.
Corruption-Robust Offline Reinforcement Learning with General Function Approximation
Chenlu Ye*, Rui Yang*, Quanquan Gu, Tong Zhang, NeurIPS 2023.
Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning
Yong Lin*, Chen Liu*, Chenlu Ye*, Qing Lian, Yuan Yao, Tong Zhang, Preprint.
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang, ICML 2023.

(* equal contribution or alphabetical order)