Research

Papers and Preprints

  1. A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference
    Chenlu Ye*, Wei Xiong*, Yuheng Zhang*, Hanze Dong*, Nan Jiang, Tong Zhang, NeurIPS 2024.

  2. Towards Robust Model-Based Reinforcement Learning against Adversarial Corruption
    Chenlu Ye*, Jiafan He*, Quanquan Gu, Tong Zhang, ICML 2024. An analysis of uncertainty-aware algorithms for model-based reinforcement learning under adversarial corruption and general function approximation.

  3. Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint
    Wei Xiong*, Hanze Dong*, Chenlu Ye*, Han Zhong, Nan Jiang, Tong Zhang, ICML 2024.

  4. Provably Efficient High-Dimensional Bandit Learning with Batched Feedback
    Jianqing Fan*, Zhaoran Wang*, Zhuoran Yang*, Chenlu Ye*, Preprint.

  5. Corruption-Robust Offline Reinforcement Learning with General Function Approximation
    Chenlu Ye*, Rui Yang*, Quanquan Gu, Tong Zhang, NeurIPS 2023.

  6. Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning
    Yong Lin*, Chen Liu*, Chenlu Ye*, Qing Lian, Yuan Yao, Tong Zhang, Preprint.

  7. Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
    Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang, ICML 2023.

(* denotes equal contribution or alphabetical author order)