Research

Papers and Preprints

  1. Gibbs Sampling from Human Feedback: A Provable KL-constrained Framework for RLHF
    Wei Xiong*, Hanze Dong*, Chenlu Ye*, Han Zhong, Nan Jiang, Tong Zhang, Preprint.

  2. Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks
    Jianqing Fan*, Zhaoran Wang*, Zhuoran Yang*, Chenlu Ye*, Preprint.

  3. Corruption-Robust Offline Reinforcement Learning with General Function Approximation
    Chenlu Ye*, Rui Yang*, Quanquan Gu and Tong Zhang, NeurIPS 2023.

  4. Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning
    Yong Lin*, Chen Liu*, Chenlu Ye*, Qing Lian, Yuan Yao and Tong Zhang, Preprint.

  5. Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
    Chenlu Ye, Wei Xiong, Quanquan Gu and Tong Zhang, ICML 2023.

(* denotes equal contribution or alphabetical order)