Chenlu Ye


PhD student
The Hong Kong University of Science and Technology

[Curriculum Vitae]      [Google Scholar]

I am a third-year MPhil student in IIP (AI) at The Hong Kong University of Science and Technology, where I am fortunate to be advised by Prof. Tong Zhang, Prof. Kani Chen, and Prof. Yuan Yao. Prior to this, I received a B.S. in Statistics from the University of Science and Technology of China in 2021. Additionally, I was a visiting scholar at AGI LAB @ UCLA, working with Prof. Quanquan Gu.

Research Interests

My research interests span the intersection of machine learning theory and statistics, with a particular emphasis on the foundations of reinforcement learning.

If you are interested in discussing or collaborating, please feel free to contact me via email: cyeab AT connect DOT ust DOT hk.


I am visiting AGI LAB @ UCLA from August 2023 to January 2024.

One paper was accepted by NeurIPS 2023!

Publications and Preprints

(α-β) means that the order is decided by rock–paper–scissors and (*) denotes equal contribution.

  1. Gibbs Sampling from Human Feedback: A Provable KL-constrained Framework for RLHF
    Wei Xiong*, Hanze Dong*, Chenlu Ye*, Han Zhong, Nan Jiang, Tong Zhang, Preprint.
    We formulate the real-world RLHF process as a reverse-KL regularized contextual bandit problem and study its theoretical properties by proposing statistically efficient algorithms with finite-sample theoretical guarantees. We also connect our theoretical findings with practical algorithms (e.g., DPO, RSO), offering new tools and insights for the design of alignment algorithms.

  2. Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks
    Jianqing Fan*, Zhaoran Wang*, Zhuoran Yang*, Chenlu Ye*, Preprint.
    A batching framework for high-dimensional multi-armed bandit problems, with simulations on both synthetic and real-world data.

  3. Corruption-Robust Offline Reinforcement Learning with General Function Approximation
    Chenlu Ye*, Rui Yang*, Quanquan Gu and Tong Zhang, NeurIPS 2023.
    An application of the uncertainty-weighting technique to offline reinforcement learning under adversarial corruption and general function approximation. We also implement the uncertainty-weighting algorithm under various data-corruption scenarios, where it outperforms the state of the art.

  4. Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning
    Yong Lin*, Chen Liu*, Chenlu Ye*, Qing Lian, Yuan Yao and Tong Zhang, Preprint.
    A theoretically optimal and computationally efficient sample selection approach that can be effectively applied to deep learning and is robust to misspecification (by down-weighting highly uncertain samples).

  5. Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
    Chenlu Ye, Wei Xiong, Quanquan Gu and Tong Zhang, ICML 2023.
    An application of uncertainty-weighted regression under adversarial corruption and general function approximation, featuring a new weight design and new techniques for controlling the sum of the (weighted) bonuses (the counterpart of the elliptical potential lemmas).