Chenlu Ye

Ph.D. student
University of Illinois Urbana-Champaign, Computer Science

[Curriculum Vitae]      [Google Scholar]

I am a first-year Ph.D. student in computer science at UIUC, where I am fortunate to be advised by Prof. Tong Zhang. Prior to this, I obtained a master's degree in IIP (AI) from The Hong Kong University of Science and Technology, and received a B.S. in Statistics from the University of Science and Technology of China in 2021. Additionally, I was a visiting scholar in the AGI Lab @ UCLA from August 2023 to December 2023, working with Prof. Quanquan Gu.

Research Interests

My research interests span the intersection of machine learning theory and statistics, with a particular emphasis on the foundations of reinforcement learning.

If you are interested in discussing or collaborating, please feel free to contact me via email: chenluy3 AT illinois DOT edu.

Publications and Preprints

(Alphabetical) indicates that authors are listed in alphabetical order, and (*) denotes equal contribution.

  1. A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference
    Chenlu Ye*, Wei Xiong*, Yuheng Zhang*, Hanze Dong*, Nan Jiang, Tong Zhang, NeurIPS 2024.
    We study general preferences without assuming the Bradley–Terry model, propose sample-efficient algorithms for both online and offline settings, and validate their efficiency theoretically and empirically.

  2. Towards Robust Model-Based Reinforcement Learning against Adversarial Corruption
    Chenlu Ye*, Jiafan He*, Quanquan Gu, Tong Zhang, ICML 2024.
    An analysis of uncertainty-aware algorithms in the model-based framework under adversarial corruption and general function approximation.

  3. Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint
    Wei Xiong*, Hanze Dong*, Chenlu Ye*, Han Zhong, Nan Jiang, Tong Zhang, ICML 2024.
    We formulate the real-world RLHF process as a reverse-KL regularized contextual bandit and study its theoretical properties, proposing statistically efficient algorithms with finite-sample guarantees. We also connect our theoretical findings with practical algorithms (e.g., DPO, RSO), offering new tools and insights for the design of alignment algorithms. (A sketch of the KL-regularized objective appears after this list.)

  4. Corruption-Robust Offline Reinforcement Learning with General Function Approximation
    Chenlu Ye*, Rui Yang*, Quanquan Gu and Tong Zhang, NeurIPS 2023.
    An application of the uncertainty-weighting technique to offline reinforcement learning under adversarial corruption and general function approximation. We also implement the uncertainty-weighting algorithm under various data-corruption scenarios, where it outperforms state-of-the-art methods.

  5. Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning
    Yong Lin*, Chen Liu*, Chenlu Ye*, Qing Lian, Yuan Yao and Tong Zhang, Preprint.
    A theoretically optimal and computationally efficient sample-selection approach that applies effectively to deep learning and is robust to misspecification (by down-weighting highly uncertain samples).

  6. Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
    Chenlu Ye, Wei Xiong, Quanquan Gu and Tong Zhang, ICML 2023.
    An application of uncertainty-weighted regression under adversarial corruption and general function approximation: a new weight design, plus new techniques for controlling the sum of the weighted bonuses (a counterpart of the elliptical potential lemma). (A sketch of the weighting idea appears after this list.)

  7. Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks
    Jianqing Fan*, Zhaoran Wang*, Zhuoran Yang*, Chenlu Ye* (Alphabetical), Preprint.
    A batching framework for high-dimensional multi-armed bandit problems, with simulations on both synthetic and real-world data.
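
As a side note for item 3, here is a minimal sketch of the reverse-KL regularized objective (notation mine, not necessarily the paper's): given a prompt distribution \(d_0\), a reward function \(r\), a reference policy \(\pi_0\), and a regularization parameter \(\eta > 0\), the target policy is

\[
\pi^* \;=\; \arg\max_{\pi}\; \mathbb{E}_{x \sim d_0,\, a \sim \pi(\cdot \mid x)}\!\big[r(x,a)\big] \;-\; \eta\, \mathbb{E}_{x \sim d_0}\!\big[\mathrm{KL}\!\big(\pi(\cdot \mid x)\,\big\|\,\pi_0(\cdot \mid x)\big)\big],
\]

whose closed-form solution is the Gibbs policy \(\pi^*(a \mid x) \propto \pi_0(a \mid x)\, e^{r(x,a)/\eta}\); the KL term keeps the learned policy close to the reference model.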
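
Similarly, for items 4 and 6, a schematic of the uncertainty-weighting idea (again my notation; the papers' exact weight choices differ): the regression loss down-weights samples whose estimated uncertainty is large, since those are the samples an adversary can corrupt most cheaply,

\[
\hat{f} \;=\; \arg\min_{f \in \mathcal{F}} \sum_{t=1}^{T} \frac{\big(f(x_t, a_t) - y_t\big)^2}{\sigma_t^2},
\qquad
\sigma_t^2 \;\asymp\; \max\!\Big\{1,\; \frac{\mathrm{uncertainty}(x_t, a_t)}{\alpha}\Big\},
\]

where \(\alpha\) is a threshold tied to the corruption level. Controlling the resulting sum of weighted bonuses is the technical counterpart of the elliptical potential lemma mentioned above.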