Chenlu Ye
I am a third-year MPhil student in IIP (AI) at The Hong Kong University of Science and Technology, and I am fortunate to be advised by Prof. Tong Zhang, Prof. Kani Chen, and Prof. Yuan Yao. Prior to this, I received a B.S. in Statistics from the University of Science and Technology of China in 2021. Additionally, I was a visiting scholar at the AGI LAB @ UCLA, working with Prof. Quanquan Gu.
Research Interests
My research interests span the intersection of machine learning theory and statistics, with a particular emphasis on the foundations of reinforcement learning.
If you are interested in discussing or collaborating, please feel free to contact me via email: cyeab AT connect DOT ust DOT hk.
News
I am visiting AGI LAB @ UCLA from August 2023 to January 2024.
One paper was accepted by NeurIPS 2023!
Publications and Preprints
(α-β) means that the order is decided by rock–paper–scissors and (*) denotes equal contribution.
Gibbs Sampling from Human Feedback: A Provable KL-constrained Framework for RLHF
Wei Xiong*, Hanze Dong*, Chenlu Ye*, Han Zhong, Nan Jiang, Tong Zhang, Preprint.
We formulate the real-world RLHF process as a reverse-KL regularized contextual bandit and study its theoretical properties by proposing statistically efficient algorithms with finite-sample guarantees. We also connect our theoretical findings with practical algorithms (e.g., DPO and RSO), offering new tools and insights for the design of alignment algorithms.
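For intuition, here is a minimal sketch of the reverse-KL regularized objective underlying this formulation; the notation (r for the reward model, pi_0 for the reference policy, eta for the KL coefficient) is assumed here for illustration, not necessarily the paper's.

```latex
% Reverse-KL regularized RLHF objective (notation assumed: r is the
% reward model, \pi_0 the reference policy, \eta > 0 the KL coefficient).
\[
\max_{\pi}\; \mathbb{E}_{x \sim d_0}\Big[
  \mathbb{E}_{a \sim \pi(\cdot \mid x)}\big[r(x,a)\big]
  \;-\; \eta\, \mathrm{KL}\big(\pi(\cdot \mid x)\,\|\,\pi_0(\cdot \mid x)\big)
\Big].
\]
% The maximizer is the Gibbs distribution, which motivates the title:
\[
\pi^{*}(a \mid x) \;\propto\; \pi_0(a \mid x)\,\exp\!\big(r(x,a)/\eta\big).
\]
```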
Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks
Jianqing Fan*, Zhaoran Wang*, Zhuoran Yang*, Chenlu Ye*, Preprint.
A batching framework for high-dimensional multi-armed bandit problems, with simulations on both synthetic and real-world data.
Corruption-Robust Offline Reinforcement Learning with General Function Approximation
Chenlu Ye*, Rui Yang*, Quanquan Gu and Tong Zhang, NeurIPS 2023.
An application of the uncertainty-weighting technique to offline reinforcement learning under adversarial corruption and general function approximation. Moreover, we provide practical implementations of the uncertainty-weighting algorithm under various data-corruption scenarios, where it outperforms state-of-the-art baselines.
Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning
Yong Lin*, Chen Liu*, Chenlu Ye*, Qing Lian, Yuan Yao and Tong Zhang, Preprint.
A theoretically optimal and computationally efficient sample-selection approach that can be effectively applied to deep learning and is robust to misspecification (by down-weighting highly uncertain samples).
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
Chenlu Ye, Wei Xiong, Quanquan Gu and Tong Zhang, ICML 2023.
An application of uncertainty-weighted regression under adversarial corruption and general function approximation, featuring a new weight design and new techniques for controlling the sum of weighted bonuses (a counterpart of the elliptical potential lemma).
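For intuition, a minimal sketch of the uncertainty-weighted regression step this line of work builds on; the notation (a function class F, data tuples of context, action, and reward, and a per-sample uncertainty estimate) is assumed here for illustration.

```latex
% Uncertainty-weighted least squares (notation assumed: \mathcal{F} is the
% function class, (x_i, a_i, r_i) the observed context, action, and reward,
% and \sigma_i an uncertainty estimate for sample i).
\[
\hat{f} \;=\; \operatorname*{arg\,min}_{f \in \mathcal{F}}
  \sum_{i=1}^{n} \frac{\big(f(x_i, a_i) - r_i\big)^{2}}{\sigma_i^{2}}.
\]
% Samples with large estimated uncertainty \sigma_i receive small weight
% 1/\sigma_i^{2}, so corrupted, high-uncertainty points have limited
% influence on the fitted \hat{f}.
```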