Wei XIONG

PhD Student

Phone: (852)
Email: wxiongae@connect.ust.hk
Office: Room

Papers and Preprints

(* equal contribution or alphabetical order)

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Hanze Dong*, Wei Xiong*, Deepanshu Goyal, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, and Tong Zhang, Preprint.
GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [Slide]
Han Zhong*, Wei Xiong*, Sirui Zheng, Liwei Wang, Zhaoran Wang, Zhuoran Yang, and Tong Zhang, Preprint.
Reward Teaching for Federated Multi-Armed Bandits
Chengshuai Shi, Wei Xiong, Cong Shen, and Jing Yang, IEEE International Symposium on Information Theory (ISIT 2023).
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources
Chengshuai Shi, Wei Xiong, Cong Shen, and Jing Yang, ICML 2023.
Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes
Chenlu Ye, Wei Xiong, Quanquan Gu, and Tong Zhang, ICML 2023.
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game [Slide]
Wei Xiong*, Han Zhong*, Chengshuai Shi, Cong Shen, Liwei Wang, and Tong Zhang, ICLR 2023.
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Game
Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, and Tong Zhang, ICML 2022.
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets
Han Zhong*, Wei Xiong*, Jiyuan Tan*, Liwei Wang, Tong Zhang, Zhaoran Wang, and Zhuoran Yang, ICML 2022.
PMGT-VR: A Decentralized Proximal-Gradient Algorithmic Framework with Variance Reduction [Slide]
Haishan Ye*, Wei Xiong*, and Tong Zhang, Under Minor Revision at TPAMI.
Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization [Code]
Chengshuai Shi, Wei Xiong, Cong Shen, and Jing Yang, NeurIPS 2021.
(Almost) Free Incentivized Exploration from Decentralized Learning Agents [Code]
Chengshuai Shi, Haifeng Xu, Wei Xiong, and Cong Shen, NeurIPS 2021.
Distributional Reinforcement Learning for Multi-Dimensional Reward Functions
Pushi Zhang, Xiaoyu Chen, Li Zhao, Wei Xiong, Tao Qin, and Tie-Yan Liu, NeurIPS 2021.
Decentralized Multi-player Multi-armed Bandits with No Collision Information [Code*]
Chengshuai Shi, Wei Xiong, Cong Shen, and Jing Yang, AISTATS 2020.