Wei XIONG


PhD Student

Phone: (852)
Email: wxiongae@connect.ust.hk
Office: Room

Papers and Preprints

(* equal contribution or alphabetical order)

  1. RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment 
    Hanze Dong*, Wei Xiong*, Deepanshu Goyal, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum and Tong Zhang, Preprint.
     

  2. GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond [Slide] 
    Han Zhong*, Wei Xiong*, Sirui Zheng, Liwei Wang, Zhaoran Wang, Zhuoran Yang and Tong Zhang, Preprint.
     

  3. Reward Teaching for Federated Multi-Armed Bandits
    Chengshuai Shi, Wei Xiong, Cong Shen, and Jing Yang, IEEE International Symposium on Information Theory (ISIT 2023).
     

  4. Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources
    Chengshuai Shi, Wei Xiong, Cong Shen, and Jing Yang, ICML 2023.
     

  5. Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes 
    Chenlu Ye, Wei Xiong, Quanquan Gu and Tong Zhang, ICML 2023.
     

  6. Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game [Slide] 
    Wei Xiong*, Han Zhong*, Chengshuai Shi, Cong Shen, Liwei Wang, and Tong Zhang, ICLR 2023.
     

  7. A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Game 
    Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, and Tong Zhang, ICML 2022.
     

  8. Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets 
    Han Zhong*, Wei Xiong*, Jiyuan Tan*, Liwei Wang, Tong Zhang, Zhaoran Wang, and Zhuoran Yang, ICML 2022. 

  9. PMGT-VR: A Decentralized Proximal-Gradient Algorithmic Framework with Variance Reduction [Slide]
    Haishan Ye*, Wei Xiong*, and Tong Zhang, under minor revision at TPAMI.
     

  10. Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization [Code]
    Chengshuai Shi, Wei Xiong, Cong Shen, and Jing Yang, NeurIPS 2021.
     

  11. (Almost) Free Incentivized Exploration from Decentralized Learning Agents [Code]
    Chengshuai Shi, Haifeng Xu, Wei Xiong, and Cong Shen, NeurIPS 2021.
     

  12. Distributional Reinforcement Learning for Multi-Dimensional Reward Functions
    Pushi Zhang, Xiaoyu Chen, Li Zhao, Wei Xiong, Tao Qin, and Tie-Yan Liu, NeurIPS 2021.
     

  13. Decentralized Multi-player Multi-armed Bandits with No Collision Information [Code]
    Chengshuai Shi, Wei Xiong, Cong Shen, and Jing Yang, AISTATS 2020.