I am an assistant professor in the Department of Computer Science & Engineering (CSE) at the Hong Kong University of Science and Technology (HKUST), where I lead the Relaxed System Lab. I am recruiting PhD students to work on research topics including machine learning for data management and distributed/decentralized machine learning systems; if you are interested in joining us, please contact me by email with your latest resume.
Before joining HKUST, I was a postdoctoral researcher in the Computer Science Department at ETH Zurich, under the supervision of Dr. Ce Zhang. I completed my Ph.D. in the Computer Science Department at Rice University, advised by Dr. Chris Jermaine and co-advised by Dr. Anastasios Kyrillidis for my Ph.D. thesis. I received my master's degree from the Computer Science Department at Rice University, supervised by Dr. Ron Goldman, and my bachelor's degree from the Computer Science Department at Fudan University, where Dr. Bo Yan guided my research.
Current Group Members
- Ran Yan (2023-Fall, BS@PKU)
- Tianyi Bai (2023-Fall, Intern@PKU-DAIR Lab)
- Jiashu Wang (2024-Spring, BS@PKU)
- Fangyu Ding (2024-Fall, MS,BS@SJTU)
- Wangcheng Tao (2024-Fall, BS@HUST)
- Guangxin He (2024-Fall, BS@UCAS, MS@CAS)
- Zipeng Qiu (2024-Fall, BS@FDU)
- Yukun Zhou (Co-supervised with Prof. Wei Wang, 2024-Fall, BS@NJU, MS@THU)
- Youhe Jiang (Incoming PhD@Cambridge)
My main research focuses are data management for machine learning and distributed/decentralized machine learning systems. Concretely, I build systems to support giant foundation models over distributed and decentralized environments.
Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, and Ce Zhang. “High-throughput Generative Inference of Large Language Models with a Single GPU”. In International Conference on Machine Learning (pp. 31094-31116). PMLR. (ICML 2023 Selected as Oral).
Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Ré, and Beidi Chen. “Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time”. In International Conference on Machine Learning (pp. 22137-22176). PMLR. (ICML 2023 Selected as Oral).
Jue Wang, Yucheng Lu, Binhang Yuan, Beidi Chen, Percy Liang, Christopher De Sa, Christopher Ré, and Ce Zhang. “CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks”. In International Conference on Machine Learning (pp. 36058-36076). PMLR. (ICML 2023)
Yuxin Tang, Zhimin Ding, Dimitrije Jankov, Binhang Yuan, Daniel Bourgeois, and Chris Jermaine. “Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning”. In International Conference on Machine Learning (pp. 33581-33598). PMLR. (ICML 2023)
Binhang Yuan*, Yongjun He*, Jared Quincy Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Percy Liang, Christopher Ré, and Ce Zhang. “Decentralized Training of Foundation Models in Heterogeneous Environments.” In Advances in Neural Information Processing Systems 35 (2022), 25464-25477. (NeurIPS 2022 Selected as Oral)
Jue Wang*, Binhang Yuan*, Luka Rimanic*, Yongjun He, Tri Dao, Beidi Chen, Christopher Ré, and Ce Zhang. “Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees.” In Advances in Neural Information Processing Systems 35 (2022), 19215-19230. (NeurIPS 2022)
Rui Pan, Yiming Lei, Jialong Li, Zhiqiang Xie, Binhang Yuan, and Yiting Xia. “Efficient Flow Scheduling in Distributed Deep Learning Training with Echelon Formation.” In Proceedings of the 21st ACM Workshop on Hot Topics in Networks (2022). (HotNets 2022)
Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, and Ji Liu. “Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters.” In Proceedings of the 28th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 3288-3298. 2022 (SIGKDD 2022)
Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, and Ce Zhang. “In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data Shuffle.” In Proceedings of the 2022 International Conference on Management of Data, pp. 1286-1300. 2022. (SIGMOD 2022)
Binhang Yuan, Cameron R. Wolfe, Chen Dun, Yuxin Tang, Anastasios Kyrillidis, and Chris Jermaine. “Distributed Learning of Deep Neural Networks using Independent Subnet Training.” In Proceedings of the VLDB Endowment, 15(8). (VLDB 2022)
Shaoduo Gan, Xiangru Lian, Rui Wang, Jianbin Chang, Chengjun Liu, Hongmei Shi, Shengzhuo Zhang, Xianghong Li, Tengxu Sun, Jiawei Jiang, Binhang Yuan, Sen Yang, Ji Liu, and Ce Zhang. “BAGUA: Scaling up Distributed Learning with System Relaxations.” In Proceedings of the VLDB Endowment, 15(4). (VLDB 2022)
Shangyu Luo, Dimitrije Jankov, Binhang Yuan, and Chris Jermaine. “Automatic Optimization of Matrix Implementations for Distributed Machine Learning and Linear Algebra.” In Proceedings of the 2021 International Conference on Management of Data (pp. 1222-1234). ACM. (SIGMOD 2021)
Binhang Yuan, Dimitrije Jankov, Jia Zou, Yuxin Tang, Daniel Bourgeois, and Chris Jermaine. “Tensor Relational Algebra for Machine Learning System Design.” In Proceedings of the VLDB Endowment, 14(8), 1338-1350 (VLDB 2021)
Jia Zou, Pratik Barhate, Amitabh Das, Arun Iyengar, Binhang Yuan, Dimitrije Jankov, and Chris Jermaine. “Lachesis: Automatic Partitioning for UDF-Centric Analytics.” In Proceedings of the VLDB Endowment, 14(8), 1262-1275 (VLDB 2021)
Dimitrije Jankov, Binhang Yuan, Shangyu Luo, and Chris Jermaine. “Distributed Numerical and Machine Learning Computations via Two-Phase Execution of Aggregated Join Trees.” In Proceedings of the VLDB Endowment, 14(7), 1228-1240. (VLDB 2021)
Dimitrije Jankov, Shangyu Luo, Binhang Yuan, Zhuhua Cai, Jia Zou, Chris Jermaine, and Zekai J. Gao. “Declarative Recursive Computation on an RDBMS: or, Why You Should Use a Database for Distributed Machine Learning.” In Proceedings of the VLDB Endowment, 12(7), 822-835. (VLDB 2019 Best Paper Honorable Mention Award, SIGMOD 2020 Research Highlight)
Jia Zou, R. Matthew Barnett, Tania Lorido-Botran, Shangyu Luo, Carlos Monroy, Sourav Sikdar, Kia Teymourian, Binhang Yuan, and Chris Jermaine. “PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development.” In Proceedings of the 2018 International Conference on Management of Data (pp. 1189-1204). ACM. (SIGMOD 2018)
Binhang Yuan, Vijayaraghavan Murali, and Chris Jermaine. “Abridging source code.” In Proceedings of the ACM on Programming Languages 1.OOPSLA (2017): 58. (OOPSLA 2017)
Bo Yan, Binhang Yuan, and Bo Yang. “Effective video retargeting with jittery assessment”. In IEEE Transactions on Multimedia, Vol. 16, Issue 1, pp. 272-277, Jan. 2014. (TMM 2014)
Ph.D. Computer Science Department Rice University (2016/08 - 2020/12)
M.S. Computer Science Department Rice University (2013/08 - 2016/05)
B.S. Computer Science Department Fudan University (2009/09 - 2013/07)
Research Intern Microsoft Research Asia (2017/07 - 2017/12)
SDE Intern Tableau Software (2016/05 - 2016/08)
SDE Intern EMC Software (2015/05 - 2015/08)
- AAAI Reviewer: 2020, 2021
- ICLR Reviewer: 2022, 2023, 2024
- ICML Reviewer: 2021, 2022, 2023
- NeurIPS Reviewer: 2020, 2021, 2022, 2023
- MLSys Reviewer: 2024, Symposium Organizer: 2023, AE PC member*: 2022
- IEEE Access Reviewer: 2020
- IEEE TKDE Reviewer: 2022
- JMLR Reviewer: 2023
- IEEE BigData: 2023
- PVLDB Reviewer: 2022-2023
Most of my spare time (if there is any) is spent on football. I am a big fan of Liverpool. As Klopp once said, “98% of football is about dealing with failure and still being able to smile and find joy in the game the next day.” I believe the same is true of research.