Home
Hello! I’m Yizhou Shan (单一舟), a Research Scientist at Huawei Cloud. I earned my PhD in CSE from the University of California San Diego under the supervision of Prof. Yiying Zhang.
I now run the Serverless AI platform at Huawei Cloud, responsible for cost-efficient Model Serving (LLM, LMM, T2I, T2V, etc.), Agent Serving, and Post-Training infrastructure. If you are interested in working with me (full-time or intern), we should talk.
Contact: syzwhat AT gmail DOT com
You can find my CV here.
Blogging
Latest
- Nov 2022 SSD 101
- Jul 2022 CXL
- Apr 2022 MLIR
- Mar 2022 Resource Disaggregation Spectrum
- Feb 2022 Distributed Transactions
- Dec 2021 Notes on Modern Data Center Networking
Hot
- Oct 2019 FPGA Bitstream Explained
- May 2020 On DPDK and RDMA Related Software
- Jan 2020 Modern Virtualization
- Dec 2020 Dynamic Linking
- Jun 2019 Practical Cache Coherence
- Dec 2020 Architecture
- and more!
Research
- [Nov 2024] I will serve on the PC for FAST‘25, FAST‘26, and ATC‘25.
- [Nov 2024] InstInfer accepted to HPCA‘25.
- [Apr 2024] I will serve as an NSDI‘25 PC.
- [Jan 2024] I will serve as a EuroSys‘25 PC.
- [Jan 2024] I will serve as an ATC‘24 PC.
- [Dec 2022] I will serve as an NSDI‘24 PC.
- [Nov 2022] MARB accepted to DATE‘23.
- [Oct 2022] HoPP accepted to HPCA‘23.
- [Sep 2022] I will serve as an ATC‘23 PC.
- [Jun 2022] A vision paper is accepted to APSys‘22
- [Jun 2022] Serve as EuroSys‘23 PC
- [Jun 2022] Serve as SoCC‘22 PC
- [Mar 2022] Serve as APSys‘22 PC
- [Mar 2022] Serve as ChinaSys‘22 PC
- [Mar 2022] Defended. The full defense slide is here.
- [Oct 2021] Serve as EuroSys‘22 Shadow PC
- [Sep 2021] We made our SuperNIC paper public.
- [Sep 2021] Serve as SOSP‘21 Artifact Evaluation PC
- [Aug 2021] We made our Clio paper public.
- [Jun 2021] Start my final internship at Microsoft Research, working on Security + System.
- [Jun 2021] I proposed my thesis and became a Ph.D. candidate.
- [Jan 2021] The DPM work is accepted to present at NVMW‘21
- [Jan 2021] This summer, I’m going to do my last internship at MSR Redmond on cloud confidential computing.
- [Dec 2020] Invited to join the 2021 JSys Student Editorial Board
- [Oct 2020] Serve as EuroSys‘21 Shadow PC
- [Sep 2020] Serve as OSDI‘20 Artifact Evaluation PC
- [Sep 2020] Serve as ASPLOS‘21 External Reviewer. First major conference review!
- [Apr 2020] Disaggregated Persistent Memory accepted to ATC‘20
- [Feb 2020] Talk about FPGA OS
- [Sep 2019] Moved to UCSD.
- [May 2019] Intern at VMware Research, with Marcos K. Aguilera
- [Apr 2019] Storm accepted to SYSTOR‘19. Awarded Best Paper.
- [Jan 2019] Short paper on Disaggregated Persistent Memory accepted to NVMW‘19
- [Jul 2018] LegoOS accepted to OSDI‘18. Awarded Best Paper.
- [May 2018] Intern at VMware Research, with Stanko Novakovic.
Research
My main research interests span machine learning systems, distributed systems, data center networking, OS, hardware (FPGA), disaggregated memory/storage systems, and their intersections.
Serving LLMs at Cloud Scale
- EPIC, 2024 - Position-Independent KV caching
- InstInfer, HPCA‘25 - Programmable Attention Offload
- MemServe, 2024 - Disaggregated PD w/ Context Caching
- TetriServe, 2024 - Disaggregated PD
- CaraServe, 2024 - Multi-LoRA Serving
- The CAP Principle for LLM Serving, 2024 - a survey
Disaggregated Data Center Architecture
Disaggregated Memory
- HoPP, HPCA‘23 and MARB, DATE‘23 - Hardware-accelerated Prefetching for DisaggMem
- Clio, ASPLOS‘22 - An FPGA-based disaggregated memory device
- Clover, ATC‘20 - Pure one-sided KVS on disaggregated PM
- Storm, SYSTOR‘19 - Highly-efficient KVS on disaggregated memory
- Hotpot, SoCC‘17 - Transactional distributed PM over RDMA
Networking Design
- Storm, SYSTOR‘19 - RDMA Cards are evolving!
- SuperNIC, arXiv‘21 - An FPGA-based Programmable Multi-Host NIC
- Clio, ASPLOS‘22 - Rethinking RDMA NIC, congestion control
Publications
- CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference
Suyi Li, Hanfeng Lu, Tianyuan Wu, Minchen Yu, Qizhen Weng, Xusheng Chen, Yizhou Shan, Binhang Yuan, Wei Wang
[Preprint] [Code]
- Inference without Interference: Disaggregate LLM Inference for Mixed Downstream Workloads
Cunchen Hu, Heyang Huang, Liangliang Xu, Xusheng Chen, Jiang Xu, Shuang Chen, Hao Feng, Chenxi Wang, Sa Wang, Yungang Bao, Ninghui Sun, Yizhou Shan
[Preprint] [Code]
- Optimizing Hardware-Based Network Computation DAGs for Multiple Tenants with SuperNIC
Yizhou Shan, Will Lin, Ryan Kosta, Arvind Krishnamurthy, Yiying Zhang
[Preprint] [Code]
- Skadi: Building a Distributed Runtime for Data Systems in Disaggregated Data Centers
Cunchen Hu, Chenxi Wang, Sa Wang, Ninghui Sun, Yungang Bao, Jieru Zhao, Sanidhya Kashyap, Pengfei Zuo, Xusheng Chen, Liangliang Xu, Qin Zhang, Hao Feng, Yizhou Shan
HotOS 2023 [Paper]
- Core slicing: closing the gap between leaky confidential VMs and bare-metal cloud
Ziqiao Zhou, Yizhou Shan, Weidong Cui, Xinyang Ge, Marcus Peinado, Andrew Baumann
OSDI 2023 [Paper]
- MARB: Bridge the Semantic Gap between Operating System and Application Memory Access Behavior
Haifeng Li, Ke Liu, Ting Liang, Zuojun Li, Tianyue Lu, Hui Yuan, Yinben Xia, Yungang Bao, Mingyu Chen, Yizhou Shan
DATE 2023
- HoPP: Hardware-Software Co-Designed Page Prefetching for Disaggregated Memory
Haifeng Li, Ke Liu, Ting Liang, Zuojun Li, Tianyue Lu, Hui Yuan, Yinben Xia, Yungang Bao, Mingyu Chen, Yizhou Shan
HPCA 2023 [Paper]
- Towards a Fully Disaggregated and Programmable Data Center
Yizhou Shan, Will Lin, Zhiyuan Guo, Yiying Zhang
APSys 2022 [Paper]
- Distributing and Disaggregating Hardware Resources in Data Centers
Yizhou Shan
UCSD Dissertation 2022
- Clio: A Hardware-Software Co-Designed Disaggregated Memory System
Yizhou Shan, Zhiyuan Guo (co-first authors), Xuhao Luo, Yutong Huang, Yiying Zhang
ASPLOS 2022 [Paper] [Code] [Slide]
- Disaggregating Persistent Memory and Controlling Them Remotely: An Exploration of Passive Disaggregated Key-Value Stores
Shin-Yeh Tsai, Yizhou Shan, Yiying Zhang
ATC 2020 [Paper] [Code] [Slide] [Short-Talk] [Full-Talk] [Keynote]
- Storm: a fast transactional dataplane for remote data structures
Stanko Novakovic, Yizhou Shan, Aasheesh Kolli, Michael Cui, Yiying Zhang, Haggai Eran, Liran Liss, Michael Wei, Dan Tsafrir, Marcos Aguilera
SYSTOR 2019 (Best Paper Award) [Paper] [Slide] [Talk]
- LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation
Yizhou Shan, Yutong Huang, Yilun Chen, Yiying Zhang
OSDI 2018 (Best Paper Award) [Paper] [Code] [Slide] [Keynote-iCloud] [Talk]
- Distributed Shared Persistent Memory
Yizhou Shan, Shin-Yeh Tsai, Yiying Zhang
SoCC 2017 [Paper] [Code] [Slide] [Poster]
Workshops
- Disaggregating Persistent Memory and Controlling Them Remotely: An Exploration of Passive Disaggregated Key-Value Stores
Shin-Yeh Tsai, Yizhou Shan, Yiying Zhang
12th Annual Non-Volatile Memories Workshop (NVMW 2021) [Paper]
- Challenges in Building and Deploying Disaggregated Persistent Memory
Yizhou Shan, Yutong Huang, Yiying Zhang
10th Annual Non-Volatile Memories Workshop (NVMW 2019) [Paper]
- Disaggregating Memory with Software-Managed Virtual Cache
Yizhou Shan, Yiying Zhang
2018 Workshop on Warehouse-scale Memory Systems (WAMS 2018) (co-located with ASPLOS ‘18) [Paper]
- Distributed Shared Persistent Memory
Yizhou Shan, Shin-Yeh Tsai, Yiying Zhang
9th Annual Non-Volatile Memories Workshop (NVMW 2018) [Paper]
- Disaggregated Operating System
Yiying Zhang, Yizhou Shan, Sumukh Hallymysore
17th International Workshop on High Performance Transaction Systems (HPTS 2017) [Paper]
Professional Services
Program Committee
- FAST (2026, 2025)
- EuroSys (2025, 2024, 2023)
- ATC (2025, 2024, 2023)
- NSDI (2026, 2025, 2024)
- SoCC (2023, 2022)
Shadow/External Program Committee
- EuroSys (2022-shadow, 2021-shadow)
- ASPLOS (2021-external)
Journal Reviewer
- Journal of Systems Research: 2021 - Current
- ACM Transactions on Architecture and Code Optimization (TACO): 2021
- ACM Transactions on Storage (TOS): 2020
- IEEE/ACM Transactions on Networking: 2020
Artifact Evaluation Committee
- SOSP (2021)
- OSDI (2020)
Social