Jinwei Yao
My English name is Kivi.
Master of Science in Computer Science • Siebel Center for Computer Science • UIUC
201 N Goodwin Ave, Urbana, IL 61801

Background
I am currently in my first year at UIUC, pursuing a (research-based) Master of Science in Computer Science (MSCS). My advisor is Prof. Jiaxuan You.
Before my journey at UIUC, I spent one year at EPFL as a fellowship PhD student in distributed systems, where I laid my academic foundations. Leaving peaceful and beautiful Switzerland was a hard decision: after a year of reflection and discussion with my career mentor, Prof. Katerina Argyraki, I followed my heart to explore ML System research. At Zhejiang University, I obtained my Bachelor's degree in Electronic Science and Technology, with an Outstanding Thesis Award for designing an FPGA subsystem for GNN acceleration, which marked the start of my MLSys research. During my ML System research journey, I have been lucky to work with wonderful advisors: Prof. Zeke Wang (Zhejiang University), Prof. Tao Lin (Westlake University), and Prof. Binhang Yuan (HKUST).
At UIUC, my research focuses on system-algorithm co-design for GenAI. Systems are my starting point, but I work on algorithms as well. My research interests are outlined below.
Research Interests
I am interested in machine learning systems (ML System), especially in algorithm co-design across modeling, systems, and hardware. There exists a significant gap between generative model design, system implementation, and hardware capabilities. My research aims to bridge this gap by developing efficient, robust, and scalable algorithms/systems for real-world LLM applications.
I categorize my research interests into three interdependent dimensions:
- System: Efficiency & Robustness in LLM Infrastructure
  - LLM Inference Efficiency: How to provide cheap and fast LLM inference services?
  - LLM Training Efficiency: How to train LLMs with limited resources while ensuring robustness?
  - ML-System SLO Trade-off: How to balance ML performance metrics (e.g., accuracy, perplexity) with system metrics (e.g., latency, throughput)?
- Modeling: Beyond Auto-Regressive Patterns
  - How can we rethink or extend generative modeling paradigms beyond the auto-regressive (AR) models?
  - How to design architectures that are more expressive and efficient than AR models?
  - How to unify multimodal inputs (e.g., text, vision, code) into a shared and coherent representation space?
- Hardware: Hardware-Aware Algorithm Design
  - How can we design algorithms that fully leverage heterogeneous hardware such as FPGAs, GPUs, and NPUs?
  - How to abstract hardware features into software libraries to simplify hardware-efficient algorithm development?
  - How to co-design algorithms with low-level primitives to maximize hardware utilization?
Miscellaneous
I actively share paper reading notes on my other GitHub blog and on Zhihu (知乎).
News
DeFT was accepted to ICLR'25 as a Spotlight (top 5%)! The code is released here.
Enrolled in the MSCS program and began a new semester at UIUC.
DeFT was accepted to the ICLR'24 AGI Workshop as an Oral Presentation!
After one year of consideration, and at the suggestion of my career mentor, I decided to leave EPFL to pursue ML System research, as no professors at EPFL are interested in ML System.
Enrolled at EPFL as a PhD student with a fellowship from the CS department.
Graduated from Zhejiang University and received a B.Eng. in Electronic Science and Technology (with Outstanding Graduation Award and Outstanding Thesis Award).