Han Zhao
Publications
Han Zhao, Weihao Cui, Quan Chen, Zijun Li, Zhenhua Han, Nan Wang, Yu Feng, Jieru Zhao, Chen Chen, Jingwen Leng, Minyi Guo (2025). EDAS: Enabling Fast Data Loading for GPU Serverless Computing. In TACO2025 (CCF-A).
Pengyu Yang, Weihao Cui, Chunyu Xue, Han Zhao, Chen Chen, Quan Chen, Jing Yang, Minyi Guo (2025). Taming Flexible Job Packing in Deep Learning Training Clusters. In TACO2025 (CCF-A).
Weihao Cui, Ji Zhang, Han Zhao, Chao Liu, Wenhao Zhang, Jian Sha, Quan Chen, Bingsheng He, Minyi Guo (2025). XPUTIMER: Anomaly Diagnostics for Divergent LLM Training in GPU Clusters of Thousand-Plus Scale. In arXiv (Under review).
Weihao Cui, Ziyi Xu, Han Zhao, Quan Chen, Zijun Li, Bingsheng He, Minyi Guo (2025). Efficient Function-as-a-Service for Large Language Models with TIDAL. In arXiv (Under review).
Weihao Cui, Yukang Chen, Han Zhao, Ziyi Xu, Quan Chen, Xusheng Chen, Yangjie Zhou, Shixuan Sun, Minyi Guo (2025). Optimizing SLO-oriented LLM Serving with PD-Multiplexing. In arXiv (Under review).
Chunyu Xue, Weihao Cui, Han Zhao, Quan Chen, Shulai Zhang, Pengyu Yang, Jing Yang, Shaobo Li, Minyi Guo (2025). A Codesign of Scheduling and Parallelization for Large Model Training in Heterogeneous Clusters. In arXiv (Under review).
Shulai Zhang, Quan Chen, Weihao Cui, Han Zhao, Chunyu Xue, Zhen Zheng, Wei Lin, Minyi Guo (2025). Improving GPU Sharing Performance through Adaptive Bubbleless Spatial-Temporal Sharing. In EuroSys2025 (CCF-A).
Yifu He, Han Zhao, Quan Chen, Weihao Cui, Minyi Guo (2025). ARACHNE: Optimizing Distributed Parallel Applications with Reduced Inter-Process Communication. In TACO2025 (CCF-A).
Yu Feng, Weikai Lin, Zihan Liu, Jingwen Leng, Minyi Guo, Han Zhao, Xiaofeng Hou, Jieru Zhao, Yuhao Zhu (2024). Potamoi: Accelerating neural rendering via a unified streaming architecture. In TACO2024 (CCF-A).
Han Zhao, Junxiao Deng, Weihao Cui, Quan Chen, Youtao Zhang, Deze Zeng, Minyi Guo (2024). Adaptive Kernel Fusion for Improving the GPU Utilization while Ensuring QoS. In TC2024 (CCF-A).
Han Zhao, Junxiao Deng, Weihao Cui, Deze Zeng, Jing Yang, Minyi Guo (2024). Exploiting all intra-SM parallelism to maximize the throughput while ensuring QoS. In Chinese Science Information Science 2024 (CCF-A).
Chuhao Xu, Yiyu Liu, Zijun Li, Quan Chen, Han Zhao, Deze Zeng, Qian Peng, Xueqi Wu, Haifeng Zhao, Senbo Fu, Minyi Guo (2024). FaaSMem: Improving Memory Efficiency of Serverless Computing with Memory Pool Architecture. In ASPLOS2024 (CCF-A).
Binghao Chen, Han Zhao, Weihao Cui, Yifu He, Shulai Zhang, Quan Chen, Zijun Li, Minyi Guo (2023). Maximizing the Utilization of GPUs Used by Cloud Gaming through Adaptive Co-location with Combo. In SoCC2023 (CCF-B) (Corresponding author).
Han Zhao, Weihao Cui, Quan Chen, Jingwen Leng, Deze Zeng, Minyi Guo (2023). Improving Cluster Utilization Through Adaptive Resource Management for Deep Neural Network and CPU Jobs Colocation. In TC2023 (CCF-A).
Han Zhao, Weihao Cui, Quan Chen, Minyi Guo (2022). ISPA: Exploiting Intra-SM Parallelism in GPUs via Fine-grained Resource Management. In TC2022 (CCF-A).
Weihao Cui, Han Zhao, Quan Chen, Hao Wei, Zirui Li, Deze Zeng, Chao Li, Minyi Guo (2022). DVABatch: Diversity-aware Multi-Entry Multi-Exit Batching for Efficient Processing of DNN Services on GPUs. In ATC2022 (CCF-A).
Han Zhao, Weihao Cui, Quan Chen, Youtao Zhang, Yanchao Lu, Chao Li, Jingwen Leng, Minyi Guo (2022). Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS. In HPCA2022 (CCF-A).
Weihao Cui, Han Zhao, Quan Chen, Ningxin Zheng, Jingwen Leng, Jieru Zhao, Zhuo Song, Tao Ma, Yong Yang, Chao Li, Minyi Guo (2021). Enable Simultaneous DNN Services Based on Deterministic Operator Overlap and Precise Latency Prediction. In SC2021 (CCF-A).
Han Zhao, Weihao Cui, Quan Chen, Jieru Zhao, Jingwen Leng, Minyi Guo (2021). Exploiting Intra-SM Parallelism in GPUs via Persistent and Elastic Blocks. In ICCD2021 (CCF-B).
Weihao Cui, Quan Chen, Han Zhao, Mengze Wei, Xiaoxin Tang, Minyi Guo (2020). E2bird: Enhanced Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services. In TPDS2020 (CCF-A).
Han Zhao, Weihao Cui, Quan Chen, Jingwen Leng, Kai Yu, Deze Zeng, Chao Li, Minyi Guo (2020). CODA: Improving Resource Utilization by Slimming and Co-locating DNN and CPU Jobs. In ICDCS2020 (CCF-B).
Han Zhao, Quan Chen, Yuxian Qiu, Ming Wu, Yao Shen, Jingwen Leng, Chao Li, Minyi Guo (2019). Bandwidth and Locality Aware Task-stealing for Manycore Architectures with Bandwidth-Asymmetric Memory. In TACO2019 (CCF-A).