Han Zhao 赵涵

Assistant Professor

Shanghai Jiao Tong University (SJTU)

Biography

I am an assistant professor in the Department of Computer Science and Engineering at Shanghai Jiao Tong University (SJTU). I received my Master's and Ph.D. degrees from Shanghai Jiao Tong University under the supervision of Prof. Quan Chen and Prof. Minyi Guo. I continue to work closely with Prof. Quan Chen and Assistant Prof. Weihao Cui.

My past research focused on high-performance computing on CPU architectures, task scheduling on GPU architectures, resource management in datacenters, and DNN inference system design. My future research will cover more cloud computing topics such as GPU serverless and GPU virtualization. Specifically, it can be divided into four aspects: fine-grained GPU resource partitioning, multi-worker scheduling for DNN tasks, GPU serverless, and resource management in datacenters.

I am looking for prospective Ph.D. and Master's students. If you are interested in the above areas, please get in touch.

Interests
  • Cloud computing
  • Serverless computing
  • Task scheduling
  • DNN inference system
  • Resource management in datacenters
Education
  • PhD in Computer Science, 2019-2022

    Shanghai Jiao Tong University

  • MSc in Computer Science, 2016-2019

    Shanghai Jiao Tong University

  • BSc in Computer Science, 2012-2016

    Huazhong University of Science and Technology

Recent Publications

(2024). Improving the Multi-Tenancy GPU Performance through Adaptive Bubbleless Spatial-Temporal Sharing. In ASPLOS2024 (CCF-A) (Accepted).

(2023). Maximizing the Utilization of GPUs Used by Cloud Gaming through Adaptive Co-location with Combo. In SoCC2023 (CCF-B) (Corresponding author).

(2023). Improving Cluster Utilization Through Adaptive Resource Management for Deep Neural Network and CPU Jobs Colocation. In TC2023 (CCF-A).

(2022). ISPA: Exploiting Intra-SM Parallelism in GPUs via Fine-grained Resource Management. In TC2022 (CCF-A).

(2022). DVABatch: Diversity-aware Multi-Entry Multi-Exit Batching for Efficient Processing of DNN Services on GPUs. In ATC2022 (CCF-A).

(2022). Tacker: Tensor-CUDA Core Kernel Fusion for Improving the GPU Utilization while Ensuring QoS. In HPCA2022 (CCF-A).

(2021). Enable Simultaneous DNN Services Based on Deterministic Operator Overlap and Precise Latency Prediction. In SC2021 (CCF-A).

(2021). Exploiting Intra-SM Parallelism in GPUs via Persistent and Elastic Blocks. In ICCD2021 (CCF-B).

(2020). E2bird: Enhanced Elastic Batch for Improving Responsiveness and Throughput of Deep Learning Services. In TPDS2020 (CCF-A).

(2020). CODA: Improving Resource Utilization by Slimming and Co-locating DNN and CPU Jobs. In ICDCS2020 (CCF-B).

(2019). Bandwidth and Locality Aware Task-stealing for Manycore Architectures with Bandwidth-Asymmetric Memory. In TACO2019 (CCF-A).

Accomplishments

CCF Outstanding Doctoral Dissertation Award (Computer Architecture)
SC2021 Best Implementation Award

Contact