Featured Researchers

Prof. Xiaowen Chu

Faculty of Science
Professor, Department of Computer Science


Biography

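Deep learning is a key technology behind many modern artificial intelligence applications, and Prof. Xiaowen Chu's research focuses on improving the practical performance of deep learning systems. As early as 2016, his research team developed the earliest open-source benchmark suite for deep learning platforms, used to evaluate, compare, and analyze the performance of different platforms. This pioneering work attracted considerable attention from the deep learning community, including the CNTK team, the MXNet team, NVIDIA, Intel, Tencent, and Inspur. Since 2018, Prof. Chu's team has proposed a number of innovative methods for reducing the time needed to train AI models on large-scale GPU clusters.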

 

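Prof. Chu is one of the pioneers of GPU computing. His research group has designed and implemented many GPU-based parallel algorithms that significantly reduce the running time of real-world applications, including DNA sequence alignment, protein identification, data mining, data security, linear algebra, and deep learning. His team has published many highly cited papers and has also developed a series of open-source software packages. For example, G-BLASTN, the group's work on DNA sequence alignment, was published in Bioinformatics, a leading journal in the field, and the software has been downloaded more than a thousand times on SourceForge and GitHub. Prof. Chu's recent paper “Dissecting GPU Memory Hierarchy through Microbenchmarking” proposes a novel fine-grained P-Chase algorithm that reveals many previously unknown characteristics of the GPU memory system. It is the first paper to successfully analyze the cache characteristics of NVIDIA's Kepler and Maxwell GPUs, and the method has been widely used in academia to: (1) analyze the memory systems of newer GPUs, such as Volta and Turing; (2) accurately predict the performance of GPU applications, such as parallel convolution in deep learning; and (3) design cache-friendly optimized algorithms for matrix and convolution operations.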

 

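Prof. Chu has published more than 170 research papers in parallel and distributed computing, deep learning systems, cloud computing, and wireless networks. Many of his works have appeared in well-known international journals, including IEEE Transactions on Parallel and Distributed Systems, IEEE Journal on Selected Areas in Communications, IEEE Transactions on Mobile Computing, IEEE/ACM Transactions on Networking, IEEE Transactions on Smart Grid, IEEE Transactions on Computers, and Bioinformatics, as well as in the proceedings of well-known international conferences, including IEEE INFOCOM, ACM PPoPP, IEEE ICDCS, IEEE IPDPS, IJCAI, IEEE ICRA, and ACM e-Energy. His research has had a broad impact in both academia and industry. He has received multiple external research grants and donations from the Hong Kong Research Grants Council, the Innovation and Technology Fund, and industry, totalling more than HK$18 million.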

 

Achievements

  • Best Paper Award, 4th IEEE International Conference on Big Data Intelligence and Computing (2018)
  • Best Paper Award, 1st International Conference on Big Data Computing and Communications (2015)
  • Best Paper Award, 10th IEEE International Conference on Computer and Information Technology (2010)

 

Publications

  • Zeng, R., S. Zhang, J. Wang & X. W. Chu. “FMore: An Incentive Scheme of Multi-dimensional Auction for Federated Learning in MEC.” IEEE ICDCS (2020). [Link]
  • Shi, S., Q. Wang, X. W. Chu, B. Li, Y. Qin, R. Liu & X. Zhao. “Communication-Efficient Distributed Deep Learning with Merged Gradient Sparsification on GPUs.” IEEE INFOCOM (2020). [Link]
  • Yan, D., W. Wang & X. W. Chu. “Demystifying Tensor Cores to Optimize Half-Precision Matrix Multiply.” IEEE International Parallel and Distributed Processing Symposium (2020). [Link]
  • Shi, S., Z. Tang, Q. Wang, K. Zhao & X. W. Chu. “Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees.” The 24th European Conference on Artificial Intelligence (2020). [Link]
  • Yan, D., W. Wang & X. W. Chu. “Optimizing Batched Winograd Convolution on GPUs.” ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (2020). [Link]
  • Shi, S., K. Zhao, Q. Wang, Z. Tang & X. W. Chu. “A Convergence Analysis of Distributed SGD with Communication-Efficient Gradient Sparsification.” The 28th International Joint Conference on Artificial Intelligence (2019). [Link]
  • Shi, S., Q. Wang, K. Zhao, Z. Tang, Y. Wang, X. Huang & X. W. Chu. “A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks.” The 39th IEEE International Conference on Distributed Computing Systems (2019). [Link]
  • Tang, Z., Y. Wang, Q. Wang & X. W. Chu. “The Impact of GPU DVFS on the Energy and Performance of Deep Learning: an Empirical Study.” The 10th ACM International Conference on Future Energy Systems (e-Energy) (2019). [Link]
  • Shi, S., X.-W. Chu & B. Li. “MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms.” IEEE INFOCOM (2019). [Link]
  • Jia, X., S. Song, S. Shi, W. He, Y. Wang, H. Rong, F. Zhou, L. Xie, Z. Guo, Y. Yang, L. Yu, T. Chen, G. Hu & X. W. Chu. “Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes.” NeurIPS 2018 Workshop on Systems for ML and Open Source Software (2018). [Link]
  • Zhao, H., H. Liu, Y.-W. Leung & X. W. Chu. “Self-Adaptive Collective Motion of Swarm Robots.” IEEE Transactions on Automation Science and Engineering 15.4 (2018): 1533-1545. [Link]
  • Liu, C., Q. Wang, X.-W. Chu & Y. W. Leung. “G-CRS: GPU Accelerated Cauchy Reed-Solomon Coding.” IEEE Transactions on Parallel and Distributed Systems 29.7 (2018): 1482-1498. [Link]
  • Zhang, F., H. Liu, Y. W. Leung, X.-W. Chu & B. Jin. “CBS: Community-based Bus System as Routing Backbone for Vehicular Ad Hoc Networks.” IEEE Transactions on Mobile Computing 16.8 (2017): 2132-2146. [Link]
  • Wang, Q., P. Xu, Y. Zhang & X. W. Chu. “EPPMiner: An Extended Benchmark Suite for Energy, Power and Performance Characterization of Heterogeneous Architecture.” The 8th ACM International Conference on Future Energy Systems (e-Energy) (2017). [Link]
  • Chau, V., X. W. Chu, H. Liu & Y.-W. Leung. “Energy Efficient Job Scheduling with DVFS for CPU-GPU Heterogeneous Systems.” The 8th ACM International Conference on Future Energy Systems (e-Energy) (2017). [Link]
  • Mei, X., X. W. Chu, H. Liu, Y.-W. Leung & Z. Li. “Energy Efficient Real-time Task Scheduling on CPU-GPU Hybrid Clusters.” IEEE INFOCOM (2017). [Link]
  • Mei, X. & X. W. Chu. “Dissecting GPU Memory Hierarchy through Microbenchmarking.” IEEE Transactions on Parallel and Distributed Systems 28.1 (2017): 72-86. [Link]
  • Lam, Albert Y. S., Y. W. Leung & X. W. Chu. “Autonomous Vehicle Public Transportation System: Scheduling and Admission Control.” IEEE Transactions on Intelligent Transportation Systems 17.5 (2016): 1210-1226. [Link]
  • Yu, L., H. Liu, Y. W. Leung, X.-W. Chu & Z. Lin. “Multiple Radios for Fast Rendezvous in Cognitive Radio Networks.” IEEE Transactions on Mobile Computing 14.9 (2015): 1917-1931. [Link]
  • Zhang, F., H. Liu, Y. W. Leung, X.-W. Chu & B. Jin. “Community-based Bus System as Routing Backbone for Vehicular Ad Hoc Networks.” IEEE ICDCS (2015). [Link]
  • Zhao, J., X. W. Chu, H. Liu, Y. W. Leung & Z. Li. “Online Procurement Auctions for Resource Pooling in Client-Assisted Cloud Storage Systems.” IEEE INFOCOM (2015). [Link]
  • Lam, Albert Y. S., Y. W. Leung & X. W. Chu. “Electric Vehicle Charging Station Placement: Formulation, Complexity, and Solutions.” IEEE Transactions on Smart Grid 5.6 (2014): 2846-2856. [Link]
  • Zhao, K. & X. W. Chu. “G-BLASTN: Accelerating Nucleotide Alignment by Graphics Processors.” Bioinformatics 30.10 (2014): 1384-1391. [Link]
  • Liu, H., X. W. Chu, Y. W. Leung & R. Du. “Minimum-Cost Sensor Placement for Required Lifetime in Wireless Sensor-Target Surveillance Networks.” IEEE Transactions on Parallel and Distributed Systems 24.9 (2013): 1783-1796. [Link]
  • Lin, Z., H. Liu, X. W. Chu & Y. W. Leung. “Enhanced Jump-Stay Rendezvous Algorithm for Cognitive Radio Networks.” IEEE Communications Letters 17.9 (2013): 1742-1745. [Link]
  • Lin, Z., H. Liu, X. W. Chu, Y. W. Leung & I. Stojmenovic. “Constructing Connected-Dominating-Set with Maximum Lifetime in Cognitive Radio Networks.” IEEE Transactions on Computers (2013). [Link]
  • Liu, H., Z. Lin, X. W. Chu & Y. W. Leung. “Jump-Stay Rendezvous Algorithm for Cognitive Radio Networks.” IEEE Transactions on Parallel and Distributed Systems 23.10 (2012): 1867-1881. [Link]
  • Li, Z. & X.-W. Chu. “On Achieving Group-Strategyproof Multicast.” IEEE Transactions on Parallel and Distributed Systems 23.5 (2012): 913-923. [Link]
  • Liu, C. M., T. Wong, E. Wu, R. Luo, S. M. Yiu, Y. Li, B. Wang, C. Yu, X. W. Chu, K. Zhao, R. Li & T. W. Lam. “SOAP3: Ultra-fast GPU-based parallel alignment tool for short reads.” Bioinformatics 28.6 (2012): 878-879. [Link]
  • Liu, H., X. W. Chu, Y. W. Leung, X. Jia & P. Wan. “General Maximal Lifetime Sensor-Target Surveillance Problem and Its Solution.” IEEE Transactions on Parallel and Distributed Systems 22.10 (2011): 1757-1765. [Link]
  • Lin, Z., H. Liu, X. W. Chu & Y. W. Leung. “Jump-Stay Based Channel-Hopping Algorithm with Guaranteed Rendezvous for Cognitive Radio Networks.” IEEE INFOCOM (2011). [Link]
  • Liu, H., X. W. Chu, Y. W. Leung & R. Du. “Simple Movement Control Algorithm for Bi-Connectivity in Robotic Sensor Networks.” IEEE Journal on Selected Areas in Communications 28.7 (2010): 994-1005. [Link]
  • Chu, X. W. & Y. Jiang. “Random Linear Network Coding for Peer-to-Peer Applications.” IEEE Network 24.4 (2010): 35-39. [Link]
  • Chu, X. W., K. Zhao, Z. Li & A. Mahanti. “Auction-Based On-Demand P2P Min-Cost Media Streaming with Network Coding.” IEEE Transactions on Parallel and Distributed Systems 20.12 (2009): 1816-1829. [Link]
  • Li, X. Y., Y. Wu, H. Chen, X. W. Chu, Y. Wu & Y. Qi. “Reliable and Energy Efficient Routing for Static Wireless Ad Hoc Networks with Unreliable Links.” IEEE Transactions on Parallel and Distributed Systems 20.10 (2009): 1408-1421. [Link]
  • Liu, J. C., J. Xu & X. W. Chu. “Fine-Grained Scalable Video Caching for Heterogeneous Clients.” IEEE Transactions on Multimedia 8.5 (2006): 1011-1020. [Link]
  • Chu, X. W. & B. Li. “Dynamic Routing and Wavelength Assignment in the Presence of Wavelength Conversion for All-Optical Networks.” IEEE/ACM Transactions on Networking 12.3 (2005): 704-715. [Link]
  • Chu, X. W., J. Liu, B. Li & Z. Zhang. “Analytical Model of Sparse-Partial Wavelength Conversion in Wavelength-Routed WDM Networks.” IEEE Communications Letters 9.1 (2005): 69-71. [Link]
  • Chu, X. W., J. Liu & Z. Zhang. “Analysis of Sparse-Partial Wavelength Conversion in Wavelength-Routed WDM Networks.” IEEE INFOCOM (2004). [Link]
  • Liu, J., X. W. Chu & J. Xu. “Proxy Cache Management for Fine-Grained Scalable Video Streaming.” IEEE INFOCOM (2004). [Link]
  • Sohraby, K., Z. Zhang, X. W. Chu & B. Li. “Resource Management in an Integrated Optical Network.” IEEE Journal on Selected Areas in Communications 21.7 (2003): 1052-1062. [Link]
  • Li, B., X. W. Chu & K. Sohraby. “Routing and Wavelength Assignment vs. Wavelength Converter Placement in All-Optical Networks.” IEEE Communications Magazine 41.8 (2003): S22-S28. [Link: https://ieeexplore.ieee.org/document/1222717]
  • Chu, X. W., B. Li & I. Chlamtac. “Wavelength Converter Placement under Different RWA Algorithms in Wavelength-Routed All-Optical Networks.” IEEE Transactions on Communications 51.4 (2003): 607-617. [Link]
  • Chu, X. W., B. Li & Z. Zhang. “A Dynamic RWA Algorithm in a Wavelength-Routed All-Optical Network with Wavelength Converters.” IEEE INFOCOM (2003): 1795-1804. [Link]