Featured Researchers

Professor Xiaowen Chu

Professor
Department of Computer Science
Faculty of Science


About

Professor Xiaowen Chu has focused on improving the performance of deep learning systems, a key technology behind many modern artificial intelligence (AI) applications. In 2016, his research team developed DLBench, one of the earliest open-source benchmarking suites, which can evaluate, trace and compare the performance of different deep learning platforms. This pioneering work attracted significant attention from the deep learning community, including Microsoft’s CNTK team, the MXNet team, Nvidia, Intel, Tencent and Inspur. Similar benchmarks have since been launched by academia (such as Stanford’s DAWNBench in 2017) and industry (MLPerf in 2018). Since 2018, Professor Chu’s team has proposed many novel methods to reduce the training time of AI models on graphics processing unit (GPU) clusters.

 

As a pioneer in the field of GPU computing, Professor Chu’s research group has designed and implemented many GPU-based parallel algorithms and software packages that significantly reduce the running time of real-world problems, including DNA sequence alignment, protein identification, data mining, data security, linear algebra routines and deep learning. His team has not only published many highly cited papers, but also developed several open-source software packages. For example, G-BLASTN was published in the top-tier journal Bioinformatics and has been downloaded more than 1,000 times from SourceForge and GitHub. His recent work, “Dissecting GPU Memory Hierarchy through Microbenchmarking”, proposed a novel fine-grained P-Chase algorithm that exposes many previously unknown properties of GPU memory systems. It is the first paper to successfully analyse the cache properties of Kepler and Maxwell GPUs, and the proposed method has been widely used by the research community to (1) analyse the memory system of recent GPUs (such as Volta and Turing); (2) accurately predict the performance of GPU applications such as parallel convolution operations in deep learning; and (3) design cache-friendly optimisation techniques for matrix operations and convolutions.

 

Professor Chu has published more than 170 research papers in fields including parallel and distributed computing, deep learning systems, cloud computing and wireless networks. Many of his works have been published in prestigious international journals, including Bioinformatics, IEEE Transactions on Parallel and Distributed Systems, IEEE Journal on Selected Areas in Communications, IEEE Transactions on Mobile Computing, IEEE/ACM Transactions on Networking, IEEE Transactions on Smart Grid and IEEE Transactions on Computers, as well as in the proceedings of top-ranked international conferences, including IEEE INFOCOM, ACM PPoPP, IEEE ICDCS, IEEE IPDPS, IJCAI, IEEE ICRA and ACM e-Energy. Professor Chu is dedicated to impactful research that can influence not only the research community but also industry practitioners. He has received many external research grants and donations from the Research Grants Council, the Innovation and Technology Fund and industry, totalling more than HK$18 million.

 

Achievements

  • Best Paper Award of the 4th IEEE International Conference on Big Data Intelligence and Computing (2018)
  • Best Paper Award of the 1st International Conference on Big Data Computing and Communications (2015)
  • Best Paper Award of the 10th IEEE International Conference on Computer and Information Technology (2010)

 

Research Outputs

  • Zeng, R., S. Zhang, J. Wang & X. W. Chu. “FMore: An Incentive Scheme of Multi-dimensional Auction for Federated Learning in MEC.” IEEE ICDCS (2020). [Link]
  • Shi, S., Q. Wang, X. W. Chu, B. Li, Y. Qin, R. Liu & X. Zhao. “Communication-Efficient Distributed Deep Learning with Merged Gradient Sparsification on GPUs.” IEEE INFOCOM (2020). [Link]
  • Yan, D., W. Wang & X. W. Chu. “Demystifying Tensor Cores to Optimize Half-Precision Matrix Multiply.” IEEE International Parallel and Distributed Processing Symposium (2020). [Link]
  • Shi, S., Z. Tang, Q. Wang, K. Zhao & X. W. Chu. “Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees.” The 24th European Conference on Artificial Intelligence (2020). [Link]
  • Yan, D., W. Wang & X. W. Chu. “Optimizing Batched Winograd Convolution on GPUs.” ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (2020). [Link]
  • Shi, S., K. Zhao, Q. Wang, Z. Tang & X. W. Chu. “A Convergence Analysis of Distributed SGD with Communication-Efficient Gradient Sparsification.” The 28th International Joint Conference on Artificial Intelligence (2019). [Link]
  • Shi, S., Q. Wang, K. Zhao, Z. Tang, Y. Wang, X. Huang & X. W. Chu. “A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks.” The 39th IEEE International Conference on Distributed Computing Systems (2019). [Link]
  • Tang, Z., Y. Wang, Q. Wang & X. W. Chu. “The Impact of GPU DVFS on the Energy and Performance of Deep Learning: an Empirical Study.” The 10th ACM International Conference on Future Energy Systems (e-Energy) (2019). [Link]
  • Shi, S., X.-W. Chu & B. Li. “MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms.” IEEE INFOCOM (2019). [Link]
  • Jia, X., S. Song, S. Shi, W. He, Y. Wang, H. Rong, F. Zhou, L. Xie, Z. Guo, Y. Yang, L. Yu, T. Chen, G. Hu & X. W. Chu. “Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes.” NeurIPS 2018 Workshop on Systems for ML and Open Source Software (2018). [Link]
  • Zhao, H., H. Liu, Y.-W. Leung & X. W. Chu. “Self-Adaptive Collective Motion of Swarm Robots.” IEEE Transactions on Automation Science and Engineering 15.4 (2018): 1533-1545. [Link]
  • Liu, C., Q. Wang, X.-W. Chu & Y. W. Leung. “G-CRS: GPU Accelerated Cauchy Reed-Solomon Coding.” IEEE Transactions on Parallel and Distributed Systems 29.7 (2018): 1482-1498. [Link]
  • Zhang, F., H. Liu, Y. W. Leung, X.-W. Chu & B. Jin. “CBS: Community-based Bus System as Routing Backbone for Vehicular Ad Hoc Networks.” IEEE Transactions on Mobile Computing 16.8 (2017): 2132-2146. [Link]
  • Wang, Q., P. Xu, Y. Zhang & X. W. Chu. “EPPMiner: An Extended Benchmark Suite for Energy, Power and Performance Characterization of Heterogeneous Architecture.” The 8th ACM International Conference on Future Energy Systems (e-Energy) (2017). [Link]
  • Chau, V., X. W. Chu, H. Liu & Y.-W. Leung. “Energy Efficient Job Scheduling with DVFS for CPU-GPU Heterogeneous Systems.” The 8th ACM International Conference on Future Energy Systems (e-Energy) (2017). [Link]
  • Mei, X., X. W. Chu, H. Liu, Y.-W. Leung & Z. Li. “Energy Efficient Real-time Task Scheduling on CPU-GPU Hybrid Clusters.” IEEE INFOCOM (2017). [Link]
  • Mei, X. & X. W. Chu. “Dissecting GPU Memory Hierarchy through Microbenchmarking.” IEEE Transactions on Parallel and Distributed Systems 28.1 (2017): 72-86. [Link]
  • Lam, Albert Y. S., Y. W. Leung & X. W. Chu. “Autonomous Vehicle Public Transportation System: Scheduling and Admission Control.” IEEE Transactions on Intelligent Transportation Systems 17.5 (2016): 1210-1226. [Link]
  • Yu, L., H. Liu, Y. W. Leung, X.-W. Chu & Z. Lin. “Multiple Radios for Fast Rendezvous in Cognitive Radio Networks.” IEEE Transactions on Mobile Computing 14.9 (2015): 1917-1931. [Link]
  • Zhang, F., H. Liu, Y. W. Leung, X.-W. Chu & B. Jin. “Community-based Bus System as Routing Backbone for Vehicular Ad Hoc Networks.” IEEE ICDCS (2015). [Link]
  • Zhao, J., X. W. Chu, H. Liu, Y. W. Leung & Z. Li. “Online Procurement Auctions for Resource Pooling in Client-Assisted Cloud Storage Systems.” IEEE INFOCOM (2015). [Link]
  • Lam, Albert Y. S., Y. W. Leung & X. W. Chu. “Electric Vehicle Charging Station Placement: Formulation, Complexity, and Solutions.” IEEE Transactions on Smart Grid 5.6 (2014): 2846-2856. [Link]
  • Zhao, K. & X. W. Chu. “G-BLASTN: Accelerating Nucleotide Alignment by Graphics Processors.” Bioinformatics 30.10 (2014): 1384-1391. [Link]
  • Liu, H., X. W. Chu, Y. W. Leung & R. Du. “Minimum-Cost Sensor Placement for Required Lifetime in Wireless Sensor-Target Surveillance Networks.” IEEE Transactions on Parallel and Distributed Systems 24.9 (2013): 1783-1796. [Link]
  • Lin, Z., H. Liu, X. W. Chu & Y. W. Leung. “Enhanced Jump-Stay Rendezvous Algorithm for Cognitive Radio Networks.” IEEE Communications Letters 17.9 (2013): 1742-1745. [Link]
  • Lin, Z., H. Liu, X. W. Chu, Y. W. Leung & I. Stojmenovic. “Constructing Connected-Dominating-Set with Maximum Lifetime in Cognitive Radio Networks.” IEEE Transactions on Computers (2013). [Link]
  • Liu, H., Z. Lin, X. W. Chu & Y. W. Leung. “Jump-Stay Rendezvous Algorithm for Cognitive Radio Networks.” IEEE Transactions on Parallel and Distributed Systems 23.10 (2012): 1867-1881. [Link]
  • Li, Z. & X.-W. Chu. “On Achieving Group-Strategyproof Multicast.” IEEE Transactions on Parallel and Distributed Systems 23.5 (2012): 913-923. [Link]
  • Liu, C. M., T. Wong, E. Wu, R. Luo, S. M. Yiu, Y. Li, B. Wang, C. Yu, X. W. Chu, K. Zhao, R. Li & T. W. Lam. “SOAP3: Ultra-fast GPU-based parallel alignment tool for short reads.” Bioinformatics 28.6 (2012): 878-879. [Link]
  • Liu, H., X. W. Chu, Y. W. Leung, X. Jia & P. Wan. “General Maximal Lifetime Sensor-Target Surveillance Problem and Its Solution.” IEEE Transactions on Parallel and Distributed Systems 22.10 (2011): 1757-1765. [Link]
  • Lin, Z., H. Liu, X. W. Chu & Y. W. Leung. “Jump-Stay Based Channel-Hopping Algorithm with Guaranteed Rendezvous for Cognitive Radio Networks.” IEEE INFOCOM (2011). [Link]
  • Liu, H., X. W. Chu, Y. W. Leung & R. Du. “Simple Movement Control Algorithm for Bi-Connectivity in Robotic Sensor Networks.” IEEE Journal on Selected Areas in Communications 28.7 (2010): 994-1005. [Link]
  • Chu, X. W. & Y. Jiang. “Random Linear Network Coding for Peer-to-Peer Applications.” IEEE Network 24.4 (2010): 35-39. [Link]
  • Chu, X. W., K. Zhao, Z. Li & A. Mahanti. “Auction-Based On-Demand P2P Min-Cost Media Streaming with Network Coding.” IEEE Transactions on Parallel and Distributed Systems 20.12 (2009): 1816-1829. [Link]
  • Li, X. Y., Y. Wu, H. Chen, X. W. Chu, Y. Wu & Y. Qi. “Reliable and Energy Efficient Routing for Static Wireless Ad Hoc Networks with Unreliable Links.” IEEE Transactions on Parallel and Distributed Systems 20.10 (2009): 1408-1421. [Link]
  • Liu, J. C., J. Xu & X. W. Chu. “Fine-Grained Scalable Video Caching for Heterogeneous Clients.” IEEE Transactions on Multimedia 8.5 (2006): 1011-1020. [Link]
  • Chu, X. W. & B. Li. “Dynamic Routing and Wavelength Assignment in the Presence of Wavelength Conversion for All-Optical Networks.” IEEE/ACM Transactions on Networking 12.3 (2005): 704-715. [Link]
  • Chu, X. W., J. Liu, B. Li & Z. Zhang. “Analytical Model of Sparse-Partial Wavelength Conversion in Wavelength-Routed WDM Networks.” IEEE Communications Letters 9.1 (2005): 69-71. [Link]
  • Chu, X. W., J. Liu & Z. Zhang. “Analysis of Sparse-Partial Wavelength Conversion in Wavelength-Routed WDM Networks.” IEEE INFOCOM (2004). [Link]
  • Liu, J., X. W. Chu & J. Xu. “Proxy Cache Management for Fine-Grained Scalable Video Streaming.” IEEE INFOCOM (2004). [Link]
  • Sohraby, K., Z. Zhang, X. W. Chu & B. Li. “Resource Management in an Integrated Optical Network.” IEEE Journal on Selected Areas in Communications 21.7 (2003): 1052-1062. [Link]
  • Li, B., X. W. Chu & K. Sohraby. “Routing and Wavelength Assignment vs. Wavelength Converter Placement in All-Optical Networks.” IEEE Communications Magazine 41.8 (2003): S22-S28. [Link: https://ieeexplore.ieee.org/document/1222717]
  • Chu, X. W., B. Li & I. Chlamtac. “Wavelength Converter Placement under Different RWA Algorithms in Wavelength-Routed All-Optical Networks.” IEEE Transactions on Communications 51.4 (2003): 607-617. [Link]
  • Chu, X. W., B. Li & Z. Zhang. “A Dynamic RWA Algorithm in a Wavelength-Routed All-Optical Network with Wavelength Converters.” IEEE INFOCOM (2003): 1795-1804. [Link]