Prof. Xiaowen Chu, Dr. Qiang Wang, and PhD student Yuxin Wang received the Best Paper Award for their co-authored paper “Energy-Efficient Inference Service of Transformer-Based Deep Learning Models on GPUs” at the 16th IEEE International Conference on Green Computing and Communications (GreenCom 2020).
The award-winning paper addressed how to improve the energy efficiency of Inference-as-a-Service (IAAS), in particular a language translation service based on the Transformer sequence transduction model, without violating the service-level agreement (SLA) in a cloud environment. Although the Transformer has achieved state-of-the-art performance in many natural language processing tasks, it consumes a significant amount of energy because of its large model size and heavy computational load. The team conducted a comprehensive study of the inference performance and energy efficiency of a Transformer model trained for language translation. Their findings give a comprehensive picture of Transformer inference and suggest that workload balancing and scheduling have great potential for delivering energy-efficient Transformer inference services.
The 5-day Conference on Green Computing and Communications (GreenCom) is an international forum where scientists, engineers, and researchers exchange novel research on advances in the state of the art of green computing and communications, and identify emerging research topics and open issues for further study.