Paper “Pipelined Query Processing in Coprocessor Environments” accepted at SIGMOD 2018
Henning Funke (TU Dortmund University), Sebastian Breß (German Research Center for Artificial Intelligence, TU Berlin), Stefan Noll (TU Dortmund University), Volker Markl (TU Berlin, German Research Center for Artificial Intelligence), and Jens Teubner (TU Dortmund University)
Query processing on GPU-style coprocessors is severely limited by the movement of data. With teraflops of compute throughput in one device, even high-bandwidth memory cannot supply data fast enough to keep the device reasonably utilized. Query compilation is a proven technique to improve memory efficiency. However, its inherent tuple-at-a-time processing style does not suit the massively parallel execution model of GPU-style coprocessors. This compromises the improvements in efficiency offered by query compilation. In this paper, we show how query compilation and GPU-style parallelism can nevertheless be made to work in unison. We describe a compiler strategy that merges multiple operations into a single GPU kernel, thereby significantly reducing bandwidth demand. Compared to operator-at-a-time processing, we show reductions in memory access volume of up to 7.5x, resulting in kernel execution times that are up to 9.5x shorter.
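To make the bandwidth argument concrete, here is a minimal sketch in plain Python rather than GPU code. It is not the compiler strategy from the paper. It only contrasts an operator-at-a-time plan, which materializes every intermediate result, with a fused pipeline that evaluates selection, arithmetic, and aggregation in one pass. The function names, the example query, and the traffic model (one unit per value read from or written to memory) are assumptions made for illustration.

```python
def operator_at_a_time(prices, discounts, threshold):
    """Run each operator separately and materialize every intermediate result."""
    traffic = 0  # assumed cost model: one unit per value read or written

    # Operator 1: selection. Reads the price column, writes qualifying row ids.
    row_ids = [i for i, p in enumerate(prices) if p > threshold]
    traffic += len(prices) + len(row_ids)

    # Operator 2: arithmetic. Re-reads the row ids, gathers both columns,
    # and writes the per-row products as a new intermediate column.
    products = [prices[i] * discounts[i] for i in row_ids]
    traffic += len(row_ids) + 2 * len(row_ids) + len(products)

    # Operator 3: aggregation. Re-reads the intermediate and writes one result.
    total = sum(products)
    traffic += len(products) + 1
    return total, traffic


def fused_pipeline(prices, discounts, threshold):
    """Evaluate all three operators in one pass without intermediates."""
    traffic = 0
    total = 0
    for p, d in zip(prices, discounts):
        traffic += 1  # read the price of this row
        if p > threshold:
            traffic += 1  # the discount is only counted for qualifying rows
            total += p * d  # the product stays in a local variable
    traffic += 1  # write the final aggregate
    return total, traffic


if __name__ == "__main__":
    prices = [10, 25, 40, 5]
    discounts = [1, 2, 3, 4]
    print(operator_at_a_time(prices, discounts, 20))
    print(fused_pipeline(prices, discounts, 20))
```

Both functions compute the same aggregate, but the fused variant never writes or re-reads an intermediate column, so its counted traffic is lower. The ratio in this toy model depends on the selectivity and on the assumed cost model and says nothing about the factors measured in the paper.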